
Design of a Very High Speed Fuzzy Processor by VHDL Language

Alessandro Gabrielli - Enzo Gandolfi - Massimo Masetti
Physics Department, University of Bologna
Viale Berti Pichat 6/2, 40127 Bologna, Italy
Tel. +39 - 51 - 6305075 Email: [email protected]

Abstract

Anyone who deals with hardware architecture design probably knows that in past years many methodologies have been proposed and developed to improve design feasibility. In particular, many Hardware Description Languages (HDLs) have become widespread for designing digital architectures at ever higher levels of abstraction. In this paper VHDL, as a particular HDL, is illustrated through some applications. We present a first-release digital fuzzy [1] processor designed with the old Cadence Edge package, from front-end schematic entry to the final layout design, and compare it with its VHDL re-design.

Introduction

HDL users fall into two main classes: those who use VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very High Speed Integrated Circuit) and those who use Verilog, even though many other HDLs have become available in the past. Indeed, the main Computer Aided Design (CAD) packages for designing, for example, Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs) have mainly implemented and developed VHDL and Verilog. The designer can therefore use both HDLs together or just one of them. Moreover, the traditional schematic entry front-end remains available for pre-designed architectures. Given these options, the designer can steer the work in many directions, depending both on the design constraints and specifications to be fulfilled and on future design tuning and/or developments. In general the designer appreciates the possibility of designing mixed VHDL-Verilog architectures since, as will be explained below, these HDLs also suit reconfigurability purposes. In other words, while designing a hardware architecture, the designer can also plan to re-implement the same design, or just a part of it, in future applications. This is made possible by the high level of abstraction that HDLs allow.

This paper deals mainly with VHDL rather than Verilog, partly because VHDL is a somewhat higher-level language and is well suited to education purposes. Verilog has an intrinsically more complex syntax that may cause some initial difficulties and, for non-specific applications, its use may not be worthwhile. Moreover, the CAD packages we have used so far pay more attention to VHDL than to Verilog, except for describing library components, where Verilog is used to define the component simulation parameters. It should be noted that all HDLs can be used for both design and simulation, and the design and simulation tools of some packages may, for example, use VHDL for the design phase and Verilog for the simulation phase.

How VHDL abstraction levels affect the global design

Figure 1 presents a chart in which several VHDL abstraction levels lead to the same result: the design of a particular digital architecture. Of course a designer can choose among many other abstraction levels, from the highest, completely detached from the hardware, down to the lowest, which is generally similar to a netlist or a schematic representation. The first case presented in the figure is the behavioral level, in which almost no hardware constraints or library links are included; the designer uses this high level of abstraction when no attention is paid to how the circuit will be scheduled. This level just describes the behaviour of the architecture to be designed, without considering other constraints, so it is very useful for implementing packages, constants and generic parameters for future use in other designs. For this reason this level is said to offer great freedom of reconfigurability. Another abstraction level is the dataflow level, which deals in particular with the flow of signals and data to and from the inner blocks, for example when the global design is divided into a hierarchical structure (not presented here). This is also a high level of abstraction and need not take into account any hardware constraints or library links. The lowest VHDL abstraction level is called structural: it is fairly rigid, does not easily allow later corrections, and is much closer to a hardware schematic representation.
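To make the three levels concrete, the sketch below describes one small circuit, a 2-to-1 multiplexer, three times. It is an illustration only and is not taken from the fuzzy processor: the entity `mux2`, its ports, and the gate entities `inv_gate`, `and2_gate` and `or2_gate` (stand-ins for library standard cells) are all hypothetical names, and VHDL-93 direct entity instantiation is assumed for the structural version.

```vhdl
-- Hypothetical gate primitives standing in for library standard cells.
library ieee;
use ieee.std_logic_1164.all;

entity inv_gate is
  port (a : in std_logic; y : out std_logic);
end entity inv_gate;

architecture rtl of inv_gate is
begin
  y <= not a;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity and2_gate is
  port (a, b : in std_logic; y : out std_logic);
end entity and2_gate;

architecture rtl of and2_gate is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity or2_gate is
  port (a, b : in std_logic; y : out std_logic);
end entity or2_gate;

architecture rtl of or2_gate is
begin
  y <= a or b;
end architecture rtl;

-- Illustrative 2-to-1 multiplexer (hypothetical, not part of the processor).
library ieee;
use ieee.std_logic_1164.all;

entity mux2 is
  port (d0, d1, sel : in std_logic; q : out std_logic);
end entity mux2;

-- Behavioral: states what the circuit does, with no reference to gates.
architecture behavioral of mux2 is
begin
  process (d0, d1, sel)
  begin
    if sel = '1' then
      q <= d1;
    else
      q <= d0;
    end if;
  end process;
end architecture behavioral;

-- Dataflow: a concurrent equation describing how signals flow to the output.
architecture dataflow of mux2 is
begin
  q <= (d1 and sel) or (d0 and not sel);
end architecture dataflow;

-- Structural: an explicit netlist of instantiated gates, close to a schematic.
architecture structural of mux2 is
  signal nsel, t0, t1 : std_logic;
begin
  u_inv  : entity work.inv_gate  port map (a => sel, y => nsel);
  u_and0 : entity work.and2_gate port map (a => d0, b => nsel, y => t0);
  u_and1 : entity work.and2_gate port map (a => d1, b => sel,  y => t1);
  u_or   : entity work.or2_gate  port map (a => t0, b => t1,   y => q);
end architecture structural;
```

For `sel` equal to '0' or '1' the three architectures give the same output. In the behavioral and dataflow versions the choice of gates is left to the synthesis tool, whereas the structural version fixes it in advance. A particular architecture can be selected at instantiation, for example `entity work.mux2(dataflow)`.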

The structural level of abstraction is generally used when the designer does not want to let the synthesis package allocate the components according to its own algorithm. On the other hand, one of the reasons synthesis packages have been developed is precisely to allocate the components of a design; otherwise this job has to be done by the designer. Nevertheless the final choice rests with the designer, who may prefer his own component allocation to the one produced by the machine. The figure shows a generic block chart in which several VHDL abstraction levels are used together with some pre-designed netlist or schematic representations, for the common aim of a global design that is then to be synthesised. Finally, the global design can either be passed directly to a top-level functional simulation or be simulated after synthesis (a schematic simulation that takes into account the library component parameters).

Thus, in view of these considerations, the designer should spend a little time before starting a design deciding which abstraction level is best for each of the inner hierarchical parts that make up the global design.

High speed fuzzy processor designed by schematic entry front-end package

This section describes, together with figure 2, the main features of a high speed digital fuzzy processor that was designed, fabricated and successfully tested in 1994 within the Physics Department of Bologna University [2], [3], [4]. This prototype handles four 7-bit input variables and one 7-bit output variable and processes up to 127 fuzzy rules. In addition, it works at up to 50 Mega Fuzzy Inferences per Second (MFIPS) by means of a 50 MHz clock synchronisation signal. These features allow a global processing time, computed from the entry of the input data set to the ready output variable, that ranges from 1 to 2 µs depending on the number of fuzzy rules. However, this prototype was not provided with the fuzzy rule selectors [5], [6] that are going to be used in the new releases. The design was made using the Cadence Edge CAD tool and implemented with 1.0 µm European Silicon Structures (ES2) digital libraries. Moreover, since this design tool did not have a VHDL front-end package, the whole prototype was designed by placing, step by step, all of the more than 2000 standard cells. This job took approximately 8 man-months for the design and pre-layout simulation phases and 2 man-months for the layout and post-layout simulation phases. This was a prototype of a class of very fast fuzzy processors that is going to be developed for future application in the field of High Energy Physics Experiments (HEPE). This field requires decision times that range from a few hundred ns to some µs, which is why high speed processors based on fuzzy logic may be of great help.

As regards the considerations that arise from this experience, it should be noted that, since the fuzzy processor was supposed to be very fast for HEPE applications, many inner architectures were designed with strict attention to processing time, at the expense of silicon area. It is well known that these two parameters trade off against each other in microelectronics hardware design. In more detail, several architectures were pipelined to reduce the global processing time, some were parallelised, and others were both pipelined and parallelised for the same goal. This job required most of the design time and of the designer's effort, but it should be pointed out that it was worthwhile, since the final results were very positive. Nevertheless, as every silver lining has a cloud, these architectures, once designed for the prototype, were not ready for further applications, since they were too specific and not flexible. In other words, these architectures have to be re-designed almost completely for future fuzzy chip implementations.
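The timing figures quoted above for the prototype can be related to clock cycles with simple arithmetic. This is only a consistency check on the stated numbers, not a statement about the internal scheduling of the processor; the symbols T_clk, f_clk, t_proc and N_cycles are introduced here for illustration.

```latex
T_{\mathrm{clk}} = \frac{1}{f_{\mathrm{clk}}} = \frac{1}{50\ \mathrm{MHz}} = 20\ \mathrm{ns},
\qquad
N_{\mathrm{cycles}} = \frac{t_{\mathrm{proc}}}{T_{\mathrm{clk}}}
\;\Rightarrow\;
\frac{1\ \mu\mathrm{s}}{20\ \mathrm{ns}} = 50,
\quad
\frac{2\ \mu\mathrm{s}}{20\ \mathrm{ns}} = 100 .
```

A global processing time of 1 to 2 µs therefore corresponds to between 50 and 100 periods of the 50 MHz clock.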

The schematic and layout representations of this prototype are summarised in figure 2, without describing the individual smaller blocks, since they are not significant for the comparison between VHDL and schematic entry. In figure 2 the processor layout is mainly composed of 3 Random Access Memories (RAMs) that store, respectively, the input fuzzy set membership function shapes, the fuzzy rules and the fuzzy rule consequent parameters. The rest of the layout is composed of traditional standard cells that include both the control logic and the architectures that carry out the fuzzy inference process.

High speed fuzzy processor designed by VHDL front-end package

Once the new release of the CAD tool, with a VHDL front-end package for designing architectures, became available, it was decided to re-design the just finished fuzzy prototype in order to draw some conclusions about design techniques. So the first fuzzy processor prototype, once tested and developed, was compared with the same processor completely re-designed in VHDL (which is included in the Cadence Design FrameWork II CAD tool). In addition, the individual architectures of which the global design is composed were synthesised by linking the new release of the same ES2 digital libraries that we had used during the first, schematic entry design. After the synthesis phase, the pre-layout global schematic designs of the two releases were compared with each other in terms of processing time and silicon area.

Before going into details, a couple of crucial points deserve emphasis. The previous section focuses on the great difficulty of extending and/or re-implementing pre-designed schematic architectures; it was stated there that this can be done only in case of complete compatibility. Nevertheless, even if the designer uses the VHDL front-end package, some particular architectures cannot be directly described in VHDL or, better, cannot be directly synthesised by synthesis packages. As an example, a few lines are reported below from the read cycle of a VHDL architecture that we used to simulate the RAMs previously mentioned; it is worth noting that the 13 ns output delay time corresponds to the Verilog parameter used within the behavioural description of the library component. In this way the logic and schematic simulations correspond to each other. Note in addition that this VHDL description is not meant for design purposes. In fact, these cells are not included in the standard libraries but are to be considered as megacells; the CAD design package may provide the designer with some dedicated "macros" in order to customise the megacells according to the designer's specifications, such as number of words, number of bits per word, unidirectional or bi-directional buses, three-state or latch outputs and so on. In general, the silicon foundries provide the designer with a software tool dedicated to this generation task. From the designer's point of view, however, this tool just creates a schematic and a symbol representation with the related simulation parameters. At this point the designer has to consider these megacells as black box units, preserving their component instantiation during the synthesis process.

  Read_Rule_Memory: PROCESS      -- Address and Data Read Cycle
    variable laddr: integer := 0;
  begin
    wait until ME'event and ME = '1' and ME'last_value = '0';
    bin2int(ADD, laddr);         -- Binary to Integer Conversion Procedure
    addr