Torc: Towards an Open-Source Tool Flow

Torc: Tools for Open Reconfigurable Computing

Neil Steiner,1 Aaron Wood,1 Hamid Shojaei,2 Jacob Couch,3 Peter Athanas,3 and Matthew French1

1 Information Sciences Institute, University of Southern California, Arlington, Virginia
[email protected], [email protected], [email protected]

2 Department of Electrical and Computer Engineering, University of Wisconsin-Madison, Madison, Wisconsin
[email protected]

Abstract: Configurable computing researchers are often sidetracked by tool and infrastructure needs while pursuing unique and novel work, and frequently resort to simplified device models for lack of real architectural data. To address these issues, we present and describe Torc, an open-source C++ infrastructure and tool set for reconfigurable computing, suitable for custom research applications, for CAD tool development, and for architecture exploration. The Torc infrastructure can read, write, and manipulate EDIF, BLIF, and XDL netlists, as well as Xilinx bitstream packets (without however understanding configuration frame internals). The Torc tools include placing and routing for full or partial designs, along with additional capabilities to facilitate design manipulation and analysis. In support of these capabilities, Torc provides exhaustive wiring and logic information for all major Xilinx devices, derived from nonproprietary sources. We believe that Altera architectures and designs could be similarly supported if the necessary data were available, and we have successfully used Torc internally with custom architectures. We present some examples of capabilities built with Torc, including an EDIF obfuscator, an XDL sub-circuit extractor, and a third-party partial reconfiguration tool. Torc is open-source software and is available at http://torc.isi.edu.

I. INTRODUCTION

Modern FPGAs are complex devices, and the designs targeting them are often described in complex file formats. As a result, researchers frequently have to work with simpler files and models, or have to invest significant development effort into parsers, object models, and large routing graphs. Unfortunately, real devices are much messier to work with than simple models suggest, and the relevance of tools built upon such models is proportionately reduced.

Work involving large System-on-Chip designs led us to search for tools that could manipulate netlists in comprehensive ways. Specifically of interest was support for commercial netlist formats such as EDIF, XDL, and VQM. The conclusion was that such capabilities were generally unavailable, except to device manufacturers and large EDA companies, and we therefore began to develop our own tools. We expound here on work first announced in [1].

3 Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, Virginia
[email protected], [email protected]

One aim is to provide real device data, increasing the relevance of CAD tools that researchers develop. Another aim is to provide a framework for device and design data, allowing researchers to focus on the unique and novel aspects of their work, without being waylaid by infrastructure development. Promising uses for Torc include CAD tool research, device architecture exploration, hardware autonomy research, partial runtime reconfiguration, as well as tools for low-power CAD, radiation mitigation, security / anti-tamper, health monitoring, et cetera.

We divide the Torc code base into two parts: Application Programming Interfaces (APIs) that provide access to both device and design data, and CAD tools that process and manipulate design and implementation data.

The APIs form the foundational core of Torc: They provide importers, exporters, and object models for EDIF and BLIF netlists, for XDL netlists, and for bitstreams. They also provide device models and data for all Xilinx devices since the Virtex family. Because device manufacturer data is often sensitive and proprietary, we ensure that the data and capabilities underlying Torc are derived from nonproprietary sources.

The Torc tools build upon the foundation provided by the APIs: They presently include routing, placing, unpacking, and combinational path extraction, with packing capabilities to follow in the near future.

We provide background information in Section II. We then describe the Torc design in Section III, the API in Section IV, and the associated CAD tools in Section V. Applications follow in Section VI, and we conclude in Section VII.

II. BACKGROUND

For the sake of clarity, we first describe file formats relevant to Torc, and then delve into prior and related work, and an overview of device architecture and terminology.

A. File Formats

BLIF (Berkeley Logic Interchange Format): Prevalent academic format for synthesized netlists and optimizations.

EDIF (Electronic Design Interchange Format): Industry-standard format for synthesized netlists.

NCD (Native Circuit Description): Post-map netlist format used by Xilinx ISE, with optional placement and routing. Convertible to and from XDL.

VQM (Verilog Quartus Module): Standard format for synthesized Altera netlists. Strict subset of structural Verilog. Not yet supported in Torc.

XDL (Xilinx Design Language): Human-readable version of binary NCD format.

XDLRC: Architecture description for Xilinx devices. Distinct from XDL except in internal naming.

B. Prior Work

Torc functionality is based on prior work that spans many years and multiple projects. Torc’s Device Database (DDB) is a direct descendant of ADB [2], originally developed in conjunction with the JBits [3] project, and later to become the foundation of a completely rewritten but unreleased version of JBits. In the JBits rewrite, ADB’s wiring information was augmented with exhaustive logic information, which was extracted from non-proprietary XDLRC data. The resulting functionality was then embedded into an autonomous system that performed its own parsing, mapping, placing, routing, and reconfiguration while continuing to run [4], without any use of the Xilinx ISE tools.

Although ADB was extensively based on Xilinx proprietary data, the databases can be built just as effectively from XDLRC data despite some differences in dialect, and the data representations developed to describe device wiring in ADB still scale flawlessly to the largest devices available today.

Efforts to use and extend these capabilities have led us to port ADB to C++ and couple it with NOM [5], a commercial netlist object model. NOM provides importers and exporters for EDIF, structural VHDL, and structural Verilog, along with the internal object model necessary to describe and manipulate netlist circuitry. We further added XDL import and export capabilities, and obtained an update for VQM support in NOM.
The EDIF and XDL manipulation capabilities were subsequently validated on very large designs.

C. Related Work

We know of no other integrated API and tool set comparable to Torc, but there are a number of important related tools, some of which we hope to interface with to mutual advantage. In doing so, we believe we can usefully extend and expand the tool flows described on the fpgaCAD site [6].

At the synthesis or post-synthesis level, a number of primarily academic tools work with or target BLIF netlists: Odin II is an open-source Verilog synthesizer [7], nicely coupled with VPR [8]. ABC is a very popular and powerful synthesis and verification tool, supporting a wide range of

input and output formats [9]. Altera Quartus II, though not academic, can also export BLIF.

Commercial synthesis tools more commonly target EDIF, because of the need to support a broad range of technology-mapped primitives: Two such synthesis engines are Synopsys Synplify and Xilinx XST. A broad range of parsing, processing, and manipulating capabilities for unmapped netlists are available from the BYU EDIF Tools [10], and prior experience shows the standalone EDIF parser [11] to be lightweight and well suited for use in embedded systems.

Few academic tools seem to map from generic netlists to physical netlists for real architectures, other than our prior work [4]. T-VPack [8] and RASP [12] do indeed map logic functions into LUTs, perform optimizations, and pack LUTs into configurable logic blocks, but that is a slightly different problem than mapping library primitives into physical primitives on modern FPGAs.

At the mapped netlist level, VPR [8] is the de facto place-and-route tool for research, and has been used for modeling in the development of recent Altera architectures [13]. VPR was recently upgraded to more closely model modern architectures [14], attesting to its continued value and popularity in the configurable computing community, but it still lacks routing graphs for real devices. Also at the mapped netlist level, many groups have dabbled with XDL designs or XDLRC device data, but few if any of those efforts have resulted in open toolsets, with the notable exception of RapidSmith [15].

At the bitstream level, JBits [3] was the only officially sanctioned tool for exploration and manipulation, and it proved to be very popular among researchers. Unfortunately, JBits is no longer available, and many research groups have taken it upon themselves to develop their own bitstream capabilities, often running into intellectual property issues as a result.

D. Architecture Description

Modern FPGAs are composed of tiles of varying size and complexity, arranged in two-dimensional grids. The tiles are subdivided into distinct types, with each type defining specific logic and wiring resources. Tile types vary by device family and manufacturer, but types composed of LUTs and flip-flops invariably predominate in the device. Despite this prevalence, device families may contain more than a hundred different tile types, each of which must abut properly with its neighbors when placed in the tile grid. A more detailed discussion of device architectures, and particularly Xilinx architectures, can be found in [2], [4].

We define some terminology: A wire is a portion of an electrical node that is fully bounded within a tile. We uniquely identify a wire in the physical device in conjunction with the tile that contains it, and call this combination a tilewire. In practice, some tilewires are trimmed from the device, generally around internal discontinuities or device

[Figure 1 block diagram. Inputs: EDIF (from Synplify, XST, ...), BLIF (from ODIN II, ABC, SIS, ...), XDL, DB, and BIT files. APIs: Generic Netlist (torc::generic), Physical Netlist (torc::physical), Device Architecture (torc::architecture), Bitstream Frames (torc::bitstream). Other blocks: EDIF, BLIF, and XDL importers and exporters, DB Reader, BIT Reader, BIT Writer, Mapper, Unmapper, Packer, Unpacker, Placer, Router, DRC, Path Extraction, Timing Model, VPR.]

Figure 1. Torc block diagram. The four APIs are the Generic Netlist, the Physical Netlist, the Device Architecture database, and the Bitstream Frames interface. The Packer and Timing Model are designated by red dots as still being under development. The Mapper, Unmapper, and Design Rule Checker are not yet scheduled for development.

edges. If a tilewire abuts with one or more tilewires in adjacent tiles, they collectively form a segment. Common segment examples include doubles, hexes, pents, and longs. Segments composed of just one tilewire are called trivial segments. In keeping with graph theory terminology, programmable connections between wires are called arcs, and segments are sometimes called nodes.

The public availability of device data for Xilinx architectures greatly exceeds that of Altera architectures, and consequently governs our terminology. We have found this disparity to hold despite our concerted efforts to obtain Altera data. We nonetheless believe that Torc’s netlist and architecture APIs could accommodate Altera designs and devices, and we remain interested in adding that support.

III. DESIGN

This section introduces Torc’s design and structure, and discusses the objectives that have guided our design and development.

A. Overview

Torc’s major components are depicted in Figure 1. The APIs are shown against a blue background, and are labeled on the right according to their C++ namespaces. The tool sets are shown against a light gray background, and are positioned between the APIs that they depend upon. Blocks with red dots in the upper right corner are not yet complete. Input and output file types are labeled in green.

On the left of the diagram are groups of tools that we may interface with, including commercial synthesis engines by way of EDIF files, and academic synthesis and optimization tools by way of BLIF files. We could also supply VPR with routing graphs for real devices.

The Torc code makes substantial use of the Standard Template Library [16] and of Boost [17], two elegant and powerful C++ libraries.

B. Objectives

Torc’s primary purpose is to provide a robust framework for FPGA implementation tools, not supplanting Altera Quartus II or Xilinx ISE, but instead providing a usable alternative when special application or research requirements arise. Researchers are free to use, modify, or extend Torc code, and use it in conjunction with or in place of corresponding portions of Quartus II or ISE.

Torc development is guided by a number of objectives and principles:
• Designed for a ten-year shelf life.
• Designed for completeness. Code and structures must be able to accommodate exhaustive data.
• Optimized for very large devices. Current devices now have in excess of twenty million unique wires.
• Optimized for very large designs. Algorithms and structures must perform well for the largest designs.
• Designed for embedding. We limit linkages and dependencies, and constrain disk and memory footprints.
• Designed for testability. Unit tests are provided for nearly every class. Regression tests are also provided.
• Designed for conditional compilation. The Torc APIs are well integrated, but do not depend upon each other.
• Designed for simplicity. We keep the interface simple, do what the user expects, and absorb complexity into the API where possible.
• Intended for researchers. We assume most users are not C++ experts.

EdifExample.cpp

// torc::generic example: manipulating an EDIF file
#include "torc/Generic.hpp"
#include <fstream>
#include <string>
using namespace std;
using namespace torc::generic;

int main(int argc, char* argv[]) {
    // expect the EDIF file name as the only argument
    if(argc != 2) return 1;

    // import the EDIF design
    string inFileName(argv[1]);
    fstream fileStream(inFileName.c_str());
    ObjectFactorySharedPtr factoryPtr(new ObjectFactory());
    EdifImporter importer(factoryPtr);
    importer(fileStream, inFileName);

    // look up an instance of interest
    RootSharedPtr rootPtr = importer.getRootPtr();
    InstanceSharedPtr instancePtr = rootPtr->findLibrary("work")->findCell("and")
        ->findView("verilog")->findInstance("oZ0");

    // change the INIT property (LUT mask) to XOR
    PropertySharedPtr initPropertyPtr = instancePtr->getProperty("INIT");
    Value xorMask(Value::eValueTypeString, string("6"));
    initPropertyPtr->setValue(xorMask);

    // export the EDIF design
    string outFileName = inFileName + ".out";
    fstream edifExport(outFileName.c_str(), ios_base::out);
    EdifExporter exporter(edifExport);
    exporter(rootPtr);

    return 0;
}

Figure 2. Torc EDIF example. This code imports an EDIF design, looks up a LUT instance of interest, changes its mask to that of an XOR gate, and exports the design.

IV. API

The core functionality of Torc is encapsulated in the four APIs depicted in Figure 1: the generic netlist API, the physical netlist API, the device architecture API, and the bitstream API.

A. Generic Netlist

The generic netlist API supports netlists that are not mapped to physical primitives in a target device, though mapping to library primitives may have occurred during synthesis. We support EDIF 2.0.0 [18] Level 0, and particularly the NETLIST view type. We also support BLIF [19] for compatibility with existing research tools and flows.

The API consists of roughly 100 classes. It includes importers and exporters, and is built around a common netlist object model. That common model facilitates conversions between EDIF and BLIF, translating truth tables to or from LUT masks or primitive gates. We also note that the ability to represent and manipulate EDIF, a very versatile netlist format, suggests that Torc could support a broad range of other netlist formats if desirable.

Figure 2 shows sample code that imports an EDIF design, changes a LUT mask, and exports the design. The design root object can be used to explore or modify any part of the design, including libraries, cells, views, ports, instances, nets, and properties. In addition to circuit creation and manipulation, the API can serve as a base for synthesis or mapping algorithms, and can flatten netlists.
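The translation between truth tables and LUT masks mentioned above can be illustrated with a minimal sketch. It assumes the common convention that bit i of the mask holds the LUT output for the input pattern whose binary encoding is i. The names lutMask and toHex are hypothetical helpers for illustration and are not part of the Torc API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <sstream>
#include <string>

// Build the mask of a k-input LUT (k <= 6) from a Boolean function.
// Assumed convention: bit i of the mask is the output for input pattern i,
// where bit j of i carries the value of LUT input j.
std::uint64_t lutMask(unsigned k, const std::function<bool(unsigned)>& f) {
    assert(k <= 6);
    std::uint64_t mask = 0;
    for (unsigned i = 0; i < (1u << k); ++i) {
        if (f(i)) mask |= (std::uint64_t{1} << i);
    }
    return mask;
}

// Render a mask as an uppercase hexadecimal string, the form in which
// an INIT property value is written in Figure 2.
std::string toHex(std::uint64_t mask) {
    std::ostringstream os;
    os << std::hex << std::uppercase << mask;
    return os.str();
}

// Two-input example functions expressed over the packed input pattern.
bool xor2(unsigned in) { return ((in & 1u) ^ ((in >> 1) & 1u)) != 0; }
bool and2(unsigned in) { return (in & 3u) == 3u; }
```

Under this convention, toHex(lutMask(2, xor2)) gives "6", which is the value that Figure 2 writes to the INIT property, and toHex(lutMask(2, and2)) gives "8".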

B. Physical Netlist

The physical netlist API supports netlists that have been mapped to physical primitives in a target device. Physical netlists may include partial or full placement and routing information, or may be devoid of any such information.

There are two reasons why physical netlist capabilities give the user exceptional control over designs: (1) Unlike the ISE place-and-route tool, Torc allows the user to provide explicit heuristics or code, which generate and retain arbitrary routes, or which withhold arbitrary resources or regions of the device. (2) There are no subsequent mapping or transformation steps performed before bitstream generation, so the user is guaranteed that any changes will be retained as applied. (We caution, however, that certain settings in the physical netlist have semantic meaning only, with no impact on the bitstream.) Comparable assurances are much more elusive at the generic netlist level.

The physical netlist API consists of roughly 30 classes. It functions in a manner similar to the generic netlist API, with a design object that can be used to access any module, instance, net, pin, pip, or configuration setting. Examples demonstrating functionality comparable to that of the generic netlist in Figure 2 are omitted in the interest of space.

C. Device Architecture

The device architecture API includes exhaustive knowledge of the device wiring and logic, and exposes that information through the API. It also tracks wire and arc usage, to prevent contention with existing nets or resources, and to inform routers of the resources available. Furthermore, it provides the physical and bitstream APIs with tile maps, logic site maps, and usage information.

The device database (DDB) is built upon proven methods and representations for very large and irregular devices [2]. While many routing algorithms expect access to a fully expanded routing graph, the size of such a graph quickly becomes prohibitively large for modern FPGAs. DDB instead works with the graph as a collection of unique segment shapes (internally known as compact segments), instantiated throughout the device and dynamically expanded. The resulting memory footprint drops by many orders of magnitude, while still providing very good performance.

Consider the Virtex6 XC6VLX760: This device consists of 163,349 tiles of 68 distinct types, arranged in a 379×431 grid. 67,489,766 tilewires in the device (60,150,275 real and 7,339,491 trimmed) compose 23,041,389 unique segments (8,339,132 trivial and 14,702,257 non-trivial). These segments can be internally represented as 30,775 unique segment shapes, instantiated hundreds of times throughout the device. The device also contains 197,039 logic sites of 37 distinct types, whose pin connections are described by 210 unique pin maps. We obtain this information from 22 GB of raw data, and distill it down to a 28 MB database.

The device architecture API presently includes exhaustive databases for 140 devices in 11 Xilinx families.

D. Bitstream Interface

The bitstream API supports reading, modifying, and writing bitstream packets and configuration frames for supported Xilinx architectures. Xilinx publishes a wealth of information about bitstreams, but does not disclose information about configuration frame contents, and Torc consequently treats frames as black-box containers. Unfortunately, no comparable information is available to us for Altera architectures.

The bitstream API sets up a mapping between frame indexes and frame addresses, enabling it to correctly interpret both full and partial bitstreams, and to properly overlay frames. It also understands all documented settings in configuration controller registers, and is able to display them in symbolic form, to simplify human-readable packet dumps.
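To give a sense of what displaying packets in symbolic form involves, the following sketch decodes a Type 1 packet header word. It is not Torc's implementation. It assumes the Type 1 header layout that Xilinx documents in its configuration user guides for Virtex-family devices: header type in bits 31:29, opcode in bits 28:27, register address in bits 26:13, and word count in bits 10:0. The struct and function names are hypothetical, and only a handful of register names are listed, following the Virtex-5 numbering.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Fields of a Type 1 configuration packet header (assumed layout, see lead-in).
struct Type1Header {
    unsigned type;       // bits 31:29, expected to be 1
    unsigned opcode;     // bits 28:27
    unsigned reg;        // bits 26:13
    unsigned wordCount;  // bits 10:0
};

// Split a 32-bit header word into its fields.
Type1Header decodeType1(std::uint32_t word) {
    Type1Header h;
    h.type = (word >> 29) & 0x7u;
    h.opcode = (word >> 27) & 0x3u;
    h.reg = (word >> 13) & 0x3FFFu;
    h.wordCount = word & 0x7FFu;
    assert(h.type == 1u);  // this sketch handles Type 1 packets only
    return h;
}

// Symbolic opcode names.
std::string opcodeName(unsigned opcode) {
    switch (opcode) {
        case 0: return "NOP";
        case 1: return "READ";
        case 2: return "WRITE";
        default: return "RESERVED";
    }
}

// Symbolic names for a few registers (illustrative subset, Virtex-5 numbering).
std::string registerName(unsigned reg) {
    switch (reg) {
        case 0: return "CRC";
        case 1: return "FAR";
        case 2: return "FDRI";
        case 3: return "FDRO";
        case 4: return "CMD";
        default: return "UNKNOWN";
    }
}

// Produce a one-line, human-readable description of a header word.
std::string describe(std::uint32_t word) {
    Type1Header h = decodeType1(word);
    return opcodeName(h.opcode) + " " + registerName(h.reg) + " words="
        + std::to_string(h.wordCount);
}
```

With these assumptions, the header word 0x30008001 decodes as a one-word write to the CMD register, and 0x30002001 as a one-word write to the frame address register (FAR).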

V. TOOLS

Torc includes CAD tools to perform unpacking, placing, and routing, with packing still under development. The tools are provided as source code, rather than executables, and can serve as guides for working with the physical netlist and device architecture APIs. They can also provide the foundation for special-purpose heuristic development. Users are free to use, modify, or improve the tools provided, or write their own tools from scratch.

A. Packer

The term packing is often understood to mean combining logic functions or gates into LUTs, or combining LUTs and flip-flops into simple logic blocks or clusters, mindful of the impact on circuit performance, but with very few physical design rules to constrain such operations. But blocks in real architectures are considerably more complex, with multiple paths to and from internal elements, and constraints governing signals and resources. Packing logic into these blocks is more like a series of little placement and routing problems.

Although no mapper is currently scheduled for development, a logic block packer is still useful in that it allows the user to work more naturally with circuitry, in terms of LUTs, flip-flops, and other basic elements. While a regular primitive might be a SLICE, its internal elements would include LUTs, flip-flops, higher-order muxes, a carry chain, and input inverters. The packer’s job is to combine design elements into physical primitives, without violating physical design rules, and to generate the resulting configuration settings. The availability of the packer thus allows the user to (1) describe circuitry in a more architecture-independent manner, and to (2) let the packer satisfy the design rule constraints for the various physical primitives.

B. Unpacker

An interesting conclusion that emerges is that manipulating the logic inside physical primitives calls for unpacking capability. Consider for example a SLICE with a LUT driving a flip-flop. Suppose further that all of the LUT inputs are in use, and that we need to gate the LUT output with an additional signal. At the physical netlist level, this will necessarily require the use of an extra LUT or mux, which means that the existing LUT will probably have to be removed from the SLICE, so that a new LUT can be inserted into the signal path. Not only will the existing LUT have to be re-placed and re-routed, but its direct and dependent configuration settings will have to be adjusted accordingly. In the more general case this becomes unnecessarily burdensome for the user, and it makes more sense to first unpack the region of interest, modify the circuit as needed, and then incrementally re-pack, re-place, and re-route.

[Figure 3 panels: (a) System design. (b) RS232 sub-circuit. (c) Extracted RS232 sub-circuit.]

Figure 3. XDL sub-circuit extraction example. FPGA Editor views. Figure (a) shows the original design. Figure (b) highlights all nets belonging to the RS232 sub-circuit. And Figure (c) shows the RS232 portion of the original circuit extracted as a separate design.

The placer and router also use unpacking to extract combinational paths and calculate their logic depth. For every synchronous element and every primary output, we traverse the logic cone in the direction of its sources to determine the logic depth on that path. The maximum logic depth on each path is then used by the placer or router to prioritize nets.

C. Placer

Placing assigns physical primitives in the design to physical locations in the device, for a result that is routable and contains no overlapping logic. The placer uses simulated annealing [20], with distances measured in terms of tile map coordinates. Annealing operations select design instances with equal probability, regardless of their logic type, but we require that the placement always remain in a valid state. This forces the placer to track the available locations for each logic type, and to only swap between locations of compatible type.

Placement in real devices introduces additional complexity that academic placers do not have to contend with. In addition to irregularity and heterogeneity in the tile map, real devices may also contain logic blocks that are polymorphous: A Virtex4 SLICEM is a superset of a SLICEL, for example, and a Virtex5 BRAM can be instantiated in nine different ways, and similarly for IOBs. The placer therefore has to manage this extra degree of freedom as it performs its operation.
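The requirement that annealing moves keep the placement valid, including the SLICEM-as-SLICEL case, can be sketched as follows. This is a simplified illustration and not the Torc placer: the data structures, the two-pin net model, the Manhattan-distance cost, and the cooling schedule are all assumptions chosen for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <random>
#include <vector>

enum class LogicType { SliceL, SliceM };

struct Site { int x; int y; LogicType type; int occupant; };  // occupant: instance index, or -1
struct Instance { LogicType type; int site; };                // site: index into the site list
struct Net { int a; int b; };                                 // two-pin net between instances

// A SliceM site can host a SliceL instance, but not the reverse.
bool compatible(LogicType inst, LogicType site) {
    return inst == site || (inst == LogicType::SliceL && site == LogicType::SliceM);
}

// A placement is valid if every instance sits alone on a compatible site.
bool valid(const std::vector<Site>& sites, const std::vector<Instance>& insts) {
    for (std::size_t i = 0; i < insts.size(); ++i) {
        const Site& s = sites[insts[i].site];
        if (s.occupant != static_cast<int>(i) || !compatible(insts[i].type, s.type)) return false;
    }
    return true;
}

// Total Manhattan wire length in tile coordinates.
int cost(const std::vector<Site>& sites, const std::vector<Instance>& insts,
         const std::vector<Net>& nets) {
    int total = 0;
    for (const Net& n : nets) {
        const Site& p = sites[insts[n.a].site];
        const Site& q = sites[insts[n.b].site];
        total += std::abs(p.x - q.x) + std::abs(p.y - q.y);
    }
    return total;
}

// Move instance i to site s; any current occupant of s takes the site that i vacates.
void swapInto(std::vector<Site>& sites, std::vector<Instance>& insts, int i, int s) {
    int from = insts[i].site;
    int other = sites[s].occupant;
    sites[s].occupant = i;
    insts[i].site = s;
    sites[from].occupant = other;
    if (other >= 0) insts[other].site = from;
}

// Simulated annealing that only ever applies type-compatible moves and swaps.
void anneal(std::vector<Site>& sites, std::vector<Instance>& insts,
            const std::vector<Net>& nets, unsigned seed) {
    assert(valid(sites, insts));
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pickInst(0, static_cast<int>(insts.size()) - 1);
    std::uniform_int_distribution<int> pickSite(0, static_cast<int>(sites.size()) - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    int current = cost(sites, insts, nets);
    for (double t = 10.0; t > 0.01; t *= 0.95) {
        for (int step = 0; step < 100; ++step) {
            int i = pickInst(rng);  // instances are chosen with equal probability
            int s = pickSite(rng);
            int from = insts[i].site;
            int other = sites[s].occupant;
            if (s == from) continue;
            if (!compatible(insts[i].type, sites[s].type)) continue;
            if (other >= 0 && !compatible(insts[other].type, sites[from].type)) continue;
            swapInto(sites, insts, i, s);
            int next = cost(sites, insts, nets);
            if (next <= current || unit(rng) < std::exp((current - next) / t)) {
                current = next;                   // accept the move
            } else {
                swapInto(sites, insts, i, from);  // reject: restore the previous state
            }
        }
    }
}
```

Because incompatible candidates are discarded before any state changes, the placement is valid after every step, which mirrors the constraint described above. A real placer would draw candidate sites from per-type lists rather than rejecting random picks.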

D. Router

Routing finds detailed paths that connect sources to sinks for every net, for a result that meets timing requirements and is free of contention. The Torc router includes a preliminary coarse router, a global router, and an underlying signal router.

Timing data for Xilinx devices is unfortunately not available along with XDLRC data, and consequently neither DDB nor the router can accurately guarantee timing. We alleviate this issue by working to reduce delays based on combinational path length, as previously explained. We have also obtained promising results in using Process Design Kits to model wiring delays, but are still working on integrating that capability.

The signal router is an A* search [22] that routes a single source tilewire to one or more sink tilewires, in the order provided. This router respects the DDB wire and arc usage information, which allows it to perform incremental routing without obstructing existing routes.

The global router routes a collection of nets all at once, typically an entire design. It uses a PathFinder implementation [23] to resolve net contention, and assigns the detailed routing of each net to the signal router.

The preliminary router performs coarse analysis on the design, with the help of an Integer Linear Programming (ILP) formulation [24], and generates a set of constraints to guide the global router. The ILP formulation minimizes total wire length, weighted by net criticality, where net criticality is defined

[Figure 4 panels: (a) System floorplan. (b) Blocking route. (c) Static system.]

Figure 4. OpenPR [21] route blocker example. Floorplanner and FPGA Editor views. OpenPR forcibly prevents par from using any routing within the dynamic region, by adding each arc entering the region to a fake net. Figure (a) shows the system floorplan. Figure (b) shows the highlighted fake net. And Figure (c) shows the resulting static system with no connections in the dynamic region. Note that the horizontal net passing through the middle of the dynamic region in Figure (c) has no Programmable Interconnect Points inside the region and is therefore not impacted by reconfiguration.

by the longest combinational path to which the net belongs. This approach prioritizes nets that belong to deep combinational paths, with the intent of minimizing critical path delays in the absence of real timing data.

When the ILP solution is obtained, the selected candidate routes are passed along to the global router as heuristic constraints. This preliminary routing stage significantly reduces the amount of contention that the global router must resolve, and in doing so improves runtime performance considerably. We expect to report performance numbers in future work.

VI. APPLICATIONS

Torc is suitable for a wide variety of situations, including CAD tool research, architecture exploration, and any application that requires precise fine-grained control. Most of our efforts to date have centered around API and tool development, but we nonetheless mention some capabilities built with Torc or presently being ported to Torc.

A. EDIF Obfuscator

Developers of Third Party IP sometimes choose to obfuscate hierarchy and design names in their cores. The EDIF obfuscator imports netlists, identifies and protects the interface of the top-level cell, and replaces all other cell, module, port, instance, and net names with the MD5 hash of a sequential counter. The obfuscated and unobfuscated names are written to a log file. For testing purposes, the resulting NCD and PCF files can be deobfuscated after the design has been mapped.
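The renaming scheme described above can be sketched in a few lines. This is an illustration only and not the Torc obfuscator. MD5 is not part of the C++ standard library, so a 64-bit FNV-1a hash stands in for it here, and the Obfuscator class, its method names, and the leading letter added to each generated name are hypothetical choices.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// 64-bit FNV-1a hash rendered as hex. Stand-in for MD5, which the paper uses.
std::string hashHex(const std::string& text) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (unsigned char c : text) {
        h ^= c;
        h *= 1099511628211ull;                  // FNV prime
    }
    std::ostringstream os;
    os << std::hex << h;
    return os.str();
}

class Obfuscator {
public:
    // protectedNames: the top-level interface names that must be left unchanged.
    explicit Obfuscator(std::set<std::string> protectedNames)
        : mProtected(std::move(protectedNames)), mCounter(0) {}

    // Return the obfuscated form of a name, reusing earlier assignments.
    std::string rename(const std::string& name) {
        assert(!name.empty());
        if (mProtected.count(name)) return name;
        std::map<std::string, std::string>::const_iterator it = mLog.find(name);
        if (it != mLog.end()) return it->second;
        // Hash a sequential counter, then prefix a letter so that the result
        // still starts like an identifier (an assumption of this sketch).
        std::string obfuscated = "n" + hashHex(std::to_string(mCounter++));
        mLog[name] = obfuscated;
        return obfuscated;
    }

    // Original-to-obfuscated mapping, i.e. what would be written to the log file.
    const std::map<std::string, std::string>& log() const { return mLog; }

private:
    std::set<std::string> mProtected;
    std::map<std::string, std::string> mLog;
    unsigned long mCounter;
};
```

Keeping the mapping in one place is what makes later deobfuscation possible: inverting the log recovers the original names in downstream files.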

B. XDL Sub-Circuit Extractor

In very large designs, it can be convenient to extract and view just a subset of the design, for development or debugging reasons. The sub-circuit extractor provides a way to select and isolate portions of a design, based on design names or other special-purpose filters. Figure 3 shows an RS232 core extracted from a simple EDK design.

C. OpenPR

OpenPR [21] is a third-party tool being ported to Torc. The function of interest here is the ability to rigorously prohibit the ISE router from using certain resources. OpenPR accomplishes this with the help of the device architecture API by creating a fake route that blocks every possible path into a region of interest, such that par cannot interfere with it in any way. Figure 4 shows an example of OpenPR’s route blocking, to keep undesired routing strictly outside a floorplanned area.

VII. CONCLUSION

Torc is an open-source C++ infrastructure for reconfigurable computing, suitable for custom research applications, for CAD tool development, and for architecture exploration. Its primary purpose is to promote and facilitate research, by providing a framework for device and design data, allowing researchers to focus on the truly novel and unique aspects of their work.

The Torc infrastructure can read, write, and manipulate EDIF, BLIF, and XDL netlists, as well as Xilinx bitstream

packets. The Torc tools allow unpacking, placing, and routing for full or partial designs, along with additional capabilities to facilitate design manipulation and analysis. In support of these capabilities, Torc provides exhaustive wiring and logic information for 140 Xilinx devices in 11 families: Virtex, Virtex-E, Virtex-II, Virtex-II Pro, Virtex4, Virtex5, Virtex6, Virtex6L, Spartan3E, Spartan6, and Spartan6L. We believe that Altera architectures and designs could also be supported, and we have successfully used Torc internally with custom architectures.

Torc extends a strong foundation of prior work, and we have provided examples of capabilities built with Torc, including an EDIF obfuscator, an XDL sub-circuit extractor, and a third-party partial reconfiguration tool. Although manufacturer timing data is not presently available to us, initial efforts to derive a timing model from the appropriate Process Design Kit are promising. Torc is open-source software and is available at http://torc.isi.edu.

VIII. ACKNOWLEDGMENTS

Thanks to our collaborators at Virginia Tech, at Brigham Young University, and at Interra Systems.

REFERENCES

[1] N. Steiner, A. Wood, H. Shojaei, J. Couch, P. Athanas, and M. French, “Torc: Towards an Open-Source Tool Flow,” in Proceedings of the 2011 ACM Nineteenth International Symposium on Field-Programmable Gate Arrays, FPGA 2011, (Monterey, California), February 27–March 1, 2011.
[2] N. J. Steiner, “A standalone wire database for routing and tracing in Xilinx Virtex, Virtex-E, and Virtex-II FPGAs,” Master’s thesis, Virginia Tech, August 2002.
[3] S. Guccione, D. Levi, and P. Sundararajan, “JBits: Java based interface for reconfigurable computing,” in Proceedings of the Second Annual Military and Aerospace Applications of Programmable Devices and Technologies Conference, MAPLD 1999, (Laurel, Maryland), September 28–30, 1999.
[4] N. J. Steiner, “Autonomous computing systems,” Ph.D. dissertation, Virginia Tech, March 2008.
[5] Interra Systems, Inc., “NOM – Netlist Object Model,” http://www.interrasystems.com/eda/pdf/NOM Datasheet.pdf.
[6] fpgaCAD, http://fpgacad.ece.wisc.edu.
[7] P. A. Jamieson, K. B. Kent, F. Gharibian, and L. Shannon, “Odin II: an open-source verilog HDL synthesis tool for FPGA CAD flows,” in Proceedings of the 2010 ACM/SIGDA 18th Annual International Symposium on Field-Programmable Gate Arrays, FPGA 2010 (Monterey, California), February 21–23, 2010, p. 288.
[8] V. Betz and J. Rose, “VPR: A new packing, placement and routing tool for FPGA research,” in Proceedings of the 7th International Workshop on Field-Programmable Logic and Applications, FPL 1997, (London), September 1–3, ser. Lecture Notes in Computer Science, W. Luk, P. Y. K. Cheung, and M. Glesner, Eds., vol. 1304. Springer Verlag, 1997, pp. 213–222.
[9] Berkeley Logic Synthesis and Verification Group, “ABC: A system for sequential synthesis and verification,” http://www.eecs.berkeley.edu/~alanmi/abc.
[10] BYU EDIF Tools Home Page, Brigham Young University, http://reliability.ee.byu.edu/edif.
[11] BYUCC Edif Parser, Brigham Young University, http://splish.ee.byu.edu/download/edifparser/edif parser download.html.
[12] J. Cong, J. Peck, and Y. Ding, “RASP: A general logic synthesis system for SRAM-based FPGAs,” in Proceedings of the 1996 ACM 4th Annual International Symposium on Field-Programmable Gate Arrays, FPGA 1996 (Monterey, California), February 11–13, 1996, pp. 137–143.
[13] D. Lewis, E. Ahmed, G. Baeckler, V. Betz et al., “The Stratix II logic and routing architecture,” in Proceedings of the 2005 ACM/SIGDA 13th Annual International Symposium on Field-Programmable Gate Arrays, FPGA 2005 (Monterey, California), February 20–22, 2005, pp. 14–20.
[14] J. Luu, I. Kuon, P. Jamieson, T. Campbell, A. Ye, M. Fang, and J. Rose, “VPR 5.0: FPGA CAD and architecture exploration tools with single-driver routing, heterogeneity and process scaling,” in Proceedings of the 2009 ACM/SIGDA 17th Annual International Symposium on Field-Programmable Gate Arrays, FPGA 2009 (Monterey, California), February 22–24, 2009, pp. 132–142.
[15] C. Lavin, M. Padilla, P. Lundrigan, B. Nelson, and B. Hutchings, “Rapid Prototyping Tools for FPGA Designs: RapidSmith,” in Field-Programmable Technology (FPT 2010). International Conference on, December 2010.
[16] A. Stepanov and M. Lee, “The standard template library,” HP Laboratories, Tech. Rep. 95-11(R.1), 1995.
[17] Boost, http://www.boost.org.
[18] Electronic Design Interchange Format, Electronic Industries Association, 1988.
[19] Berkeley Logic Interchange Format (blif), University of California, Berkeley, 1996.
[20] N. Sherwani, Algorithms for VLSI Physical Design Automation, 2nd ed. Boston: Kluwer Academic Publishers, 1995.
[21] A. A. Sohanghpurwala, “OpenPR: An Open-Source Partial Reconfiguration Tool-Kit for Xilinx FPGAs,” Master’s thesis, Virginia Tech, December 2010.
[22] N. J. Nilsson, Principles of Artificial Intelligence. Palo Alto, California: Tioga Publishing Company, 1980.
[23] L. McMurchie and C. Ebeling, “PathFinder: A negotiation-based performance-driven router for FPGAs,” in Proceedings of the 1995 ACM 3rd International Symposium on Field-Programmable Gate Arrays, FPGA 1995, (Monterey, California), February 12–14, 1995, pp. 111–117.
[24] T.-H. Wu, A. Davoodi, and J. T. Linderoth, “GRIP: Scalable 3D global routing using integer programming,” in Proceedings of the 46th Annual Design Automation Conference, DAC 2009 (San Francisco, California), July 26–31, 2009, pp. 320–325.
