BIL: A TOOL-CHAIN FOR BITSTREAM REVERSE-ENGINEERING

Florian Benz, André Seffrin, and Sorin A. Huss
Integrated Circuits and Systems Lab, Technische Universität Darmstadt, Germany
email: [email protected], {seffrin|huss}@iss.tu-darmstadt.de

ABSTRACT

This paper performs an investigation into the security of Xilinx FPGA bitstreams, introducing a tool-chain for reversing bitstreams back to their device-specific netlists. Bitstream reversal is performed by querying a database containing the mapping of bitstream bits to their related configurable FPGA resources and a secondary database describing the FPGA structure. The mapping database is created by applying an algorithm that correlates binary bitstream data with data extracted from a corresponding netlist. The resource database is derived from a textual device description which can be obtained from the Xilinx design flow. The method can successfully reverse certain sections of the bitstream, although complete bitstream reversal remains infeasible for the time being. The presented tool-chain, the Bitfile Interpretation Library (BIL), improves on previous attempts at bitstream reverse engineering. It is made available as open source for further development.

1. INTRODUCTION

The readback of bitstream data is a possible attack vector for FPGA devices. Some devices offer optional encryption of their configuration memory, but it has been shown in [1] that the bitstream may still be retrievable by means of side-channel attacks. One may argue that the interpretation of the recovered bitstream is nontrivial: the mapping of memory addresses to resources is unknown, since the bitstream format has not been publicly disclosed. In this regard, manufacturers largely rely on the principle of security through obscurity. However, the Xilinx design flow provides an opportunity for gaining information about the bitstream format, as it allows a netlist of the synthesized design to be obtained in the form of an XDL file. This work explores to what extent bitstreams can be reversed by exploiting this fact, and which obstacles remain for full bitstream reversal. It aims to find out whether full bitstream reversal is feasible, which would present a security problem. The presented approach can be obstructed by no longer making XDL netlists available within the design flow. Unfortunately, this would also have the consequence that various FPGA design tools from the scientific community, which also use the format, could no longer be applied.

A first attempt at reversing bitstreams has been presented in [2]. The associated tool-chain, called Debit, correlates binary configuration data with data from the corresponding netlist, which is accessible within the Xilinx design flow. However, its approach is very basic, and the project has since been discontinued. This paper presents improvements to this approach by exploiting XDLRC files as an additional source of information and presents a new library with extended analysis capabilities, the Bitfile Interpretation Library (BIL) [3]. It currently works with all subtypes of the Xilinx Virtex-5 series.

A short overview of the BIL project is given in Figures 1 and 2. In the steps illustrated in Figure 1, the structure of the bitstreams belonging to a particular FPGA device is analyzed. From these actions, a configuration database is set up. In the steps illustrated in Figure 2, the tool-chain employs this information for reversing a particular bitstream.

[Figure 1: block diagram. Inputs: XDLRC, Bitfile, XDL. Tools: xdlrc_convert, v5data_gen, v5cfgmap_gen, bitcorrelate. Generated data files: Device Description, Address Layouts, Device List, Tile Configuration Map, Configuration Database.]

Fig. 1. BIL tool-chain for FPGA analysis. Its tools generate all data files necessary for subsequent bitstream reversals from an XDLRC description and a bitstream including its corresponding XDL design.

[Figure 2: block diagram. Inputs: Bitfile and the data files (Device List, Device Description, Address Layouts, Tile Configuration Map, Configuration Database). Tools: bit2xml, bit_extract, bit_reverse. Outputs: Viewable XML, Configuration Memory Image, Recovered XDL.]

Fig. 2. BIL tool-chain for bitstream reversal.

The structure of this paper is as follows: Section 2 gives an overview of related work. Section 3 details the Xilinx XDLRC description files and how they can be utilized for bitstream reversal. In Section 4, the components of the tool-chain and some practical results are presented. Section 5 concludes the paper.

2. RELATED WORK

The Debit [2] project represents the first attempt at automatically reversing bitstreams by employing a correlation approach. However, the mapping obtained using the reference implementation of this method is largely incomplete. The algorithm lacks an exhaustive list of available resources and has no means of determining whether it has decoded all resources successfully. Without any description of static resources such as wiring, a netlist cannot be rebuilt from the obtained state of the configurable resources.

The FPGA Analysis Tool (FAT) is a framework for low-level analysis and verification of FPGA designs [4]. It can determine the configuration memory bit mapping of the FPGA resources by using a simple but time-consuming "brute-force" approach. This is achieved by generating an XDL design with a preferably minimal resource change compared to a reference XDL design, converting this updated design into a bitstream using the Xilinx tools, and comparing the newly generated bitstream to the reference bitstream.

RapidSmith is a Java-based FPGA CAD tool for modern Xilinx FPGAs [5]. It has a rich feature set for handling the XDL and XDLRC formats. It is able to extract a configuration memory image from a bitstream. However, it has no means for interpreting this memory image.

3. XDLRC FILES

The XDLRC file format is a Xilinx-specific plain-text format for describing the structure and resources of an FPGA device. It is a key resource for reverse-engineering, since XDL files reference the resources defined in XDLRC descriptions. XDLRC files can be created by a program from the Xilinx tool-chain.

3.1. File Format

An XDLRC file models the FPGA structure as a rectangular grid of tiles. The tile grid is not related to the physical FPGA layout in a spatial sense, but is an abstract functional model. However, each physical site, which denotes a rectangular area enclosed by surrounding wire channels, maps to a unique subset of tiles. Each tile features a grid position, a unique name, and a type, and may contain an arbitrary number of entities. A primitive site represents a logical unit that is connected to the tile wires through its input and output pins. Every primitive site is described as an instance of a so-called primitive definition, which is included in another XDLRC section. A wire connects to an arbitrary number of wires on the same or other tiles. A programmable interface point (PIP) connects two wires within a tile. The configuration decides whether this connection is active or inactive.

Furthermore, XDLRC files contain a list of the mentioned primitive definitions. Each definition has a unique type name and contains a list of input and output pins by which the instantiated primitive definitions are connected to the tile wires. The definitions also contain elements, each of which has a primitive-wide unique name and can be configured either by a set of mutually exclusive options, by an equation (LUT functionality), or by binary vectors (for RAM and ROM contents).

An XDLRC file provides full information about static as well as configurable resources for a distinct FPGA. An XDL design targeting that FPGA only gives information on the configurable resources (i.e., PIPs and elements). Unlike the XDLRC files, XDL files contain only active PIPs and those primitive sites that contain elements not set to their default state.

3.2. Processing XDLRC Data

A difficult aspect when processing XDLRC files is their size. XDLRC files describing the low-end models of the Virtex-5 FPGA series are at least 1 GB in size, reaching up to 13 GB for the largest devices.
Due to high memory consumption and long processing times, it is impractical to use them directly. Instead, we decided to convert them into a custom, smaller format. As a result of these two conversions, the XDLRC data can be reduced to roughly 0.1% of its original size without any data loss. For example, the 13 GB XDLRC file for the xc5vlx330t can be compressed into just 15 MB by our tool-chain.
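To make the tile model of Section 3.1 concrete, the following Python sketch stores every tile type once and lets each tile refer to its type by name. It only illustrates why a per-tile textual description is highly redundant. The class names, field names, and wire names are hypothetical and do not reflect the binary format actually written by xdlrc_convert. The sketch assumes that tiles of the same type contain the same resources.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Pip:
    """Programmable interface point: connects two wires within a tile."""
    src_wire: str
    dst_wire: str


@dataclass(frozen=True)
class TileType:
    """Resources shared by all tiles of one type (hypothetical layout)."""
    name: str
    wires: Tuple[str, ...]
    pips: Tuple[Pip, ...]
    site_types: Tuple[str, ...]  # names of primitive definitions


@dataclass(frozen=True)
class Tile:
    """One grid position; refers to its type instead of copying it."""
    name: str
    row: int
    col: int
    type_name: str


@dataclass
class Device:
    tile_types: Dict[str, TileType] = field(default_factory=dict)
    tiles: List[Tile] = field(default_factory=list)

    def add_tile(self, name: str, row: int, col: int, tile_type: TileType) -> None:
        # Store the type description only the first time it is seen.
        self.tile_types.setdefault(tile_type.name, tile_type)
        self.tiles.append(Tile(name, row, col, tile_type.name))

    def pips_of(self, tile_name: str) -> Tuple[Pip, ...]:
        # Resolve the tile's resources through the shared type table.
        tile = next(t for t in self.tiles if t.name == tile_name)
        return self.tile_types[tile.type_name].pips


# Example with made-up wire names: three INT tiles share one type record.
int_type = TileType(
    name="INT",
    wires=("W_A", "W_B", "W_C"),
    pips=(Pip("W_A", "W_B"), Pip("W_B", "W_C")),
    site_types=(),
)
device = Device()
for row in range(3):
    device.add_tile(f"INT_X0Y{row}", row, 0, int_type)
```

With this layout, the wire and PIP lists are kept once per tile type, while each additional tile of that type costs only a name, a grid position, and a type reference.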

[Fig. 3, listing header] Disassembly of jpegenc.bit. File meta data: Source file: jpegenc.ncd;UserID=0xFFFFFFFF; Target device: 5vlx30ff324; Creation date: 2012/02/14; Creation time: 16:43:44. The packet stream is listed in the table of Fig. 3.
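The packet stream disassembled in Fig. 3 consists of 32-bit words. As a minimal sketch, the following Python code decodes Type 1 packet headers, assuming the field layout given in the Virtex-5 configuration guide [6]: header type in bits 31:29, opcode in bits 28:27, register address in bits 26:13, and word count in bits 10:0. The register table is a subset recalled from that guide and should be checked against it; the opcode name REGISTER_READ and the function names are assumptions, and the header words in the example were constructed to match rows of Fig. 3 rather than read from jpegenc.bit. This is not the implementation used by bit2xml.

```python
SYNC_WORD = 0xAA995566

# Opcode names follow Fig. 3; REGISTER_READ is an assumed counterpart.
OPCODES = {0: "NO_OP", 1: "REGISTER_READ", 2: "REGISTER_WRITE"}

# Subset of Virtex-5 configuration registers (assumption, verify against [6]).
REGISTERS = {
    0: "CRC", 1: "FAR", 2: "FDRI", 3: "FDRO", 4: "CMD", 5: "CTL0",
    6: "MASK", 7: "STAT", 8: "LOUT", 9: "COR0", 10: "MFWR", 11: "CBC",
    12: "IDCODE", 13: "AXSS", 14: "COR1", 15: "CSOB", 16: "WBSTAR",
    17: "TIMER", 22: "BOOTSTS", 24: "CTL1",
}


def decode_type1(word):
    """Split a Type 1 packet header into (opcode, register, word count)."""
    if word >> 29 != 0b001:
        raise ValueError(f"not a Type 1 packet header: {word:#010x}")
    opcode = OPCODES.get((word >> 27) & 0x3, "RESERVED")
    address = (word >> 13) & 0x3FFF
    count = word & 0x7FF
    # Unknown addresses are named REG<n>, as in the REG19 row of Fig. 3.
    register = None if opcode == "NO_OP" else REGISTERS.get(address, f"REG{address}")
    return opcode, register, count


def disassemble(words):
    """Skip everything up to the sync word, then decode Type 1 packets."""
    it = iter(words)
    for word in it:
        if word == SYNC_WORD:
            break
    packets = []
    for word in it:
        opcode, register, count = decode_type1(word)
        payload = [next(it) for _ in range(count)]  # data words of this packet
        packets.append((opcode, register, payload))
    return packets


# Dummy word, bus width pattern, dummy word, sync word, then four packets.
words = [
    0xFFFFFFFF, 0x000000BB, 0x11220044, 0xFFFFFFFF, SYNC_WORD,
    0x20000000,              # NO_OP
    0x30020001, 0x00000000,  # write WBSTAR, 1 word
    0x30026001, 0x00000000,  # write register 19, 1 word
    0x30018001, 0x0286E093,  # write IDCODE, 1 word
]
packets = disassemble(words)
```

The sketch stops at Type 1 packets; the Type 2 packets that carry the bulk of the frame data later in a bitstream would raise a ValueError here and need their own decoder.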

3.3. Configuration Memory Mapping

When correlating XDL data with the corresponding binary configuration data, the data must be partitioned into a set of pairs, each containing a configuration data chunk and the corresponding XDL data. This has to be done in such a way that subsets of pairs can be intersected with each other in order to isolate the bits responsible for a specific PIP or element configuration. Furthermore, the pairs have to be as small as possible in order to minimize the search space. We decided to partition the data per tile, as tiles are the smallest building blocks of the FPGA. Because tiles of the same type contain the same resources, one can expect that they also exhibit the same mapping of configuration bits and can therefore be intersected safely.

Thus, one must know the mapping of configuration data to tiles, i.e., which configuration addresses hold the configuration data of which tile. Although the description of these so-called frame addresses exhibits some spatial ordering [6], their mapping onto tiles is not documented. Fortunately, this mapping can largely be inferred from the XDLRC data by an automated process. A program has been incorporated into the tool-chain which generates a lookup table from XDLRC data. The table contains the associated address parts and in-frame offsets for every tile.

3.4. Correlation

The correlation approach from [2] can be improved with the help of the XDLRC data. By using a lookup table generated as per Section 3.3, the corresponding configuration data chunk can be fetched for tiles of all types. This allows for a correlation of all tiles and not only of tiles of type INT, as performed by Debit. Additionally, the algorithm is no longer forced to obtain the list of available PIPs and element options from the XDL, which will almost never refer to all available resources on the FPGA. Instead, this list is taken from the XDLRC file.
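The intersection idea can be illustrated with a deliberately simplified Python sketch. It is not the algorithm of [2] or of bitcorrelate: it merely intersects, per tile type, the sets of configuration bits found in all tiles where a given PIP is active, and it takes the full PIP list from a resource universe standing in for the XDLRC data. The function names, the input layout (one tuple per tile holding its type, the positions of the set bits in its chunk, and its active PIPs), and the PIP names are assumptions made for the example.

```python
from collections import defaultdict


def correlate(observations, universe):
    """Intersect per-tile bit sets to find candidate bits for each PIP.

    observations: iterable of (tile_type, set_bit_positions, active_pips)
    universe:     dict mapping tile_type to all PIP names of that type
    """
    by_type = defaultdict(list)
    for tile_type, bits, pips in observations:
        by_type[tile_type].append((frozenset(bits), frozenset(pips)))

    database = {}
    for tile_type, all_pips in universe.items():
        result = {}
        for pip in sorted(all_pips):
            # Bit sets of every tile of this type in which the PIP is active.
            chunks = [bits for bits, pips in by_type[tile_type] if pip in pips]
            # A PIP never used in the design yields an empty candidate set.
            result[pip] = frozenset.intersection(*chunks) if chunks else frozenset()
        database[tile_type] = result
    return database


def coverage(database):
    """Fraction of PIPs per tile type with at least one candidate bit."""
    return {
        tile_type: sum(1 for bits in result.values() if bits) / len(result)
        for tile_type, result in database.items()
        if result
    }


# Toy data: bit 3 belongs to PIP "p1", bit 10 to "p2"; "p3" is never used.
observations = [
    ("INT", {3, 10}, {"p1", "p2"}),
    ("INT", {3}, {"p1"}),
    ("INT", {10}, {"p2"}),
]
universe = {"INT": {"p1", "p2", "p3"}}
database = correlate(observations, universe)
report = coverage(database)
```

Because the resource universe lists every PIP of a tile type, unused PIPs such as "p3" show up as empty entries instead of being silently absent, which is what makes a coverage figure possible.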
Thus, after a correlation run the quality of the results can be quantified. As some steps in the correlation algorithm require knowledge about connections and adjacency, this information can be fetched from XDLRC data and filled into specific data structures prior to correlation, resulting in a much faster correlation process. Finally, it is possible to set up the result database from the XDLRC data prior to the correlation. Thus, only the collected bit positions are filled in during the correlation. This leads to a database structure which is independent of the processed XDL design and therefore offers the opportunity to merge several databases generated from the same device. Such an approach may be useful if a single correlation run does not produce sufficient data.

Index  Packet type       Opcode          Register  Word count  Data
0      Dummy word                                              0xffffffff
1      Dummy word                                              0xffffffff
2      Dummy word                                              0xffffffff
3      Dummy word                                              0xffffffff
4      Dummy word                                              0xffffffff
5      Dummy word                                              0xffffffff
6      Dummy word                                              0xffffffff
7      Dummy word                                              0xffffffff
8      Buswidth pattern
9      Dummy word                                              0xffffffff
10     Dummy word                                              0xffffffff
11     Sync word
12     Type 1 packet     NO_OP
13     Type 1 packet     REGISTER_WRITE  WBSTAR    1           0x00000000
14     Type 1 packet     REGISTER_WRITE  CMD       1           NULL
15     Type 1 packet     NO_OP
16     Type 1 packet     REGISTER_WRITE  CMD       1           RCRC
17     Type 1 packet     NO_OP
18     Type 1 packet     NO_OP
19     Type 1 packet     REGISTER_WRITE  TIMER     1           0x00000000
20     Type 1 packet     REGISTER_WRITE  REG19     1           0x00000000
21     Type 1 packet     REGISTER_WRITE  COR0      1           0x00003fe5
22     Type 1 packet     REGISTER_WRITE  COR1      1           0x00000000
23     Type 1 packet     REGISTER_WRITE  IDCODE    1           0x0286e093
24     Type 1 packet     REGISTER_WRITE  CMD       1           SWITCH

Fig. 3. Start of bitstream disassembly generated by bitview

4. BITFILE INTERPRETATION LIBRARY (BIL)

4.1. Overview

BIL consists of two command-line tool-chains: one for FPGA analysis as presented in Figure 1, and one for bitstream reversal as shown in Figure 2. Prior to the use of the reversal tools, the required data files must be created by executing an analysis for the respective target devices. The analysis tool-chain contains the programs v5data_gen, xdlrc_convert, v5cfgmap_gen, and bitcorrelate. The programs bit2xml, bit_extract, and bit_reverse make up the reversal tool-chain.

v5data_gen generates basic data describing all 26 Virtex-5 devices. This includes address layouts that specify how the configuration memory of a particular device is structured and addressed. In addition, a list containing the device names and IDs is generated.

xdlrc_convert parses a given XDLRC file and transforms it into a custom binary format. Based on the principles presented in Section 3.2, it compresses XDLRC files losslessly to roughly 0.1% of their original size. In addition, this tool is quite efficient: it converts the 13 GB XDLRC file for the xc5vlx330t on our test platform (Intel Core 2 Quad CPU at 2.4 GHz, 4 GB RAM) in about 20 minutes, with a peak RAM usage of 63 MB.

v5cfgmap_gen is based on the observations presented in Section 3.3. It generates a lookup table that specifies the configuration memory regions for each tile of a device. It requires the compressed XDLRC data and the address layout of the target device as inputs.

bitcorrelate applies the improved correlation approach presented in Section 3.4 to a given bitstream and a corresponding XDL design. It generates a database that specifies the mapping of resources to configuration bits for every tile type on a given device. A textual report is also created from this binary database. Thus, one is able to identify the resources whose configuration bits have been successfully located.

bit2xml reads in a given bitstream and writes its contents (i.e., meta data and microcode packets) into an XML file. The resulting file can be viewed as a bitstream disassembly directly in a generic web browser. This allows visual inspection of the microcode inside the bitstream (see Figure 3).

bit_extract interprets the microcode of a given bitstream, thereby producing an image of the configuration memory. It handles all three bitstream types (standard, compressed, and debug). Additionally, a text file is generated that lists every frame address in use and its corresponding offset in the image.

bit_reverse maps a given bitstream back to an XDL netlist by using the data files produced by the analysis tool-chain. All recovered PIPs are listed.

4.2. Quality of Correlation and Reversal

When inspecting the correlation results, it becomes apparent that the correlation algorithm from [2] works well with two types of tiles, INT and HCLK tiles. As the INT tiles represent the switchboxes that connect physical sites to global routing, they contain a huge number of PIPs, but no primitive sites. For about 80% of those PIPs, the corresponding configuration memory bit positions can be obtained when correlating a medium-sized design. Note that this number cannot be compared directly to the numbers Debit gives, as Debit has no knowledge of the total number of available PIPs. The HCLK tiles, on the other hand, represent special clock distribution wiring. They are located in the same grid columns as the INT tiles.
They contain no primitive sites and only a small number of PIPs. For about 14% of their PIPs, the bit positions can be obtained. This relatively low number is explained by the fact that it is difficult to produce a test design which uses a larger quantity of those PIPs. In principle, the correlation algorithm appears to work with this tile type. The PIPs of the remaining tile types (between 60 and 80 types, depending on the particular device model) cannot be successfully processed by the correlation algorithm. While not all of them contain configurable resources, there are about 20 of them which are considered to be essential (i.e., tiles containing common resources such as logic blocks, I/O buffers, BRAMs, DSPs, etc.). When correlating tiles of these types, the result sets containing the particular bit positions are empty. This is a strong hint that the underlying assumptions of the correlation algorithm from [2] do not apply here.

When looking at the Xilinx FPGA editor or the XDLRC data, there is evidence that those tile columns share some bits, and therefore they will have to be processed in a different manner. The correlation algorithm is not limited to PIPs and can be extended to also obtain the element configurations of the primitive sites. However, since no tiles containing primitive sites could be successfully correlated, this cannot be tested yet.

The full XDL reversal poses no fundamental problems, provided that the prior correlation step delivers all bit positions. As this is not yet the case, the reversal step just prints out the detected active PIPs. No net reconstruction is performed in the presented version, although it is inherently possible by tracing the wires of recovered PIPs with the help of the static wiring from the XDLRC file.

5. CONCLUSION AND FUTURE WORK

This paper explores to which degree bitstream reversal is feasible on Virtex-5 FPGAs and improves on previous attempts. The presented tool is based on the correlation approach from [2], but enhanced by an appropriate usage of Xilinx XDLRC data. This yields major benefits, such as detailed knowledge of the configuration memory partitioning, faster correlation with feedback about correlation quality, and netlist reconstruction. Our tests have shown that the correlation algorithm does not yield good results for all tile types. This is a new insight, since Debit did not perform tile-wise correlation. Finding suitable approaches for reversing all tile types remains an unsolved challenge, so that the claim made in [2] that full bitstream reversal is essentially an easy task must be questioned.

6. REFERENCES

[1] A. Moradi, A. Barenghi, T. Kasper, and C. Paar, "On the Vulnerability of FPGA Bitstream Encryption against Power Analysis Attacks: Extracting keys from Xilinx Virtex-II FPGAs," in ACM Conference on Computer and Communications Security (CCS 2011), 2011, pp. 111–124.
[2] J.-B. Note and E. Rannaud, "From the bitstream to the netlist," in Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays, 2008, pp. 264–264.
[3] F. Benz, A. Seffrin, and S. A. Huss, "Bitfile Interpretation Library (BIL)," "https://github.com/florianbenz/bil/", 2012, [Online, June 2012].
[4] K. Kępa, F. Morgan, K. Kościuszkiewicz, L. Braun, M. Hübner, and J. Becker, "FPGA Analysis Tool: High-Level Flows for Low-Level Design Analysis in Reconfigurable Computing," in Reconfigurable Computing: Architectures, Tools and Applications, 2009, vol. 5453, pp. 62–73.
[5] C. Lavin et al., "RapidSmith," "http://rapidsmith.sourceforge.net", 2010, [Online, March 2012].
[6] Xilinx, Inc., "Xilinx UG191 v3.18: Virtex-5 FPGA Configuration User Guide," "www.xilinx.com/support/documentation/user_guides/ug191.pdf", August 2009, [Online, March 2012].