White Paper
Beyond Physical: Solving High-end FPGA Design Challenges October 2009
Author Angela Sutton Staff Product Marketing Manager, Synplicity Business Group, Synopsys, Inc.
The advantages of using programmable logic to get electronic products to market quickly with less risk and cost are well known, and recent market drivers have shifted even further in their favor: new economic realities coupled with changing consumer behavior, shorter product life cycles, richer feature sets, and faster upgrades, to name a few. In step with these demands, high-end FPGAs are now architected using geometries down to 40nm and with capacities of up to five million equivalent ASIC gates. They include performance-optimized I/Os and dedicated DSP architectures that together enable extremely powerful and cost-effective solutions. For these reasons, FPGAs are also widely used to realistically prototype and validate ASIC designs at speeds orders of magnitude higher than are possible with traditional acceleration- or emulation-based solutions. These FPGA-based ASIC prototypes allow you to tune and debug your design and, just as importantly, act as vehicles for early system software development while the final ASIC design is underway. Perhaps the only black lining to this silver cloud is the question: for today’s large and complex FPGAs, are the very advantages that make them compelling – time to market (TTM), fast re-spins, and risk/cost reduction – in jeopardy? This paper examines the latest trends, tools and methodologies that you should consider as you begin your next project. Being aware of the issues and solutions will allow you to take full advantage of the vital resources and benefits offered by FPGAs and to navigate potential hurdles.
New Design Challenges Demand New Design Methods
FPGAs have long been used in production communication and data processing systems. Today we see large increases in FPGA usage amongst the ASIC/SoC and system design community. Indeed, over 90% of today’s ASIC designs are first prototyped and validated using one or more FPGAs before committing the design to ASIC silicon.

Why Prototype an ASIC/SoC using an FPGA system?
Designers can actively check, tune and functionally verify digital systems destined for an ASIC using an FPGA hardware platform. An FPGA-based board acts as a software development platform, allowing designers to jumpstart system software development and porting long before the ASIC is available. This parallel ASIC/software development approach allows the final working system to be brought to market months sooner than with traditional approaches. The FPGA-based prototype is also unique in the level of performance it can deliver, and thus in the level of feedback on the ASIC design it can provide. For example, whereas booting an operating system (OS) on an FPGA-based prototype takes a matter of seconds, giving you near-instant design feedback, booting that same OS on a traditional acceleration-based solution or on an emulator would take several hours.
Why FPGAs for Production Systems?
FPGAs are a clear choice for many mid- to lower-volume designs where ASICs are price prohibitive, and for applications where rapid changes and adapting to evolving standards are an advantage (e.g., broadcast base stations, internet and telecommunications switches). FPGAs continue to be a favorite in traditional communications applications, while lower price points and higher performance and capacities have allowed usage to diversify and expand into the consumer, military/aerospace, medical, industrial, and automotive arenas.

Why FPGAs for “early production”?
Companies that seek to bring electronic products to market very quickly in order to stay competitive, or to deliver a proof of concept with a low upfront investment, are increasingly opting to use FPGAs for the first version of a new product, with a view to shifting to ASICs if and when volume ramps up. As the market matures and volumes and price points merit, the design can later be migrated from the FPGA to a cost-reduced ASIC.

One thing all these applications share is increasing FPGA design complexity. Challenges include:
• Prolonged design cycles with longer design iterations and runtimes, because the designs are so large and you may well be pushing the performance envelope
• System design complexity (use of DSP blocks, embedded cores and IP blocks)
• The need to reduce power and part cost
• Traditional debug and verification that takes too long and fails to provide the required coverage

This paper looks at each of these issues and potential solutions in the following sections. A typical FPGA design flow starts with RTL, possibly simulating the RTL for functional correctness, then synthesizing down to gates or placed gates, running place & route (P&R), and finally programming the FPGA on the board. The board-level implementation is then verified and observed for performance and functional correctness. Various tools may be used at this point to debug and trace problems back to the source RTL, where fixes are made and the cycle begins anew. Of course, the larger the FPGA, the less predictable timing performance after synthesis becomes, and any single iteration made to incorporate a small design modification or to improve performance can take half a day from RTL to programmed FPGA on the board; it is common to need 20 to 40 iterations to complete a 5M+ ASIC-equivalent-gate FPGA design. Your design strategy and tools should address other important aspects up front: RTL code analysis, embedded design support, functional verification integration, DSP optimization, physical synthesis for early timing predictability and run-to-run results stability, incremental features that save runtime and preserve working parts of the design, integration with FPGA vendors’ P&R systems, debug with visibility-enhancement technology, and power analysis. The key is that the tools should give you the flexibility, visibility and power to control, predict and converge quickly on a working design, thus cutting down on unnecessary iterations.
Getting Down to Basics: Need for Speed
How can you most effectively achieve your performance goals in the shortest possible time?
Managing constraints
First you want to communicate your design goals to your design tool. “Good constraints” can be as important as the RTL itself – they tell the FPGA design tool where to work hard and focus its effort, and what timing information to report back to you for analysis. Be aware of the importance of telling the tool about multi-cycle and false paths, so that it does not waste time optimizing and reporting a path that it believes to be critical but in fact is not.
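As an illustration, here is a minimal sketch of such timing exceptions in the industry-standard SDC constraint format; the port, register and clock names are hypothetical, not taken from any particular design:

```tcl
# Define the main clock: a 100 MHz goal -> 10 ns period
create_clock -name sys_clk -period 10.0 [get_ports clk]

# A configuration register written rarely is allowed two clock
# cycles to reach the datapath; telling the tool this prevents it
# from treating the path as critical
set_multicycle_path 2 -setup -from [get_cells cfg_reg*] -to [get_cells dsp_in*]

# Paths crossing into an asynchronous clock domain are handled by
# a synchronizer, so exclude them from timing analysis entirely
set_false_path -from [get_clocks sys_clk] -to [get_clocks async_clk]
```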
Setting complete and realistic constraints is key to design success. Figure 1 shows the effect on performance as the clock constraint is gradually increased. A certain baseline frequency is obtained by default, after which you enter a region where the constraint starts to make a difference: the actual clock frequency tracks the required clock frequency until it reaches an optimal value for that RTL. What may seem surprising is that over-constraining a design (e.g., setting a 150MHz goal when you only need 100MHz) can degrade your results compared with what they would have been had you set a constraint just slightly above 100MHz. Typically, it is recommended that in a logic synthesis tool you do not over-constrain by more than 15% (e.g., do not set a constraint of more than 115MHz if your goal is 100MHz). In a physical synthesis tool, where the tool can more accurately predict timing during optimization, do not over-constrain by more than 5%.

[Figure 1 plots the Fmax result (synthesis estimate and post-P&R) against an increasing clock constraint, showing under-constrained, optimum, and over-constrained regions.]

Figure 1: Using realistic constraints helps you get the required performance while optimizing logic utilization for reduced costs.
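To make the margin rule concrete, here is a sketch of the same 100MHz goal expressed as SDC clock constraints with the recommended headroom (names are hypothetical):

```tcl
# Goal: 100 MHz (10 ns period)

# Logic synthesis: over-constrain by at most ~15%
#   115 MHz -> period of 1000/115 = ~8.7 ns
create_clock -name sys_clk -period 8.7 [get_ports clk]

# Physical synthesis: estimates are more accurate, so over-constrain
# by at most ~5%: 105 MHz -> period of 1000/105 = ~9.5 ns
# create_clock -name sys_clk -period 9.5 [get_ports clk]
```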
Most designers wish to meet timing using the least possible chip area. This increases the chance of being able to use a smaller and significantly cheaper FPGA (with the added advantage of potentially lower power consumption and cheaper packaging and board options). Figure 2 shows the part-pricing impact of using a smaller part size or a slower speed grade. Implementation tools that do a good job of optimizing the design to meet timing can in some cases enable you to implement the design in a slower speed-grade FPGA part, saving additional money.
[Figure 2 data: a Xilinx XC5VLX30T-1FFG323CS1 at $400/chip as the baseline. Timing QoR good enough to drop one speed grade brings the part to $320/chip ($80,000 savings for 1,000 parts); area QoR good enough to use one smaller part size brings it to $301/chip ($99,000 savings for 1,000 parts). FPGAs typically carry a 20% price premium for 15% more performance. *Source: Avnet website, quantity 1.]

Figure 2: You can extract significant cost savings by optimizing successfully both for timing and area QoR.
Faster debug turns and results stability matter
Results from a survey of FPGA users performed by Synopsys in October 2008 point to a tidal wave of longer runtimes from RTL to programmed FPGA. Where such design iterations may have averaged a little over three hours each in the past, when designers push the limits of the FPGA (either by selecting a very large FPGA and filling it to capacity, and/or having to meet very stringent performance goals), synthesis and especially P&R times increase dramatically – to well over 24 hours per iteration in some cases. The survey data indicates that an increasing number of high-capacity FPGA designs are taking six months or more to complete. What can be done to win back some of the time-to-market benefits of FPGAs?

Consider runtime, and whether your tools offer incremental design flow options, when deciding on your toolset and methodology. Some synthesis tools provide the option of a significantly faster runtime that trades off some performance optimizations – giving you slightly worse performance but 20-60% faster runtimes. In the early stages of design development, where you are seeking to devise and tune your RTL and constraints and get basic feedback on your design via a working board prototype, fast iterations to the board are key. With the majority of the design tuned and debugged, you might later seek to improve design performance in critical parts of the circuit by synthesizing with the option to optimize for performance (while accepting a longer runtime). Thus, it is important in high-end FPGA design for the tool to offer a “fast” synthesis mode. Using Synopsys’ Synplify Premier FPGA implementation and debug tool in a Xilinx flow, for example, you can get around 60% faster synthesis turnaround time (TAT) than normal synthesis (RTL to netlist) and 20% faster iterations (RTL to board).

Consider other options available to improve debug turnaround time. These might include incremental synthesis and debug flows, multiprocessing, top-down/bottom-up design options, as well as broad language support. Examples of fast debug turnaround features include:
Incremental synthesis flows
• The Synplify Premier and Xilinx ISE guided incremental P&R flow is great for Xilinx designs where you have made a small RTL change in a part of the design that is not timing-critical. This flow attempts to make the smallest possible incremental change to the placement and routing of the design.
• Synplify Pro block-based flows allow you to work on one specific portion of the design while locking down the rest: for example, the Synplify Pro Xilinx Guided flow. In Synplify Premier, you can apply multiprocessing, which allows you to synthesize each block in parallel on a separate processor, speeding up your runtimes further.

Features that make debug in the context of your incremental design changes much easier
• Incremental static timing report generation, typically a synthesis tool feature that allows you to update exception constraints such as multi-cycle or false paths and see the results reflected in timing reports without re-running synthesis. You can also generate a new incremental netlist/constraints file to forward-annotate to P&R without re-running synthesis.
• The assurance that net names will not change greatly from one run to the next. This synthesis tool feature allows you to more easily comprehend and debug your design from one run to the next.

Features that address system design and system debug
• Comprehensive SystemVerilog language support is critical to allowing you to specify and debug system- and bus-level architectures.
Physical Synthesis
An FPGA is comprised of resources such as memories, look-up tables (LUTs), multipliers and DSP elements in fixed locations, plus fixed routing resources. Large timing variations can occur in an FPGA depending on where critical paths are placed and which particular resources are chosen during both synthesis (where a resource
type is chosen) and placement (where the location of that resource is chosen). Without physical information, your timing can be unpredictable. True physical synthesis understands the routing-structure timing of the target FPGA and uses this information during simultaneous logic optimization and placement to achieve a high level of timing correlation between estimates and final post-P&R results. Reaching an aggressive clock frequency in a large, highly utilized FPGA is very time consuming because your resource choices, both cell primitive resources and routing resources, are limited. Good timing predictability during synthesis (a.k.a. tight timing correlation) helps to significantly minimize design iterations. The root cause of the timing predictability problem is well understood: path delays are dominated by interconnect delays. As Figure 3 shows, gone are the days when logic elements (LEs) governed the delay in most paths. Instead, 85-90% of the path delay is actually contributed by the interconnect between LEs. This gives rise to two problems:
1. Synthesis needs interconnect delays to build an accurate timing model of the device and create optimal logic, but since the routing is not known until P&R is run, synthesis can only estimate the interconnect delay. Now that interconnect accounts for almost the whole path delay, the estimates need to correlate as closely as possible with the final P&R timing.
2. Delay estimation models are error-prone. With routing resources limited, and with both local and global routing resources available to choose from on chip, wire delays are unrelated to the proximity between the beginning and end points of the wire. To illustrate, the middle of Figure 3 shows a close-up of a generic FPGA with logic cells grouped together. You don’t see the routing, but if you consider one LUT driving a flip-flop, it might drive a local flip-flop inside the same LE or another flip-flop located further away. Some flip-flops can be reached via fast local interconnect, yet others are reachable only via slower segmented interconnect. Proximity between the beginning and end points of a wire does not dictate which routing resource will be used (and thus the wire delay).
[Figure 3: interconnect delays dominate path timing, and interconnect delays are unpredictable before P&R, so all path delays are unpredictable and timing models break down. As CMOS feature size shrinks from 0.65µm and 0.35µm through 90nm to 40nm, interconnect grows to roughly 90% of path delay versus roughly 10% for LUT and flip-flop delay. A close-up of logic cells shows that, for a LUT driving a flip-flop, one routing choice can make the delay short while another makes it long.]

Figure 3: Because timing models break down in the face of unpredictable path delays, fast timing closure requires that synthesis and P&R work together to meet the goals built into the timing constraints.
Without accurate interconnect information, path delays become unpredictable – and you, as the designer, could spend your time working on paths which may actually meet timing after P&R while ignoring paths which will not. Consider the option of running simultaneous logic synthesis and placement – physical synthesis. This provides you with more accurate and predictable timing earlier in the design process.
[Figure 4: a histogram of timing correlation (% difference between synthesis timing estimates and post-P&R results, from -40% to +40%) for 305 Stratix-III designs, Synplify Pro and Synplify Premier v9.4 engineering suite, July 2008. The Synplify Premier distribution clusters tightly around the optimum at 0%, while the Synplify Pro distribution is much broader.]

Figure 4: Good timing correlation depends on an understanding of placement. The accurate timing correlation delivered by Synplify Premier graph-based physical synthesis means fewer iterations, in addition to faster turnaround time per iteration.
It is key to consider physical information during the synthesis process. Knowledge of placement during RTL synthesis, and of the available routing resources on your chosen FPGA device, allows for more accurate path delay prediction and also gives the synthesis tool the ability to work on and optimize the true critical paths. Figure 4 illustrates the timing correlation advantage that routing-aware physical synthesis (also widely known as graph-based physical synthesis) has over traditional logic synthesis. To ensure that timing is preserved all the way down to the FPGA implementation on the board, this technology (built into the Synplify Premier tool from Synopsys) relies on forward-annotation of legal, enforced placement constraints to the P&R tool. The logic-synthesis-only distribution shows a broad spread, with many critical paths having a margin of error of plus or minus 20%. Graph-based physical synthesis, in which both the design netlist and its placement are simultaneously generated, optimized and then preserved, gives a much tighter correlation: the majority of paths are estimated within a few percent of the final results. This is enough to make synthesis work hard on the true critical paths of the design. It also means that timing analysis can be performed with a high degree of confidence after physical synthesis (on the placed design) and before P&R, providing earlier feedback on your design. This can be of great benefit on large, complex designs where P&R times are long.
Managing Design Size and Complexity
The other big trend in FPGA design is an increasing amount of DSP content, along with the use of embedded cores and IP blocks. DSP use has become commonplace in wireless, medical, industrial, video, security, and military/aerospace applications. These big users of FPGAs need an efficient way to first capture and validate DSP algorithms in a high-level environment and then get the design into hardware such as an FPGA. Altera’s Stratix-III and Stratix-IV, Xilinx’s Virtex-5 and Virtex-6, and other vendors’ devices all have dedicated DSP resources built into their architectures. For example, Altera’s Stratix-IV devices have up to 1,288 18x18 multipliers, which enable math-intensive DSP algorithms to be efficiently implemented in an FPGA. What’s needed to serve these DSP-hungry applications is a highly efficient and productive method for the DSP designer to specify and validate an algorithm at a high level and then quickly generate RTL that can be implemented in hardware. One frequent problem is that the DSP algorithm designer may include specifications that are not cost-effective, or even possible, to implement in hardware. This results in numerous (and avoidable) iterations between the DSP algorithm designer and the RTL coder.
For better results in DSP designs, consider the option of running DSP-specific synthesis – particularly useful if you model your designs in Simulink/MATLAB or wish to use high-level DSP building blocks. In fact, the DSP algorithm designer should have the ability to capture the intended behavior without having to specify the architecture needed for a particular implementation, and to pass the requirements to a hardware designer who can make area and performance tradeoffs with DSP synthesis from that single algorithm specification. The main difference here is that the design team now has the capability to automatically generate device-tuned RTL code that can be synthesized into an FPGA (or ASIC) as needed.

An example of one such flow (Figure 5) uses the Synopsys Synphony HLS high-level algorithmic synthesis tool, which starts with a model-based design approach utilizing a familiar environment such as Simulink® from The MathWorks®, along with a language-based approach for control. RTL code may also be used in conjunction with Simulink and MATLAB for specification. Simulink is suitable for capturing the parallelism natural to hardware and also supports multi-rate operations. A library (blockset) containing common DSP functions such as FIR, FFT, Viterbi decoder, and general math operations is included. This enables fast algorithm capture and validation with Simulink tools such as the Filter Design tool. The power behind this approach comes from the Synphony HLS synthesis engine, which performs high-level architectural optimizations from a single specification. This produces RTL that is tuned for the target device. For example, it knows how fast the DSP48 building blocks on a Xilinx Virtex-5 device are, and can perform important system-wide optimizations such as retiming/pipelining, multi-rate folding and multi-channel optimizations (see Sidebar: DSP Optimization Techniques Used in FPGAs). The resulting RTL code is then used with a synthesis tool like Synopsys’ Synplify Pro or Synplify Premier for implementation in an FPGA (or alternatively, with a tool like Synopsys Design Compiler for implementation in an ASIC).
[Figure 5: the Synphony HLS flow. Algorithm modeling and verification in Simulink and MATLAB, drawing on the Synphony HLS IP library, feeds Synphony HLS architectural transformation and optimization under optimization constraints. The resulting RTL drives either the Synplify Pro/Synplify Premier FPGA RTL implementation flow (to an FPGA) or an ASIC RTL implementation flow (to an ASIC), with verification alongside.]

Figure 5: The power of the Synphony HLS approach used with Synplify Premier synthesis is that it enables high-level architectural optimizations from a single specification. This produces RTL code that is target-device aware, enhancing productivity and design success.
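To ground this, here is a hand-written sketch of the style of RTL such a flow aims to produce: a small 4-tap FIR whose registered 18x18 multiplies and pipelined adder tree give synthesis a natural mapping onto dedicated DSP blocks (such as Xilinx DSP48 slices). The module name, widths and coefficients are illustrative, not actual Synphony HLS output:

```verilog
// Illustrative 4-tap FIR datapath, written so that each registered
// multiply can map onto a dedicated DSP block.
module fir4 #(
    parameter signed [17:0] C0 = 18'sd1023,
    parameter signed [17:0] C1 = 18'sd2047,
    parameter signed [17:0] C2 = 18'sd2047,
    parameter signed [17:0] C3 = 18'sd1023
) (
    input  wire               clk,
    input  wire signed [17:0] din,
    output reg  signed [39:0] dout
);
    reg signed [17:0] x0, x1, x2, x3;   // tap delay line
    reg signed [35:0] p0, p1, p2, p3;   // registered 18x18 products
    reg signed [36:0] s01, s23;         // first adder stage

    always @(posedge clk) begin
        // shift the sample delay line
        x0 <= din;  x1 <= x0;  x2 <= x1;  x3 <= x2;

        // registered multiplies -> candidates for DSP blocks
        p0 <= x0 * C0;  p1 <= x1 * C1;
        p2 <= x2 * C2;  p3 <= x3 * C3;

        // pipelined adder tree keeps the clock rate high
        s01  <= p0 + p1;
        s23  <= p2 + p3;
        dout <= s01 + s23;
    end
endmodule
```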
DSP Optimization Techniques Used in FPGAs
For successful DSP implementation in FPGAs, a high-level approach and architectural optimizations that have prior knowledge of target technology timing and backend flow have become critical. The following are four important DSP optimization methods, along with the benefits and tradeoffs of each.

Pipelining
The goal of pipelining/retiming is to achieve higher clock speeds by inserting or moving registers within the datapath. The cost is, of course, more registers, resulting in more area.

Folding
This technique is also known as scheduling or resource sharing. The goal here is to reduce area by sharing resources, using higher-speed clocks and scheduling operations. The tradeoff is meeting timing for a higher clock rate and the overhead of extra control and mux logic to do the scheduling.

The next two are really special cases of resource sharing applied to specific types of designs.

Multi-channelization
Resource sharing applied to a multi-channel datapath where the datapaths are completely independent. The tradeoffs are similar to folding – to reduce area – but another way to look at it is adding channels with minimal impact on area. The cost is an increased clock rate requirement and a small amount of initial logic overhead.

Multi-rate Folding
When applying folding to a multi-rate design, you can sometimes get something for nothing. By using the fastest clock to apply resource sharing in the slower clock domains, you can reduce the overall area. There really is no cost as long as the savings overcome the control logic overhead.

Note: Architecture choice implies significant tradeoffs in cost/performance, and rapid design exploration from a single model yields significant benefits in final results. Finally, when taking on DSP implementations, it is important to confirm that the tools provide a seamless, direct flow to FPGA and ASIC targets and prototyping platforms.
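As a concrete illustration of folding (and, in spirit, multi-channelization), the sketch below time-multiplexes one physical multiplier across two independent channels by running the shared datapath at twice the sample rate. All names and widths are hypothetical:

```verilog
// Folding sketch: two channels share one multiplier by running the
// shared datapath at 2x the sample clock.
module folded_mul (
    input  wire               clk2x,   // 2x the sample rate
    input  wire               phase,   // toggles each clk2x cycle
    input  wire signed [17:0] a0, b0,  // channel 0 operands
    input  wire signed [17:0] a1, b1,  // channel 1 operands
    output reg  signed [35:0] y0, y1
);
    reg signed [35:0] p;

    // One physical multiplier serves both channels in alternate cycles
    always @(posedge clk2x)
        p <= phase ? (a1 * b1) : (a0 * b0);

    // Demultiplex the shared result back to per-channel registers;
    // p holds the product of the operands selected one cycle earlier
    always @(posedge clk2x)
        if (phase) y0 <= p;   // p holds channel 0's product here
        else       y1 <= p;   // p holds channel 1's product here
endmodule
```

The tradeoff is exactly the one the sidebar describes: one multiplier instead of two, at the cost of a 2x clock requirement plus the small mux/phase control overhead.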
The growing use of embedded cores and IP, on-chip microprocessors and other IP blocks introduces complexity and additional special requirements in FPGA designs. High-end FPGA designers also need tools to simplify the building of IP blocks such as memories in HDL. Standard features include a memory compiler and FIFO generator that create technology-independent RTL for inferred memories. Support for single/dual-port memories, single/multiple clocks, and testbenches should be provided for each RAM model.
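For illustration, here is a minimal sketch of the kind of technology-independent, inference-style RTL such a memory generator might emit for a simple dual-port RAM; the parameters and names are hypothetical, and the registered read lets synthesis map the array onto block RAM:

```verilog
// Technology-independent simple dual-port RAM (one write port,
// one read port, single clock), written in inference style.
module sdp_ram #(
    parameter DATA_W = 18,
    parameter ADDR_W = 10
) (
    input  wire              clk,
    input  wire              we,
    input  wire [ADDR_W-1:0] waddr,
    input  wire [DATA_W-1:0] wdata,
    input  wire [ADDR_W-1:0] raddr,
    output reg  [DATA_W-1:0] rdata
);
    reg [DATA_W-1:0] mem [0:(1<<ADDR_W)-1];

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        rdata <= mem[raddr];   // registered read -> block-RAM friendly
    end
endmodule
```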
Addressing Low Power Consumption Demands
Power consumption in FPGAs is a growing and immediate concern. It impacts not just consumer applications, where an extra fan or a more expensive package drives up BOM cost, but also system reliability. Consider the need for accurate power consumption information early in the design cycle. FPGA vendors like Altera address the problem in the Quartus design environment by implementing programmable power technology that collects high-speed paths together into Logic Array Blocks (LABs). The tool can then put some LABs into the normal high-speed mode (higher power consumption), put some into low-power mode, and turn some off entirely, lowering static power consumption. Besides software solutions, all FPGA vendors, including Actel, Lattice Semiconductor, Xilinx, and others, are attacking the problem via innovative hardware and architectural changes. While big, higher-end FPGAs are not the likely choice for battery-powered applications, power remains a concern for FPGA users. Area optimizations and better utilization of local routing resources can help reduce power consumption. By effectively considering switching activity data, for example, design tools from EDA industry leaders like Synopsys are at the forefront of the effort to lower overall power consumption. Power optimizations and estimates rely on access to good relative statistical signal activity data, which can be used to decide which signals should be assigned short wires and thus consume less power.
In FPGA applications, the Synplify Premier tool can generate SAIF-format switching activity data during logic synthesis; the tool generates an extra file after logic synthesis that drives P&R power reduction and allows you to perform pre-P&R power estimation. Having the synthesis tool generate this data saves time because you no longer have to create simulation test benches in order to get activity data from your simulation tool. Moreover, this flow lets you get early power estimates before running place and route, allowing you to better tune your design. The SAIF activity data generated is of near-simulation quality, taking clock frequency inputs and optional boundary specifications of state probability and activity rates. What is unique about the Synplify Premier approach to generating switching activity data is that it uses sophisticated formal analysis of sequential circuits to predict activity. This technique is in fact more effective than traditional vectorless analysis approaches, which do not differentiate activity levels within clock domains and therefore cannot be used to guide optimization.
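For reference, backward-SAIF activity data is a simple parenthesized format recording, per net, the time spent at 0, 1 and X (T0/T1/TX) and the toggle count (TC). A minimal sketch follows; all names, durations and counts here are illustrative, not tool output:

```
(SAIFILE
  (SAIFVERSION "2.0")
  (DIRECTION "backward")
  (TIMESCALE 1 ns)
  (DURATION 1000000)
  (INSTANCE top
    (NET
      (clk  (T0 500000) (T1 500000) (TX 0)     (TC 2000))
      (data (T0 820000) (T1 170000) (TX 10000) (TC 312))
    )
  )
)
```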
Debug in RTL, Not Gates
The size and complexity of FPGAs have made verification very problematic, similar in many ways to the problems facing ASIC verification. Classic FPGA debug methods are breaking down for several reasons:
• Iterations in the lab are extremely time-consuming
• Real-time stimuli are needed for applications like networking and video/audio processors
• Using probes and making educated guesses does not work at 1M+ gate complexity
• Using gate-level tools to manually “hack” the post-P&R netlist is prone to false errors

Consider the need to debug your design on the board and relate information back to the RT level, where design issues can be corrected. An innovative approach that is proving successful utilizes “embedded HDL analyzers” that provide debug access at the RT level, similar to an RTL simulator, instead of only at the gate level like a logic analyzer. Unique to the Synopsys Identify® tool, this methodology allows users to instrument and debug directly in the RTL with which they are familiar, rather than trying to debug post-synthesis gates with unfamiliar names. Here, instrumenting the design is performed pre-synthesis, while capturing can be done any time the design is active. The benefit? By selecting the observed signals at the HDL level and adding instrumentation directly in your VHDL or Verilog, this approach guarantees the functionality of the watched signal. Here too, it is important to have an incremental debugging flow that allows you to instrument signals that are functionally guaranteed to match the original HDL. By significantly shortening the time for iterative changes to the instrumentation set, the Identify tool’s incremental feature can effectively increase design visibility for designs with limited resources. The feature is also valuable for shortening the time it takes to track a bug back to its origin. Most designers like the control and full visibility that a software simulator such as Synopsys VCS provides.
Debugging and Visibility Enhancement
For large designs with large test benches, a software simulator alone is often not practical, since it may take too long to reach a failure point. Consider whether you need to run the design at full hardware speed and capture the conditions that trigger a failure. Ideally, when a problem occurs, you would like to capture the full state of the design (or module), generate a test bench, and then send both over to a simulator such as VCS, where debug features allow you to analyze the design and the specific steps that led up to the problem.
This is accomplished in the Confirma Identify Pro product with Synopsys’ TotalRecall technology. TotalRecall replicates the design or module in FPGA hardware and inserts a buffer before the replicated design. All inputs go into both the original design and the replicated design, but the inputs to the replica are delayed by the buffer. When a problem occurs in the original design, the full state of the replica, including register and memory contents, is captured, and you can debug it at the RT level. Since the inputs are delayed, the replica effectively contains the full state of the design from some period of time before the problem occurred. That state is then loaded into the simulator, and a test bench is extracted from the stimulus buffer that takes the design from the pre-problem state up to the problem. Since it is in a simulator, you can single-step through the cycles and see exactly what happened. The full state of the design or module, including memory state and logic state, is captured well before the trigger event, allowing you to capture the condition that led to the problem and understand why it occurred. You can convert a sporadic problem seen in the lab into a replayable simulation test bench and replay it in your simulator. With TotalRecall technology, verification engineers can capture the events leading up to an error and export the information to their favorite simulator for analysis.
Conclusion Technological advances and market dynamics are exacerbating the challenges designers face when tackling today’s high-end FPGAs. New “higher productivity” approaches and tools are clearly needed to help design teams rein in schedule delays and part cost, manage bigger designs with increasing DSP content, minimize power consumption, and ensure fast and accurate verification of completed designs.
Synopsys, Inc. 700 East Middlefield Road Mountain View, CA 94043 www.synopsys.com ©2009 Synopsys, Inc. All rights reserved. Synopsys is a trademark of Synopsys, Inc. in the United States and other countries. A list of Synopsys trademarks is available at http://www.synopsys.com/copyright.html. All other names mentioned herein are trademarks or registered trademarks of their respective owners. 11/09.CE.09-18065.