RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR
Soft-Core Processors For Embedded Systems
Jason G. Tong and Ian D. L. Anderson
Supervisor: Dr. M.A.S. Khalid
ICM’06, December 2006
Research Center for Integrated Microsystems
Department of Electrical and Computer Engineering
University of Windsor
18th International Conference on Microelectronics – December 16th – 19th, 2006
OUTLINE
• Introduction
• A Survey of Soft-Core Processors
  – Commercial Cores and Tools
  – Open-source Cores
• Some Example Applications
• Comparison of Soft-Core Processors
• Conclusions and Future Work
Introduction
Embedded Systems
• An embedded system: a system that utilizes custom hardware and software to carry out specific tasks
• Digital hardware:
  – Microprocessor (μP) or microcontroller (μC)
  – Application-specific hardware, generally used for accelerating time-critical tasks
• Embedded software running on the μP or μC
[Figure: block diagram of an embedded system, showing the embedded CPU, the software running on the CPU, memory and I/O, and application-specific hardware]
Core-Based Design
• It makes sense for many designers to reuse pre-designed hardware components called “Intellectual Property (IP) cores”
• Reuse reduces design time at the expense of an area/performance penalty
• Soft-core processors
  – Complete microprocessors described in a hardware description language (HDL) such as VHDL or Verilog
Advantages of Soft-Core Processors
• Higher level of abstraction
  – Easier to understand
• More flexible
  – Designers can change the core by editing source code or selecting parameters (more on that later)
• Platform independent
  – Can be synthesized for any IC technology, including FPGAs and ASICs
  – More immune to obsolescence
A Survey of Soft-Core Processors
Commercial Cores and Tools
• Some of the well-known commercial cores are:
  – Altera’s Nios II
  – Xilinx’s MicroBlaze and PicoBlaze
  – Tensilica’s Xtensa
Nios II Processor
• Nios II – the second generation of Altera’s 32-bit Reduced Instruction Set Computer (RISC) processor
• A configurable processor that can be used in many embedded system applications (e.g., printers, medical instrumentation, DVD players)
• Targeted for multiple Altera FPGAs (e.g., Cyclone series, Stratix series)
• Instantiated using Altera’s System On Programmable Chip (SOPC) CAD tool
Nios II Processor
• Nios II Processor (a 32-bit RISC soft-core processor)
  – 32 General Purpose Registers
  – 32 Interrupt Request Registers
  – Separate Instruction and Data Cache (up to 64 KBytes)
  – Single-Instruction 32x32 Multiplier and Divider
  – Dedicated 64-bit and 128-bit multiply instructions
  – Tightly Coupled Memory (TCM) Modules
  – Three variant cores: Economy, Standard, Fast
Nios II Processor
• Economy Core – Optimized for area, but provides sufficient performance for small applications (e.g., TV remote control, digital clock)
• Standard Core – A trade-off between the Economy and Fast cores: better performance than Economy, but consumes more logic elements (LEs) (e.g., microwave oven controller, DVD players)
• Fast Core – Optimized for best performance on computationally intensive applications, but consumes the most LEs (e.g., multimedia applications)
Images courtesy of Paramount and Universal Studios, California, USA
Nios II Processor Comparison

| Feature                   | Nios II Economy    | Nios II Standard               | Nios II Fast                   |
|---------------------------|--------------------|--------------------------------|--------------------------------|
| Objective                 | Optimized for size | Balance between size and speed | Optimized for high performance |
| Caches (Instruction/Data) | None               | Up to 64KB / None              | 64KB / 64KB                    |
| Pipeline Stages           | 1                  | 5                              | 6                              |
| Hardware Multiply         | Software emulated  | 3 cycles per MUL               | 1 cycle per MUL                |
| Logic Elements Used       | 600-700            | 1200-1400                      | 1400-1800                      |
| Custom Instructions       | 256                | 256                            | 256                            |
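The variant table lends itself to a simple selection rule. The following is a minimal sketch, not Altera tooling: the names `NiosVariant` and `pick_variant` are hypothetical, the logic element figures are the upper ends of the ranges in the table, and pipeline depth is used as a rough stand-in for performance (an assumption made only for this illustration).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NiosVariant:
    name: str
    pipeline_stages: int
    max_logic_elements: int  # upper end of the LE range in the table
    hardware_multiply: bool  # False means multiply is emulated in software


# Figures copied from the Nios II comparison table.
VARIANTS = [
    NiosVariant("Economy", 1, 700, False),
    NiosVariant("Standard", 5, 1400, True),
    NiosVariant("Fast", 6, 1800, True),
]


def pick_variant(le_budget, need_hw_multiply):
    """Return the deepest-pipeline variant that fits the LE budget, or None."""
    fitting = [
        v for v in VARIANTS
        if v.max_logic_elements <= le_budget
        and (v.hardware_multiply or not need_hw_multiply)
    ]
    # Assumption: more pipeline stages is treated as "higher performance".
    return max(fitting, key=lambda v: v.pipeline_stages, default=None)


choice = pick_variant(le_budget=1500, need_hw_multiply=True)
print(choice.name if choice else "no variant fits")
```

With a budget of 1500 LEs and a hardware multiplier required, only the Standard core qualifies; a real selection would also weigh cache sizes and clock frequency.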
Xilinx MicroBlaze
• MicroBlaze – a 32-bit soft-core processor optimized for embedded applications
• Features:
  – Harvard RISC architecture
  – 32-bit instructions
  – 3-stage pipeline
  – 32 general-purpose registers
  – Shift unit
  – 2 levels of interrupt
• Has a number of options including FPU, hardware divider, barrel shifter, caches, etc.
  – Area can be saved by leaving out options that are not needed
Image Source: MicroBlaze Product Brief http://www.xilinx.com/bvdocs/ipcenter/product_brief/MB_sell_sheet.pdf
Xilinx MicroBlaze
• Can be connected to on-chip and off-chip peripheral components using the general-purpose On-chip Peripheral Bus (OPB)
• Interfaced to memory via the Local Memory Bus (LMB) or the OPB
• Designers can use the Fast Simplex Link (FSL) to interface custom hardware accelerators directly to the pipeline
  – FSL: a low-latency interface to the MicroBlaze pipeline
Image Source: MicroBlaze Product Brief http://www.xilinx.com/bvdocs/ipcenter/product_brief/MB_sell_sheet.pdf
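To make the idea of a simplex link to a hardware accelerator concrete, here is a small behavioural model. It is a sketch under assumptions: the link is modelled as a one-directional FIFO of 32-bit words with non-blocking reads and writes, and the class `SimplexLink`, the function `accelerator_step`, and the squaring accelerator are all hypothetical names invented for illustration, not Xilinx APIs.

```python
from collections import deque


class SimplexLink:
    """Toy one-directional FIFO channel: one writer, one reader."""

    def __init__(self, depth):
        self.depth = depth
        self._fifo = deque()

    def full(self):
        return len(self._fifo) >= self.depth

    def empty(self):
        return not self._fifo

    def try_put(self, word):
        """Non-blocking write; returns False when the FIFO is full."""
        if self.full():
            return False
        self._fifo.append(word & 0xFFFFFFFF)  # keep 32-bit data words
        return True

    def try_get(self):
        """Non-blocking read; returns None when the FIFO is empty."""
        return None if self.empty() else self._fifo.popleft()


def accelerator_step(inbound, outbound):
    """Hypothetical accelerator: squares one word if input and space exist."""
    if inbound.empty() or outbound.full():
        return False
    outbound.try_put(inbound.try_get() ** 2)
    return True


# Processor side: send operands on one link, read results from another.
to_hw, from_hw = SimplexLink(depth=4), SimplexLink(depth=4)
for operand in (3, 4):
    to_hw.try_put(operand)
while accelerator_step(to_hw, from_hw):
    pass
results = [from_hw.try_get(), from_hw.try_get()]
print(results)
```

Two links are used because each one carries data in a single direction: one from the processor to the accelerator and one back.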
Xilinx PicoBlaze
• A small 8-bit soft-core microcontroller for simple processing applications
• Specifically optimized for the Virtex and Spartan families of FPGAs, and the CoolRunner-II CPLDs
• VHDL source code can be downloaded for free from Xilinx Inc.’s website
Tensilica Xtensa
• Tensilica’s flagship product: the Xtensa Series of “configurable and extensible” processors
• Configurable: the base core features a set of parameters:
  – Hardware multipliers, single-precision IEEE-754 compatible FPU, varying number of interrupts, cache sizes and write policies, variable instruction and data memory sizes, and others
• Extensible: designers invent custom instructions using the Tensilica Instruction Extension (TIE) language, a Verilog-like language
  – The TIE instructions are turned into hardware using the TIE compiler
Xtensa Design Environment
• The Xtensa Xplorer – a complete design environment for Xtensa processors
• The XPRES Compiler can analyze a C/C++ algorithm and automatically create TIE extensions
• The Xtensa Processor Generator generates HDL code and Electronic Design Automation (EDA) scripts for a customized processor
Image Source: Xtensa Developer’s Toolkit Product Brief http://www.tensilica.com/pdf/xtensa_processor_v3.pdf
Open-Source Cores
• There are numerous soft-cores freely available from open-source communities across the internet
• Two will be reviewed:
  – LEON by Gaisler Research
  – OpenRISC 1200 from opencores.org
LEON by Gaisler Research
• The LEON line of synthesizable processors: based on the SPARC Version 8 architecture
• LEON2 and LEON3: open-source VHDL models
• Integer unit fully compliant with the IEEE-1754 SPARC V8 standard
• Hardware multiply, divide and multiply-accumulate (MAC) units
• LEON2 – 5-stage pipeline, LEON3 – 7-stage pipeline
• Several peripherals available, including FPUs, timers, UARTs, interrupt controllers, etc.
Image Source: LEON2 Processor Overview of 32-bit processing cores http://www.gaisler.com/cms4_5_3/index.php?option=com_content&task=view&id=12&Itemid=52
OpenRISC 1200
• One of the popular processors at www.opencores.org
• A 32-bit or 64-bit RISC architecture with a 5-stage pipeline
• Harvard architecture, containing separate 8KB data and instruction memories
Image Source: OpenRisc 1200 http://www.opencores.org/projects.cgi/web/or1k/openrisc_1200
OpenRISC 1200
• High-performance CPU/DSP architecture – 32-bit architecture implementing the ORBIS32 instruction set
• Advanced Debug Module – non-intrusive debug module for debugging of the OpenRISC processor
• Tick Timer – system clock timer
• Instruction/Data Memory Management Unit – Harvard-style memory organization, 8KB of instruction/data cache memories
Image Source: OpenRisc 1200 http://www.opencores.org/projects.cgi/web/or1k/openrisc_1200
OpenRISC 1200
• Synthesizable and downloadable onto Altera or Xilinx FPGAs
• Real-Time Operating System (RTOS) support: Linux, Micro-Linux, and OAR RTEMS
• Programming language support: C/C++/Java/Fortran
• Peak performance: 250 MHz and 250 DMIPS
• Additional features: Floating Point Unit, and up to 8 peripherals can be connected to the processor core
Image Source: OpenRisc 1200 http://www.opencores.org/projects.cgi/web/or1k/openrisc_1200
Some Example Applications
Communications - Broadcom
• Broadcom used the Xtensa processor in the BCM1500 project
• Designed a class of access communication components for voice, video and data networking
• Five Xtensa processors were used in their CALISTO™ architecture, which manages computationally intensive communications functions
  – e.g., echo cancellation
Image Source: CALISTO VoP Product Brief http://www.broadcom.com/collateral/pb/1500VoP-PB03-R.pdf
Communications – Cisco Systems
• Cisco Systems’ Carrier Routing System (CRS-1)
• Uses 192 Xtensa processors in the Cisco Silicon Packet Processor
• CRS-1 is the only carrier routing system capable of scaling up to 92 terabits per second
Image Source: Cisco Carrier Routing System http://www.cisco.com/application/pdf/en/us/guest/products/ps5763/c1031/cdccont_0900aecd800f8118.pdf
Advertising - AED
• Advanced Electronic Designs (AED) created an LED sign for JPMorgan Chase™ in Times Square in New York City
• 135 feet long and 26 feet high, with a resolution of nearly 2 million pixels
• Used Virtex and Spartan FPGAs and over 1,000 PicoBlaze processors
• One main use of the processors: Ethernet controllers used to send data to different parts of the sign
Image Source: Sign of the Times, Xilinx Xcell Journal, Winter 2004
Security and Authentication - UCLA
• A team of researchers at UCLA has developed the ThumbPod, an FPGA-based fingerprint authentication device
• The LEON2 processor is used along with two co-processors:
  – Advanced Encryption Standard (AES) processor
  – Discrete Fourier Transform (DFT) processor
• The ThumbPod is able to capture a fingerprint, extract its features and return a score from 0 to 100 indicating the degree of matching
Image Source: ThumbPod Puts Security Under Your Thumb, Xilinx Xcell Journal, Winter 2004
Comparison of Soft-Core Processors
Comparison of Soft-Cores

| Feature                        | Nios II (Fast Core)  | MicroBlaze      | Xtensa XL       | OpenRISC 1200     | LEON3                         |
|--------------------------------|----------------------|-----------------|-----------------|-------------------|-------------------------------|
| Speed MHz (ASIC/FPGA)          | 200 MHz (FPGA)       | 200 MHz (FPGA)  | 350 MHz (ASIC)  | 300 MHz (ASIC)    | 125 MHz / 400 MHz (FPGA/ASIC) |
| FPGA/ASIC Tech.                | Stratix / Stratix II | Virtex-4        | 0.13 micron     | 0.18 micron       | 0.13 micron                   |
| Reported DMIPS                 | 150 DMIPS            | 166 DMIPS       | N/A             | 300 DMIPS         | 85 DMIPS                      |
| ISA                            | 32-bit RISC          | 32-bit RISC     | 32-bit RISC     | 32 or 64-bit RISC | 32-bit RISC                   |
| Cache Memory (I/D)             | Up to 64KB           | Up to 64KB      | Up to 32KB (1)  | Up to 64KB        | Up to 256KB                   |
| Floating Point Unit (optional) | IEEE 754             | IEEE 754        | IEEE 754        | as peripheral     | IEEE 754                      |
| Pipeline                       | 6 stages             | 3 stages        | 5 stages        | 5 stages          | 7 stages                      |
| Custom Instructions            | Up to 256            | None            | Unlimited       | Unspecified limit | None                          |
| Register File Size             | 32                   | 32              | 32 or 64        | 32                | 2-32                          |
| Implementation                 | FPGA                 | FPGA            | FPGA, ASIC      | FPGA, ASIC        | FPGA, ASIC                    |
| Area                           | 700-1800 LEs         | 1269 LUTs       | 0.26 mm^2       | N/A               | N/A                           |

(1) Using a 1-, 2- or 4-way set-associative configuration
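One way to read the speed and DMIPS rows together is to normalize throughput by clock frequency. The sketch below does that arithmetic for the cores where the table gives a single clock and a DMIPS figure. Xtensa XL is left out because its DMIPS is listed as N/A, and LEON3 is left out because the table lists two clocks without saying which one the 85 DMIPS figure belongs to. The helper name `dmips_per_mhz` is an illustrative choice.

```python
# (clock in MHz, reported DMIPS), copied from the comparison table.
# Xtensa XL (DMIPS N/A) and LEON3 (ambiguous clock) are deliberately omitted.
REPORTED = {
    "Nios II (Fast Core)": (200, 150),
    "MicroBlaze": (200, 166),
    "OpenRISC 1200": (300, 300),
}


def dmips_per_mhz(clock_mhz, dmips):
    """Throughput normalized by clock frequency."""
    return dmips / clock_mhz


efficiency = {
    core: round(dmips_per_mhz(mhz, dmips), 2)
    for core, (mhz, dmips) in REPORTED.items()
}
ranked = sorted(efficiency, key=efficiency.get, reverse=True)

for core in ranked:
    print(f"{core}: {efficiency[core]} DMIPS/MHz")
```

The figures come from different implementation technologies (FPGA versus ASIC), so the ratio is only a rough indicator and not a like-for-like benchmark.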
Comparison of Soft-Cores
• LEON3 has the highest operating frequency, 400 MHz, using an ASIC implementation
• Nios II and MicroBlaze share the highest operating frequency among FPGA implementations, 200 MHz
• Tensilica Xtensa offers the greatest flexibility, since designers can implement unlimited custom instructions and execution units in the processor’s core
• The majority of the soft-core processors have an optional FPU, which is added either as a component in the processor’s core or as a peripheral
Conclusion and Future Work
• Soft-core processors are becoming an attractive alternative for embedded system design due to the flexibility they offer
  – More widespread usage for embedded system design
Related research at RCIM
• A CAD tool is currently under development which enables designers to build soft-core processors and explore various architecture trade-offs (Ian Anderson & Omar Alryahi)
  – The next few slides introduce this CAD tool
• Aws Ismail is exploring NoC implementation on FPGAs
• Jason Tong developed an accurate, non-intrusive, FPGA-based SW profiling tool for Nios II based embedded systems
SCBuild: A CAD Tool for the DSE of Parameterized Cores
Purpose of SCBuild
• “SC” stands for “Soft-core”
• A software tool that performs the following tasks:
  – Facilitates rapid exploration of the design space of a parameterized core using the SEAMO algorithm
  – Generates variants of a soft-core based on a user-selected set of parameter values
• In other words, the user specifies the parameter values and SCBuild automatically generates a VHDL description of the resulting core
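Design space exploration over a parameterized core can be pictured as scoring each parameter combination and keeping the ones that are not beaten on every objective. The sketch below is not SCBuild and does not implement SEAMO: it swaps in plain exhaustive enumeration with a Pareto filter, which is only practical for tiny design spaces. The parameter names and the cost model in `estimate` are made up for illustration.

```python
from itertools import product

# Hypothetical parameters of a soft-core and their allowed values.
PARAMETERS = {
    "cache_kb": (0, 4, 8),
    "hw_multiplier": (False, True),
}


def estimate(cfg):
    """Made-up cost model returning (area, runtime); lower is better."""
    area = 1000 + 100 * cfg["cache_kb"] + (300 if cfg["hw_multiplier"] else 0)
    runtime = 100 - 5 * cfg["cache_kb"] - (20 if cfg["hw_multiplier"] else 0)
    return area, runtime


def dominates(a, b):
    """True if cost a is no worse than b everywhere and differs somewhere."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_front(configs):
    """Keep the configurations that no other configuration dominates."""
    scored = [(cfg, estimate(cfg)) for cfg in configs]
    return [
        cfg for cfg, cost in scored
        if not any(dominates(other, cost) for _, other in scored)
    ]


all_configs = [
    dict(zip(PARAMETERS, values)) for values in product(*PARAMETERS.values())
]
front = pareto_front(all_configs)
for cfg in front:
    print(cfg, estimate(cfg))
```

An evolutionary algorithm such as SEAMO searches the same kind of space without scoring every combination, which matters once the number of parameters grows.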
SCBuild System Environment
• Template Architect: generates a Template Description for a given hardware component
• SCBuild takes this description and instantiates VHDL components from the Library to create a soft-core
• Also performs DSE using the SEAMO algorithm
[Figure: SCBuild system environment. Blocks shown: Template Architect Tool (not part of this present research), Template Description, Template Component Library, SCBuild, user parameter selection, VHDL Component Library, VHDL Description of Core]
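The step of turning selected parameter values into a VHDL description can be illustrated with a small text generator. This is a minimal sketch, not SCBuild's actual output format: the functions `vhdl_literal` and `instantiate`, the entity name `alu`, the generic names, and the `clk`/`rst` port map are all hypothetical.

```python
def vhdl_literal(value):
    """Render a Python value as a VHDL literal (booleans and integers only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def instantiate(label, entity, generics):
    """Return VHDL text that instantiates `entity` with a generic map."""
    pairs = ",\n".join(
        f"      {name} => {vhdl_literal(val)}" for name, val in generics.items()
    )
    return (
        f"  {label} : entity work.{entity}\n"
        f"    generic map (\n{pairs}\n    )\n"
        f"    port map (clk => clk, rst => rst);"
    )


# User-selected parameter values for one library component.
text = instantiate("u_alu", "alu", {"DATA_WIDTH": 32, "HAS_MULTIPLIER": True})
print(text)
```

Each component picked from the library would get one such instantiation, with the generic map carrying the parameter values the user selected.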
CAD Flow for SCBuild
[Flowchart] Design Entry → Check XML Syntax → Correct? (Yes/No) → Collect System-level Parameters → DSE and Parameter Selection → Elaboration → Create and Compile Quartus II Project (Optional)
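The first stages of this flow (syntax check, then parameter collection) can be sketched with Python's standard XML parser. The sketch makes several assumptions that the slides do not state: the design entry is a single XML string, parameters are `<param name="..." value="..."/>` elements, and a failed syntax check simply stops the flow so the designer can fix the entry. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_xml_syntax(text):
    """Return the parsed root element, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


def collect_parameters(root):
    """Read <param name=... value=.../> elements (an assumed schema) into a dict."""
    return {p.get("name"): p.get("value") for p in root.iter("param")}


def run_flow(design_entry):
    """Run the syntax check and parameter collection stages of the flow."""
    root = check_xml_syntax(design_entry)
    if root is None:
        return None  # assumed "No" branch: the design entry must be corrected
    # DSE, elaboration and the optional Quartus II step are outside this sketch.
    return collect_parameters(root)


good = '<core name="demo"><param name="DATA_WIDTH" value="32"/></core>'
bad = '<core name="demo"><param name="DATA_WIDTH" value="32"></core>'

print(run_flow(good))
print(run_flow(bad))
```

The second input is rejected because its `<param>` element is never closed, which is exactly the kind of error a syntax check catches before any exploration or elaboration work starts.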