Very High Performance Embedded Microcontroller with ... - Cortus SAS


The dual functional units contain a single cycle ALU and multiplier/divider each, and there is one multiply-accumulate unit with a 64 bit accumulator. The load/store unit can coalesce reads and writes onto the 64 bit data bus.

Performance

• CoreMark/MHz 1.0 : 3.62 / GCC5.3.0 20160730 (Cortus 32b) -mcpu=aps29 -g -Ofast -fno-lto DMEM_METHOD=MEM_MALLOC -fuse-clib=minifp -mstrict-alignment mmac -DPERFORMANCE_RUN=1 / Heap
• DMIPS: 3.09 DMIPS/MHz
• At least 1400 MHz in 28 nm

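[Figure: APS29 block diagram. Labelled blocks include the CPU core (two ALUs, MACC, ACC, registers R0 to R15, status), co-processor, I-Cache and D-Cache, 64 bit AXI4 buses, X-Bar, RAM, Flash, DMA, on-chip debug, APB and AHB bridges, and peripherals (interrupt controller, timer, GPIO, watchdog, UART). Some blocks are marked optional.]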

Very High Performance Embedded Microcontroller with Dual Issue Pipeline

The APS29 from Cortus provides a solution to your performance challenges while staying within stringent power and silicon footprint budgets. A dual issue pipeline gives performance close to a dual core system with only a modest increase in silicon area and power consumption over a similar single issue core. A static branch predictor dramatically improves the performance of loops and the multiply-accumulate feature increases signal processing speeds. These features offer a significant performance boost but require no special programming techniques.

The APS29 is a very high performance, extendible 32 bit microcontroller core featuring a dual issue pipeline ensuring very high integer throughput. The dual issue pipeline provides instruction level parallelism and increases performance without any need to change coding style or adopt complex compilation schemes. All the performance increases are managed within the processor core and require no effort from the programmer.

Features

• Dual Issue, 5-7 Stage Pipeline
• Multiply-Accumulate
• 2 High Performance Integer Multipliers
• 2 Integer Dividers
• Dual & Multi-Core Capable
• Co-Processor Interface
• AXI4 buses with 64 bit data width
• Coalesced reads/writes
• Optional Caches

The APS29 has been designed to offer excellent code density with mixed instruction lengths. The dual issue out-of-order completion pipeline ensures that most instructions (including loads and stores) execute in a single cycle, with two instructions being issued per cycle. A static branch predictor significantly improves the execution speed of loops.

Implementation Results

Process         Fmax       Area        Power
28 nm (TSMC)    1400 MHz   0.037 mm²   17.54 µW/MHz
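As a rough worked example, the per-MHz figures quoted in this datasheet can be scaled to the 1400 MHz clock quoted for 28 nm. The sketch below assumes that each metric scales linearly with clock frequency, which ignores memory wait states and any voltage change needed to reach that clock. The macro and function names are illustrative and do not come from Cortus.

```c
#include <assert.h>

/* Figures quoted in this datasheet. */
#define APS29_COREMARK_PER_MHZ 3.62    /* CoreMark/MHz */
#define APS29_DMIPS_PER_MHZ    3.09    /* DMIPS/MHz */
#define APS29_UW_PER_MHZ       17.54   /* microwatts per MHz, 28 nm (TSMC) */
#define APS29_FMAX_MHZ         1400.0  /* MHz, 28 nm (TSMC) */

/* Assumption (not from the datasheet): the metric scales linearly
   with clock frequency. Returns per_mhz multiplied by clock_mhz. */
static double scale_by_clock(double per_mhz, double clock_mhz)
{
    assert(clock_mhz >= 0.0);
    return per_mhz * clock_mhz;
}
```

Under that assumption, scaling to 1400 MHz gives roughly 5068 CoreMark, 4326 DMIPS and 24556 µW (about 24.6 mW). These are derived estimates, not figures published in the datasheet.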

www.cortus.com

Near Dual Core Performance - With the Simplicity of a Single Core

Dual Issue Pipeline

The dual issue pipeline enables the processor to potentially start the execution of two instructions per cycle. The processor features two single cycle ALUs, two multipliers and two integer dividers, grouped into two functional units, each capable of starting one instruction per cycle. The load/store unit, which manages data access, is also handled by the pipeline.

The pipeline manages internal resources and instruction interdependencies, leaving no conflicts for the programmer to manage. The compiler schedules instructions to optimise instruction throughput. The 64 bit memory interface supplies the processor with up to four instructions per cycle, which are held in an 8 word FIFO prior to execution.

Cortus Version 2 Instruction Set

The APS29 is based on the Cortus v2 instruction set. Following extensive analysis of a wide range of embedded programs, the version 2 processor cores use a careful selection of 16, 24 and 32 bit instructions. These have been chosen to balance the size of the instruction memory against a minimal core size.

[Figure: Typical Embedded Code, APS5 versus APS29. Average 18% improvement.]
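The "up to four instructions per cycle" figure follows from the 64 bit memory interface and the 16, 24 and 32 bit instruction lengths. The sketch below works through that arithmetic. It is a simplification that ignores alignment and instructions that straddle two fetches, and the function name is illustrative only.

```c
#include <assert.h>

#define FETCH_WIDTH_BITS 64u  /* memory interface width from the datasheet */

/* Whole instructions of a single length that fit in one 64 bit fetch.
   Simplification: ignores alignment and instructions that straddle
   two consecutive fetches. */
static unsigned instructions_per_fetch(unsigned instr_bits)
{
    assert(instr_bits == 16u || instr_bits == 24u || instr_bits == 32u);
    return FETCH_WIDTH_BITS / instr_bits;
}
```

With this model a fetch holds four 16 bit instructions, or two whole 24 bit or 32 bit instructions, so the peak of four per cycle is reached only when the code stream consists of 16 bit instructions.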

Static Branch Predictor

The APS29 features a static branch predictor that significantly improves performance, notably in loops. A simple but effective prediction scheme ensures that in the majority of cases branches can be executed without the penalty associated with flushing the pipeline. The compiler ensures that loops and other constructs are optimised to take advantage of the branch predictor.
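The datasheet does not say which static prediction scheme the APS29 uses. Purely as an illustration of why static prediction suits loops, the sketch below models one common textbook rule, "backward taken, forward not taken", and counts mispredictions for a loop closed by a backward branch. The rule, the function names and the addresses are all assumptions, not a description of the APS29 hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical static rule: predict taken when the branch jumps backward. */
static bool predict_taken(uint32_t branch_addr, uint32_t target_addr)
{
    return target_addr < branch_addr;
}

/* Counts mispredictions for a loop whose closing backward branch is
   taken on every pass except the last, where it falls through. */
static unsigned loop_mispredictions(unsigned iterations)
{
    assert(iterations > 0u);
    const uint32_t branch_addr = 0x1010u;  /* made-up addresses */
    const uint32_t target_addr = 0x1000u;
    unsigned misses = 0u;

    for (unsigned i = 0u; i < iterations; ++i) {
        bool taken = (i + 1u < iterations);  /* last pass exits the loop */
        if (predict_taken(branch_addr, target_addr) != taken)
            ++misses;
    }
    return misses;
}
```

Under this model a loop pays a single misprediction, at loop exit, however many iterations it runs.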

Multiply-Accumulate

The Multiply-Accumulate unit of the APS29 offers a single cycle multiply-accumulate operation into a dedicated 64 bit accumulator. Two signed or unsigned 32 bit integers are multiplied and either added into or subtracted from the accumulator.

Ecosystem

The APS29 benefits from the shared ecosystem of the APS2n and APS families. It has a complete software development environment, including a toolchain for C and C++ and a fully adapted IDE based on Eclipse, one of the most widely used IDEs. Debugging is fully supported with an integrated instruction set simulator, the Cortus on-chip debugging hardware and an Ethernet connected JTAG interface, the EtherTag. Ports of various RTOSs are available, such as FreeRTOS, Micrium µC/OS, µCLinux…

Going Further

If the computation performance or throughput of the APS29 is stretched by your application there are a number of possible solutions.

Multiply-accumulate operations enable the efficient implementation of a large number of signal processing algorithms, such as FFTs and filters.

Simple dual core systems can significantly increase the processing power of a system for little silicon cost. The processing power can be further increased using multicore architectures, with a coherent data cache.

The compiler is aware of the Multiply-Accumulate instructions and can optimise typical “C/C++” constructs that can be efficiently implemented with these instructions.
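A typical C construct of the kind described is a dot product that sums 32 bit products into a 64 bit accumulator. The sketch below is plain portable C. Whether a given compiler version and flag set maps the loop body onto the Multiply-Accumulate instructions is not stated here and should be checked in the generated code. The function name is illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dot product of two signed 32 bit vectors into a 64 bit accumulator.
   Each product is widened to 64 bits before it is added, matching the
   32 x 32 -> 64 bit accumulate described in the text. */
static int64_t dot_product(const int32_t *a, const int32_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int64_t acc = 0;

    for (size_t i = 0; i < n; ++i)
        acc += (int64_t)a[i] * b[i];  /* widen first, then multiply-accumulate */

    return acc;
}
```

The cast on the first operand matters: without it the multiplication would be carried out in 32 bits and could overflow before the result reaches the accumulator.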

Equally it is easy to realise heterogeneous multiprocessor systems, for example pairing an APS29 for time critical data processing with an APS23 to handle I/O and a Bluetooth network stack.

The easy integration of multiple cores enables the creation of secure systems where one processor supervises and checks the operation of the other. This effectively and reliably improves either the safety or security of an embedded system. A coherent data cache is available, supporting multi-core architectures.

Co-Processors

In a number of cases an algorithm can be accelerated significantly with the use of a co-processor, implementing either the entire algorithm or just its key elements in hardware. The APS29 supports the easy to use Cortus co-processor interface, which enables the engineer to extend the instruction set of the APS29. Co-processor instructions suffer no penalties compared to native instructions and have full access to the register set. As with all Cortus processors that have a co-processor interface, co-processor instructions are first class instructions. The dual issue pipeline can start a co-processor instruction in the same cycle as a native instruction, handling resource conflicts and out-of-order completion without programmer intervention.

Applications

The APS29 is suited to a wide variety of applications, such as:

• Embedded Control
• Encryption and Decryption
• Wireless and Wireline Communication
• Sensor Fusion
• Machine Vision
• Dual and Multi-core Systems

Copyright Cortus SAS © 2016
