Instruction Summary for "P" ISA Extension Proposal

Document Number: xxxxxxxxx

Date Issued: 2017-11-23

Copyright © 2017 Andes Technology Corporation. All rights reserved.

Copyright Notice

Copyright © 2017 Andes Technology Corporation. All rights reserved.

AndesCore™, AndeShape™, AndeSight™, AndESLive™, AndeSoft™, AndeStar™, AICE™, AICE-MCU™, AICE-MINI™, Andes Custom Extension™, and COPILOT™ are trademarks owned by Andes Technology Corporation. All other trademarks used herein are the property of their respective owners.

This document contains confidential information of Andes Technology Corporation. Use of this copyright notice is precautionary and does not imply publication or disclosure. Neither the whole nor part of the information contained herein may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into any language in any form by any means without the written permission of Andes Technology Corporation.

The product described herein is subject to continuous development and improvement; information herein is given by Andes in good faith but without warranties. This document is intended only to assist the reader in the use of the product. Andes Technology Corporation shall not be liable for any loss or damage arising from the use of any information in this document, or any incorrect use of the product.

Contact Information

Should you have any problems with the information contained herein, please contact Andes Technology Corporation by email at [email protected] or through the online website https://es.andestech.com/eservice/ for support, giving:

- the document title
- the document number
- the page number(s) to which your comments apply
- a concise explanation of the problem

General suggestions for improvements are welcome.

Revision History

Rev. | Revision Date | Revised Chapter-Section | Revised Content
0.1 | 2017/11/20 | All | Initial release

The information contained herein is the exclusive property of Andes Technology Co. and shall not be distributed, reproduced, or disclosed in whole or in part without prior written permission of Andes Technology Corporation. P_DSP_ISA_EXT_Summary.docx


Table of Contents

Copyright Notice
Contact Information
Revision History
Table of Contents
List of Tables
1. Introduction
  1.1. SIMD Instructions
  1.2. Non-SIMD Instructions
  1.3. Zero-overhead Loop Mechanism
2. DSP ISA Extension Instruction Summary
  2.1. Shorthand Definitions
  2.2. SIMD Data Processing Instructions
    2.2.1. 16-bit Addition & Subtraction Instructions
    2.2.2. 8-bit Addition & Subtraction Instructions
    2.2.3. 16-bit Shift Instructions
    2.2.4. 16-bit Compare Instructions
    2.2.5. 8-bit Compare Instructions
    2.2.6. 16-bit Misc Instructions
    2.2.7. 8-bit Misc Instructions
    2.2.8. 8-bit Unpacking Instructions
  2.3. Non-SIMD Data Processing Instructions
    2.3.1. 32-bit Addition/Subtraction Instructions
    2.3.2. 32-bit Shift Instructions
    2.3.3. 16-bit Packing Instructions
    2.3.4. Most Significant Word “32x32” Multiply & Add Instructions
    2.3.5. Most Significant Word “32x16” Multiply & Add Instructions
    2.3.6. Signed 16-bit Multiply with 32-bit Add/Subtract Instructions
    2.3.7. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
    2.3.8. Miscellaneous Instructions
    2.3.9. Q31 saturation Instructions
    2.3.10. Q15 saturation instructions
    2.3.11. Overflow status manipulation instructions
  2.4. 64-bit Instructions
    2.4.1. 64-bit Addition & Subtraction Instructions
    2.4.2. 32-bit Multiply with 64-bit Add/Subtract Instructions
    2.4.3. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
  2.5. Zero-Overhead Loop (ZOL) Mechanism Instructions
3. User-Mode CSR Registers
  3.1. Loop Begin Register
  3.2. Loop End Register
  3.3. Loop Count Register

List of Tables

Table 1. SIMD 16-bit Add/Subtract Instructions
Table 2. SIMD 8-bit Add/Subtract Instructions
Table 3. SIMD 16-bit Shift Instructions
Table 4. SIMD 16-bit Compare Instructions
Table 5. SIMD 8-bit Compare Instructions
Table 6. SIMD 16-bit Miscellaneous Instructions
Table 7. SIMD 8-bit Miscellaneous Instructions
Table 8. 8-bit Unpacking Instructions
Table 9. 32-bit Add/Sub Instructions
Table 10. 32-bit Shift Instructions
Table 11. 16-bit Packing Instructions
Table 12. Signed MSW 32x32 Multiply and Add Instructions
Table 13. Signed MSW 32x16 Multiply and Add Instructions
Table 14. Signed 16-bit Multiply 32-bit Add/Subtract Instructions
Table 15. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Table 16. Miscellaneous Instructions
Table 17. Q31 Saturation ALU Instructions
Table 18. Q15 Saturation ALU Instructions
Table 19. OV (Overflow) Flag Set/Clear Instructions
Table 20. 64-bit Add/Subtract Instructions
Table 21. 32-bit Multiply 64-bit Add/Subtract Instructions
Table 22. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Table 23. ZOL Mechanism Instructions

Typographical Convention Index

Document Element | Font | Font Style | Size | Color
Normal text | Georgia | Normal | 12 | Black
Command line, source code or file paths | Lucida Console | Normal | 11 | Indigo
VARIABLES OR PARAMETERS IN COMMAND LINE, SOURCE CODE OR FILE PATHS | LUCIDA CONSOLE | BOLD + ALL-CAPS | 11 | INDIGO
Note or warning | Georgia | Normal | 12 | Red
Hyperlink | Georgia | Underlined | 12 | Blue

1. Introduction

Digital Signal Processing (DSP) has emerged as an important technology for modern electronic systems. A wide range of modern applications employ DSP algorithms to solve problems in their particular domains, including sensor fusion, servo motor control, audio decode/encode, speech synthesis and coding, MPEG4 decode, medical imaging, computer vision, embedded control, robotics, and human interface.

The AndeStar™ DSP instruction set extension increases the DSP algorithm processing capabilities of the AndesCore™ CPU IP products. With this extension, the AndesCore CPUs can run these DSP applications with lower power and higher performance.

This DSP instruction set extension adds 8-bit and 16-bit SIMD instructions to increase the throughput of 8-bit and 16-bit DSP computations, so more work can be done in a fixed time slot or a task can be completed faster. It also adds enhanced 16-bit, 32-bit, and 64-bit non-SIMD instructions to speed up frequent operations in DSP algorithms. To reduce the looping overhead of a repeated performance-critical DSP computation, this extension also includes a hardware zero-overhead loop mechanism.

1.1. SIMD Instructions

Using the AndeStar V5 baseline 32-bit registers, four 8-bit operations or two 16-bit operations can be performed in parallel to maximize the throughput of these 8-bit and 16-bit computations. Many DSP applications can benefit from this performance feature, so this DSP instruction set extension adds many 8-bit and 16-bit SIMD instructions.

The 8-bit SIMD instructions include a variety of signed/unsigned addition and subtraction operations, signed/unsigned comparison operations, signed/unsigned maximum and minimum operations, signed/unsigned unpacking operations, and a signed absolute value operation.

The 16-bit SIMD instructions include a variety of signed/unsigned addition and subtraction operations, different types of shift operations, signed/unsigned comparison operations, signed/unsigned maximum and minimum operations, signed/unsigned multiplication operations, signed/unsigned clipping operations, and a signed absolute value operation.

1.2. Non-SIMD Instructions

The non-SIMD instructions in this DSP extension include 16-bit packing operations; Q15 and Q31 saturating addition, subtraction, and multiplication operations; 32-bit signed/unsigned halving addition and subtraction operations; 32-bit saturating left shift and rounding right shift operations; most significant word “32x32 multiply & add” operations; most significant word “32x16 multiply & add” operations; a variety of 32-bit accumulation or subtraction with 16-bit multiplication operations; and bit reverse, bit-wise selection, byte insertion, and 32-bit word extraction from 64-bit data operations.

To speed up 64-bit operations in DSP applications, this extension also includes a variety of 64-bit addition and subtraction operations, signed/unsigned 64-bit accumulation or subtraction with 32-bit multiplication operations, and signed 64-bit accumulation or subtraction with 16-bit multiplication operations.

1.3. Zero-overhead Loop Mechanism

A Zero-Overhead Loop (ZOL) mechanism is provided to reduce the instruction fetch and execution overhead of loop-control instructions. Three user-mode CSR registers are provided to support this mechanism:

- LB: stores the starting address of a loop. It is 32 bits wide. It can be written with the “MTLBI” instruction.
- LE: stores the ending address of a loop. It is 32 bits wide. The value of LE should be greater than or equal to LB; if this rule is violated, the behavior is UNPREDICTABLE. It can be written with the “MTLEI” instruction.
- LC: contains the number of times the zero-overhead looping operation will be performed. It is 32 bits wide. When LC is greater than 1, any execution of an instruction at an address that matches the value of LE will cause the Program Counter to change to the content of LB and will decrement the value of LC by 1. When LC is less than or equal to 1, the zero-overhead loop mechanism is turned off.

The mechanism is intended for any loop that needs to be executed at least once. For example,

    do { ......... } until (count > 4);

The zero-overhead looping operation can be summarized as follows:

    if ((LC > 1) && (PC == LE)) { LC = LC - 1; PC = LB; }
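The summarized rule can be modeled in a few lines of C to see how many times a loop body runs for a given LC. This is only a behavioral sketch, not the hardware implementation: the `zol_state` type and both function names are hypothetical, and a fixed 4-byte instruction size is assumed for the sequential (non-looping) case.

```c
#include <stdint.h>

/* Hypothetical model of the three loop CSRs described above. */
typedef struct {
    uint32_t lb; /* loop begin address */
    uint32_t le; /* loop end address   */
    uint32_t lc; /* loop count         */
} zol_state;

/* Returns the address of the instruction that follows the one at pc.
 * Assumption: every instruction is 4 bytes long. */
uint32_t zol_next_pc(zol_state *s, uint32_t pc)
{
    if (s->lc > 1 && pc == s->le) {
        s->lc -= 1;   /* LC = LC - 1 */
        return s->lb; /* PC = LB     */
    }
    return pc + 4;    /* mechanism off or not at LE: fall through */
}

/* Counts how often the instruction at LE executes, i.e. how many
 * times the loop body [lb, le] runs, for a given initial LC. */
uint32_t zol_count_iterations(uint32_t lb, uint32_t le, uint32_t lc)
{
    zol_state s = { lb, le, lc };
    uint32_t pc = lb;
    uint32_t iterations = 0;

    while (pc <= le) {
        if (pc == le) {
            iterations++;
        }
        pc = zol_next_pc(&s, pc);
    }
    return iterations;
}
```

Under these assumptions a body runs LC times when LC is at least 1, and still runs once when LC is 0, which matches the "executed at least once" wording above.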


2. DSP ISA Extension Instruction Summary

2.1. Shorthand Definitions

- r.H == r.H1 → r[31:16], r.L == r.H0 → r[15:0]
- r.B3 → r[31:24], r.B2 → r[23:16], r.B1 → r[15:8], r.B0 → r[7:0]
- r[xU] → the upper 32 bits of a 64-bit number; xU represents the number of the GPR that contains this upper 32-bit value.
- r[xL] → the lower 32 bits of a 64-bit number; xL represents the number of the GPR that contains this lower 32-bit value.
- r[xU].r[xL] → a 64-bit number that is formed from a pair of GPRs.
- s>> → signed arithmetic right shift
- u>> → unsigned logical right shift
- SAT.Qn() → Saturate to the range of [-2^n, 2^n - 1]; if saturation happens, set PSW.OV.
- SAT.Um() → Saturate to the range of [0, 2^m - 1]; if saturation happens, set PSW.OV.
- RUND() → Indicates “rounding”, i.e., add 1 to the most significant discarded bit for right shift or MSW-type multiplication instructions.
- Sign or Zero Extending functions:
  - SEm(data) → Sign-extend data to m bits.
  - ZEm(data) → Zero-extend data to m bits.
- ABS(x) → Calculate the absolute value of “x”.
- CONCAT(x,y) → Concatenate “x” and “y” to form a value.
- u< → Unsigned less-than comparison.
- u> → Unsigned greater-than comparison.
- s* → Signed multiplication.
- u* → Unsigned multiplication.
- rt is Rd in RISC-V ISA terminology.
- ra is Rs1 and rb is Rs2 in RISC-V ISA terminology.
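To make the saturation and rounding shorthand concrete, the following C sketch models SAT.Qn, SAT.Um, and RUND applied to an arithmetic right shift. It follows the ranges given above literally; the names `sat_q`, `sat_u`, `rund_sra`, and the global `psw_ov` (standing in for PSW.OV) are hypothetical, and inputs are assumed to fit in 64-bit intermediates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PSW.OV overflow flag. */
bool psw_ov = false;

/* SAT.Qn(x): clamp x to [-2^n, 2^n - 1], setting OV on saturation.
 * Assumes 1 <= n <= 31 so the result fits in 32 bits. */
int32_t sat_q(int64_t x, unsigned n)
{
    const int64_t max = ((int64_t)1 << n) - 1;
    const int64_t min = -((int64_t)1 << n);

    if (x > max) { psw_ov = true; return (int32_t)max; }
    if (x < min) { psw_ov = true; return (int32_t)min; }
    return (int32_t)x;
}

/* SAT.Um(x): clamp x to [0, 2^m - 1], setting OV on saturation.
 * Assumes 1 <= m <= 32. */
uint32_t sat_u(int64_t x, unsigned m)
{
    const int64_t max = ((int64_t)1 << m) - 1;

    if (x > max) { psw_ov = true; return (uint32_t)max; }
    if (x < 0)   { psw_ov = true; return 0; }
    return (uint32_t)x;
}

/* RUND(x s>> sh): add 1 at the most significant discarded bit, then
 * shift right arithmetically. The shift is written as a floor
 * division so that it does not rely on how the compiler shifts
 * negative values. Assumes sh <= 31. */
int32_t rund_sra(int32_t x, unsigned sh)
{
    if (sh == 0) {
        return x; /* nothing is discarded, so nothing to round */
    }
    int64_t t = (int64_t)x + ((int64_t)1 << (sh - 1));
    int64_t shifted = (t >= 0) ? (t >> sh) : ~((~t) >> sh);
    return (int32_t)shifted;
}
```

For example, `sat_q(40000, 15)` clamps to 32767 and sets the flag, while `rund_sra(5, 1)` rounds 2.5 up to 3.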

2.2. SIMD Data Processing Instructions

2.2.1. 16-bit Addition & Subtraction Instructions

The SIMD 16-bit add/subtract instructions support 4 types of operations: Addition (two 16-bit additions), Subtraction (two 16-bit subtractions), Crossed Add & Sub, and Crossed Sub & Add. The overflow handling of these instructions has 5 variations: Wrap-around (dropping overflow), Signed Halving (keeping overflow by dropping 1 LSB), Unsigned Halving, Signed Saturation (clipping overflow), and Unsigned Saturation. Together, there are 20 SIMD 16-bit add/subtract instructions.

Table 1. SIMD 16-bit Add/Subtract Instructions
Mnemonic | Instruction | Operation
ADD16 rt, ra, rb | 16-bit Addition | rt.Hx = ra.Hx + rb.Hx; (x=1..0)
RADD16 rt, ra, rb | 16-bit Signed Halving Addition | rt.Hx = (ra.Hx + rb.Hx) s>> 1; (x=1..0)
URADD16 rt, ra, rb | 16-bit Unsigned Halving Addition | rt.Hx = (ra.Hx + rb.Hx) u>> 1; (x=1..0)
KADD16 rt, ra, rb | 16-bit Signed Saturating Addition | rt.Hx = SAT.Q15(ra.Hx + rb.Hx); (x=1..0)
UKADD16 rt, ra, rb | 16-bit Unsigned Saturating Addition | rt.Hx = SAT.U16(ra.Hx + rb.Hx); (x=1..0)
SUB16 rt, ra, rb | 16-bit Subtraction | rt.Hx = ra.Hx - rb.Hx; (x=1..0)
RSUB16 rt, ra, rb | 16-bit Signed Halving Subtraction | rt.Hx = (ra.Hx - rb.Hx) s>> 1; (x=1..0)
URSUB16 rt, ra, rb | 16-bit Unsigned Halving Subtraction | rt.Hx = (ra.Hx - rb.Hx) u>> 1; (x=1..0)
KSUB16 rt, ra, rb | 16-bit Signed Saturating Subtraction | rt.Hx = SAT.Q15(ra.Hx - rb.Hx); (x=1..0)
UKSUB16 rt, ra, rb | 16-bit Unsigned Saturating Subtraction | rt.Hx = SAT.U16(ra.Hx - rb.Hx); (x=1..0)
CRAS16 rt, ra, rb | 16-bit Cross Add & Sub | rt.H = ra.H + rb.L; rt.L = ra.L - rb.H;
RCRAS16 rt, ra, rb | 16-bit Signed Halving Cross Add & Sub | rt.H = (ra.H + rb.L) s>> 1; rt.L = (ra.L - rb.H) s>> 1;
URCRAS16 rt, ra, rb | 16-bit Unsigned Halving Cross Add & Sub | rt.H = (ra.H + rb.L) u>> 1; rt.L = (ra.L - rb.H) u>> 1;
KCRAS16 rt, ra, rb | 16-bit Signed Saturating Cross Add & Sub | rt.H = SAT.Q15(ra.H + rb.L); rt.L = SAT.Q15(ra.L - rb.H);
UKCRAS16 rt, ra, rb | 16-bit Unsigned Saturating Cross Add & Sub | rt.H = SAT.U16(ra.H + rb.L); rt.L = SAT.U16(ra.L - rb.H);
CRSA16 rt, ra, rb | 16-bit Cross Sub & Add | rt.H = ra.H - rb.L; rt.L = ra.L + rb.H;
RCRSA16 rt, ra, rb | 16-bit Signed Halving Cross Sub & Add | rt.H = (ra.H - rb.L) s>> 1; rt.L = (ra.L + rb.H) s>> 1;
URCRSA16 rt, ra, rb | 16-bit Unsigned Halving Cross Sub & Add | rt.H = (ra.H - rb.L) u>> 1; rt.L = (ra.L + rb.H) u>> 1;
KCRSA16 rt, ra, rb | 16-bit Signed Saturating Cross Sub & Add | rt.H = SAT.Q15(ra.H - rb.L); rt.L = SAT.Q15(ra.L + rb.H);
UKCRSA16 rt, ra, rb | 16-bit Unsigned Saturating Cross Sub & Add | rt.H = SAT.U16(ra.H - rb.L); rt.L = SAT.U16(ra.L + rb.H);
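The difference between the overflow-handling variations is easiest to see on concrete values. The sketch below models four rows of Table 1 (ADD16, RADD16, KADD16, CRAS16) in portable C on a 32-bit register image. The function names are hypothetical, the PSW.OV side effect of the saturating form is left out, and the halving form is assumed to keep the full 17-bit sum before shifting, as the "keeping overflow" wording suggests.

```c
#include <stdint.h>

/* Signed views of the two halves: r.H is r[31:16], r.L is r[15:0]. */
static int32_t hi_s(uint32_t r) { return (int16_t)(uint16_t)(r >> 16); }
static int32_t lo_s(uint32_t r) { return (int16_t)(uint16_t)(r & 0xFFFFu); }

/* Pack two results into one register image, keeping 16 bits of each. */
static uint32_t pack16(int32_t h, int32_t l)
{
    return ((uint32_t)(uint16_t)h << 16) | (uint16_t)l;
}

/* Arithmetic shift right by one, written portably as floor(v / 2). */
static int32_t asr1(int32_t v) { return v >= 0 ? v >> 1 : ~((~v) >> 1); }

/* SAT.Q15 without the PSW.OV side effect. */
static int32_t sat_q15(int32_t v)
{
    return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

/* ADD16: wrap-around, the overflow bit is dropped by pack16. */
uint32_t add16(uint32_t ra, uint32_t rb)
{
    return pack16(hi_s(ra) + hi_s(rb), lo_s(ra) + lo_s(rb));
}

/* RADD16: signed halving, the 17-bit sum is shifted right by one. */
uint32_t radd16(uint32_t ra, uint32_t rb)
{
    return pack16(asr1(hi_s(ra) + hi_s(rb)), asr1(lo_s(ra) + lo_s(rb)));
}

/* KADD16: signed saturation, each sum is clipped to the Q15 range. */
uint32_t kadd16(uint32_t ra, uint32_t rb)
{
    return pack16(sat_q15(hi_s(ra) + hi_s(rb)), sat_q15(lo_s(ra) + lo_s(rb)));
}

/* CRAS16: rt.H = ra.H + rb.L; rt.L = ra.L - rb.H (wrap-around). */
uint32_t cras16(uint32_t ra, uint32_t rb)
{
    return pack16(hi_s(ra) + lo_s(rb), lo_s(ra) - hi_s(rb));
}
```

With ra = 0x7FFF0001 and rb = 0x0001FFFF, the upper halves sum to 32768: `add16` wraps it to 0x8000, `radd16` halves it to 0x4000, and `kadd16` clips it to 0x7FFF.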

2.2.2. 8-bit Addition & Subtraction Instructions

The SIMD 8-bit add/subtract instructions support 2 types of operations: Addition (four 8-bit additions) and Subtraction (four 8-bit subtractions). The overflow handling of these instructions has 5 variations: Wrap-around (dropping overflow), Signed Halving (keeping overflow by dropping 1 LSB), Unsigned Halving, Signed Saturation (clipping overflow), and Unsigned Saturation. Together, there are 10 SIMD 8-bit add/subtract instructions.

Table 2. SIMD 8-bit Add/Subtract Instructions
Mnemonic | Instruction | Operation
ADD8 rt, ra, rb | 8-bit Addition | rt.Bx = ra.Bx + rb.Bx; (x=3..0)
RADD8 rt, ra, rb | 8-bit Signed Halving Addition | rt.Bx = (ra.Bx + rb.Bx) s>> 1; (x=3..0)
URADD8 rt, ra, rb | 8-bit Unsigned Halving Addition | rt.Bx = (ra.Bx + rb.Bx) u>> 1; (x=3..0)
KADD8 rt, ra, rb | 8-bit Signed Saturating Addition | rt.Bx = SAT.Q7(ra.Bx + rb.Bx); (x=3..0)
UKADD8 rt, ra, rb | 8-bit Unsigned Saturating Addition | rt.Bx = SAT.U8(ra.Bx + rb.Bx); (x=3..0)
SUB8 rt, ra, rb | 8-bit Subtraction | rt.Bx = ra.Bx - rb.Bx; (x=3..0)
RSUB8 rt, ra, rb | 8-bit Signed Halving Subtraction | rt.Bx = (ra.Bx - rb.Bx) s>> 1; (x=3..0)
URSUB8 rt, ra, rb | 8-bit Unsigned Halving Subtraction | rt.Bx = (ra.Bx - rb.Bx) u>> 1; (x=3..0)
KSUB8 rt, ra, rb | 8-bit Signed Saturating Subtraction | rt.Bx = SAT.Q7(ra.Bx - rb.Bx); (x=3..0)
UKSUB8 rt, ra, rb | 8-bit Unsigned Saturating Subtraction | rt.Bx = SAT.U8(ra.Bx - rb.Bx); (x=3..0)

2.2.3. 16-bit Shift Instructions

There are 13 instructions here.

Table 3. SIMD 16-bit Shift Instructions
Mnemonic | Instruction | Operation
SRA16 rt, ra, rb | 16-bit Shift Right Arithmetic | rt.Hx = ra.Hx s>> rb[3:0]; (x=1..0)
SRAI16 rt, ra, im4u | 16-bit Shift Right Arithmetic Immediate | rt.Hx = ra.Hx s>> im4u; (x=1..0)
SRA16.u rt, ra, rb | 16-bit Rounding Shift Right Arithmetic | rt.Hx = RUND(ra.Hx s>> rb[3:0]); (x=1..0)
SRAI16.u rt, ra, im4u | 16-bit Rounding Shift Right Arithmetic Immediate | rt.Hx = RUND(ra.Hx s>> im4u); (x=1..0)
SRL16 rt, ra, rb | 16-bit Shift Right Logical | rt.Hx = ra.Hx u>> rb[3:0]; (x=1..0)
SRLI16 rt, ra, im4u | 16-bit Shift Right Logical Immediate | rt.Hx = ra.Hx u>> im4u; (x=1..0)
SRL16.u rt, ra, rb | 16-bit Rounding Shift Right Logical | rt.Hx = RUND(ra.Hx u>> rb[3:0]); (x=1..0)
SRLI16.u rt, ra, im4u | 16-bit Rounding Shift Right Logical Immediate | rt.Hx = RUND(ra.Hx u>> im4u); (x=1..0)
SLL16 rt, ra, rb | 16-bit Shift Left Logical |

rt.Hx = ra.Hx 0) rt.Hx = SAT.Q15(ra.Hx > -rb[4:0]);

Saturation & Rounding Shift Right

if (rb[4:0] > 0)

Arithmetic

rt.Hx = SAT.Q15(ra.Hx >

Mnemonic | Instruction | Operation
KHMX16 rt, ra, rb | 16-bit Crossed Signed Multiply | ... 15); (x,y)=(1,0), (0,1)
SMUL16 rt, ra, rb | 16-bit Signed Multiply to 32-bit | r[tU] = ra.H1 s* rb.H1; r[tL] = ra.H0 s* rb.H0;
SMULX16 rt, ra, rb | 16-bit Signed Crossed Multiply to 32-bit | r[tU] = ra.H1 s* rb.H0; r[tL] = ra.H0 s* rb.H1;
UMUL16 rt, ra, rb | 16-bit Unsigned Multiply to 32-bit | r[tU] = ra.H1 u* rb.H1; r[tL] = ra.H0 u* rb.H0;
UMULX16 rt, ra, rb | 16-bit Unsigned Crossed Multiply to 32-bit | r[tU] = ra.H1 u* rb.H0; r[tL] = ra.H0 u* rb.H1;
KABS16 rt, ra | 16-bit Absolute Value | rt.Hx = SAT.Q15(ABS(ra.Hx)); (x=1..0)
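Two points in these rows are worth a worked example: the 16-bit multiplies write a register pair, and the absolute value needs saturation because ABS(-32768) does not fit in Q15. The C sketch below models SMUL16, SMULX16, and KABS16 under those readings. The `reg_pair` struct is a hypothetical stand-in for r[tU].r[tL], the function names are illustrative, and the PSW.OV update is omitted.

```c
#include <stdint.h>

/* Hypothetical stand-in for the register pair r[tU].r[tL]. */
typedef struct {
    int32_t upper; /* r[tU] */
    int32_t lower; /* r[tL] */
} reg_pair;

/* Signed views of r.H1 (bits 31:16) and r.H0 (bits 15:0). */
static int32_t h1(uint32_t r) { return (int16_t)(uint16_t)(r >> 16); }
static int32_t h0(uint32_t r) { return (int16_t)(uint16_t)(r & 0xFFFFu); }

/* SMUL16: r[tU] = ra.H1 s* rb.H1; r[tL] = ra.H0 s* rb.H0 */
reg_pair smul16(uint32_t ra, uint32_t rb)
{
    reg_pair r = { h1(ra) * h1(rb), h0(ra) * h0(rb) };
    return r;
}

/* SMULX16: r[tU] = ra.H1 s* rb.H0; r[tL] = ra.H0 s* rb.H1 */
reg_pair smulx16(uint32_t ra, uint32_t rb)
{
    reg_pair r = { h1(ra) * h0(rb), h0(ra) * h1(rb) };
    return r;
}

/* SAT.Q15(ABS(v)) for one half; only +32768 can exceed the range. */
static uint16_t kabs_half(int32_t v)
{
    int32_t a = v < 0 ? -v : v;
    return (uint16_t)(a > 32767 ? 32767 : a);
}

/* KABS16: rt.Hx = SAT.Q15(ABS(ra.Hx)) for x = 1, 0 */
uint32_t kabs16(uint32_t ra)
{
    return ((uint32_t)kabs_half(h1(ra)) << 16) | kabs_half(h0(ra));
}
```

For instance, `kabs16(0x8000FFFF)` yields 0x7FFF0001: the upper half saturates, while the lower half (-1) becomes 1.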

2.2.7. 8-bit Misc Instructions

There are 5 instructions here.

Table 7. SIMD 8-bit Miscellaneous Instructions
Mnemonic | Instruction | Operation
SMIN8 rt, ra, rb | 8-bit Signed Minimum | rt.Bx = (ra.Bx < rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
UMIN8 rt, ra, rb | 8-bit Unsigned Minimum | rt.Bx = (ra.Bx u< rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
SMAX8 rt, ra, rb | 8-bit Signed Maximum | rt.Bx = (ra.Bx > rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
UMAX8 rt, ra, rb | 8-bit Unsigned Maximum | rt.Bx = (ra.Bx u> rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
KABS8 rt, ra | 8-bit Absolute Value | rt.Bx = SAT.Q7(ABS(ra.Bx)); (x=3..0)

2.2.8. 8-bit Unpacking Instructions

There are 8 instructions here.

Table 8. 8-bit Unpacking Instructions
Mnemonic | Instruction | Operation
SUNPKD810 rt, ra | Signed Unpacking Bytes 1 & 0 | rt.Hx = SE16(ra.By); (x,y) = (1,1), (0,0)
SUNPKD820 rt, ra | Signed Unpacking Bytes 2 & 0 | rt.Hx = SE16(ra.By); (x,y) = (1,2), (0,0)
SUNPKD830 rt, ra | Signed Unpacking Bytes 3 & 0 | rt.Hx = SE16(ra.By); (x,y) = (1,3), (0,0)
SUNPKD831 rt, ra | Signed Unpacking Bytes 3 & 1 | rt.Hx = SE16(ra.By); (x,y) = (1,3), (0,1)
ZUNPKD810 rt, ra | Unsigned Unpacking Bytes 1 & 0 | rt.Hx = ZE16(ra.By); (x,y) = (1,1), (0,0)
ZUNPKD820 rt, ra | Unsigned Unpacking Bytes 2 & 0 | rt.Hx = ZE16(ra.By); (x,y) = (1,2), (0,0)
ZUNPKD830 rt, ra | Unsigned Unpacking Bytes 3 & 0 | rt.Hx = ZE16(ra.By); (x,y) = (1,3), (0,0)
ZUNPKD831 rt, ra | Unsigned Unpacking Bytes 3 & 1 | rt.Hx = ZE16(ra.By); (x,y) = (1,3), (0,1)
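All eight unpacking instructions follow one pattern: pick two source bytes and widen each to a 16-bit half with sign or zero extension. The C sketch below expresses that pattern with the byte indices as parameters; `sunpkd8` and `zunpkd8` are hypothetical helper names (the real instructions encode the byte pair in the mnemonic), so SUNPKD831 corresponds to `sunpkd8(ra, 3, 1)`.

```c
#include <stdint.h>

/* ra.By: byte y of the register image, with y in 0..3. */
static uint8_t byte_of(uint32_t r, unsigned y)
{
    return (uint8_t)(r >> (8u * y));
}

/* SE16: sign-extend one byte to a 16-bit half. */
static uint16_t se16(uint8_t b)
{
    return (uint16_t)(int16_t)(int8_t)b;
}

/* ZE16: zero-extend one byte to a 16-bit half. */
static uint16_t ze16(uint8_t b)
{
    return (uint16_t)b;
}

/* Signed unpack: rt.H1 = SE16(ra.B[hi_byte]); rt.H0 = SE16(ra.B[lo_byte]) */
uint32_t sunpkd8(uint32_t ra, unsigned hi_byte, unsigned lo_byte)
{
    return ((uint32_t)se16(byte_of(ra, hi_byte)) << 16)
         | se16(byte_of(ra, lo_byte));
}

/* Unsigned unpack: rt.H1 = ZE16(ra.B[hi_byte]); rt.H0 = ZE16(ra.B[lo_byte]) */
uint32_t zunpkd8(uint32_t ra, unsigned hi_byte, unsigned lo_byte)
{
    return ((uint32_t)ze16(byte_of(ra, hi_byte)) << 16)
         | ze16(byte_of(ra, lo_byte));
}
```

With ra = 0x80FF7F01, `sunpkd8(ra, 3, 1)` gives 0xFF80007F, whereas `zunpkd8(ra, 3, 0)` gives 0x00800001.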

2.3. Non-SIMD Data Processing Instructions

2.3.1. 32-bit Addition/Subtraction Instructions

There are 4 instructions here.

Table 9. 32-bit Add/Sub Instructions
Mnemonic | Instruction | Operation
RADDW rt, ra, rb | 32-bit Signed Halving Addition | rt = (ra + rb) s>> 1
URADDW rt, ra, rb | 32-bit Unsigned Halving Addition | rt = (ra + rb) u>> 1
RSUBW rt, ra, rb | 32-bit Signed Halving Subtraction | rt = (ra - rb) s>> 1
URSUBW rt, ra, rb | 32-bit Unsigned Halving Subtraction | rt = (ra - rb) u>> 1
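A halving operation is useful only if the intermediate sum keeps its carry; computed in plain 32-bit arithmetic, (ra + rb) could overflow before the shift. The C sketch below models RADDW, URADDW, and RSUBW with 64-bit intermediates, which is an assumption drawn from the "keeping overflow by dropping 1 LSB" description of halving in section 2.2.1. The function names are hypothetical, and URSUBW is left out because the table does not say how a negative unsigned difference is treated.

```c
#include <stdint.h>

/* Arithmetic shift right by one on a 64-bit value, written as
 * floor(v / 2) so it does not depend on compiler shift behavior. */
static int64_t asr1_64(int64_t v)
{
    return v >= 0 ? v >> 1 : ~((~v) >> 1);
}

/* RADDW: rt = (ra + rb) s>> 1, with the sum held in 64 bits. */
int32_t raddw(int32_t ra, int32_t rb)
{
    return (int32_t)asr1_64((int64_t)ra + rb);
}

/* URADDW: rt = (ra + rb) u>> 1, with the sum held in 64 bits. */
uint32_t uraddw(uint32_t ra, uint32_t rb)
{
    return (uint32_t)(((uint64_t)ra + rb) >> 1);
}

/* RSUBW: rt = (ra - rb) s>> 1, with the difference held in 64 bits. */
int32_t rsubw(int32_t ra, int32_t rb)
{
    return (int32_t)asr1_64((int64_t)ra - rb);
}
```

Because the result is half of a 33-bit quantity, it always fits back into 32 bits; `raddw(INT32_MAX, INT32_MAX)` returns INT32_MAX rather than overflowing.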

2.3.2. 32-bit Shift Instructions

There are 5 instructions here.

Table 10. 32-bit Shift Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| SRA.u rt, ra, rb | Rounding Shift Right Arithmetic | rt = RUND(ra s>> rb[4:0]) |
| SRAI.u rt, ra, imm5u | Rounding Shift Right Arithmetic Immediate | rt = RUND(ra s>> imm5u) |
| KSLL rt, ra, rb | Saturating Shift Left Logical | rt = SAT.Q31(ra [...] |
| KSLLI rt, ra, imm5u | Saturating Shift Left Logical Immediate | [...] |
| [...] | [...] Rounding Shift Right Arithmetic | [...] SAT.U5(-rb[7:0])); if (rb[7:0] > 0) rt = SAT.Q31(ra [...] |

[...]

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| UCLIP32 Rd, Rs1, imm5u | Clip Value | if (Rs1 > 2^imm5u - 1) { Rd = 2^imm5u - 1; OV = 1; } else if (Rs1 < 0) { Rd = 0; OV = 1; } else { Rd = Rs1; } |
| SCLIP32 Rd, Rs1, imm5u | Clip Value Signed | if (Rs1 > 2^imm5u - 1) { Rd = 2^imm5u - 1; OV = 1; } else if (Rs1 < -2^imm5u) { Rd = -2^imm5u; OV = 1; } else { Rd = Rs1; } |
| CLZ Rd, Rs1 | Count leading zeros | Rd = COUNT_ZERO_FROM_MSB(Rs1) |
| CLO Rd, Rs1 | Count leading ones | Rd = COUNT_ONE_FROM_MSB(Rs1) |
| MAX Rd, Rs1, Rs2 | Return the larger signed value | Rd = signed-max(Rs1, Rs2) |
| MIN Rd, Rs1, Rs2 | Return the smaller signed value | Rd = signed-min(Rs1, Rs2) |
| AVE Rd, Rs1, Rs2 | Average two signed integers with rounding | Rd = (Rs1 + Rs2 + 1) (arith) >> 1 |
| PBSAD Rd, Rs1, Rs2 | Parallel Byte Sum of Absolute Difference | a = ABS(Rs1(7,0) - Rs2(7,0)); b = ABS(Rs1(15,8) - Rs2(15,8)); c = ABS(Rs1(23,16) - Rs2(23,16)); d = ABS(Rs1(31,24) - Rs2(31,24)); Rd = a + b + c + d; |
| PBSADA Rd, Rs1, Rs2 | Parallel Byte Sum of Absolute Difference Accumulate | a = ABS(Rs1(7,0) - Rs2(7,0)); b = ABS(Rs1(15,8) - Rs2(15,8)); c = ABS(Rs1(23,16) - Rs2(23,16)); d = ABS(Rs1(31,24) - Rs2(31,24)); Rd = Rd + a + b + c + d; |
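
The clip and byte sum-of-absolute-difference operations can be sketched as follows. This is an illustrative model under stated assumptions: the clip functions take `Rs1` as an already-signed integer and return a pair `(rd, ov)` where `ov` only says whether the operation would set OV (the table never shows OV being cleared by these instructions), and the PBSAD bytes are treated as unsigned, which the table does not state. All function names are hypothetical.

```python
def sclip32(rs1, imm5u):
    """Signed clip to [-2**imm5u, 2**imm5u - 1]; returns (rd, ov_set)."""
    upper = (1 << imm5u) - 1
    lower = -(1 << imm5u)
    if rs1 > upper:
        return upper, 1
    if rs1 < lower:
        return lower, 1
    return rs1, 0


def uclip32(rs1, imm5u):
    """Unsigned clip to [0, 2**imm5u - 1]; returns (rd, ov_set)."""
    upper = (1 << imm5u) - 1
    if rs1 > upper:
        return upper, 1
    if rs1 < 0:
        return 0, 1
    return rs1, 0


def pbsad(rs1, rs2):
    """Sum of absolute differences over the four bytes (assumed unsigned)."""
    return sum(
        abs(((rs1 >> shift) & 0xFF) - ((rs2 >> shift) & 0xFF))
        for shift in (0, 8, 16, 24)
    )


def pbsada(rd, rs1, rs2):
    """Accumulating form: adds the byte SAD to the previous value of Rd."""
    return rd + pbsad(rs1, rs2)
```

For example, `sclip32(300, 7)` clips to 127 and reports that OV would be set, while `pbsad(0x01020304, 0x04030201)` sums 3 + 1 + 1 + 3.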

2.3.9. Q31 saturation Instructions

The following table lists instructions related to Q31 arithmetic.

Table 17. Q31 saturation ALU Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| KADDW Rt, Ra, Rb | Add with Q31 saturation. | Rt = SAT.Q31(Ra + Rb) |
| KSUBW Rt, Ra, Rb | Subtract with Q31 saturation. | Rt = SAT.Q31(Ra - Rb) |
| KSLRAW Rt, Ra, Rb | Logical left shift or arithmetic right shift with Q31 saturation. | (Rb[7:0] >= 0) ? Rt = SAT.Q31(Ra [...] > -Rb[7:0]) |
| KDMBB Rt, Ra, Rb | Multiply Q15 numbers with Q31 saturation in bottom parts of two registers. | Rt = SAT.Q31(Ra.H0 * Rb.H0) |
| KDMTB Rt, Ra, Rb | Multiply Q15 numbers with Q31 saturation in top and bottom parts of two registers. | Rt = SAT.Q31(Ra.H1 * Rb.H0) |
| KDMBT Rt, Ra, Rb | Multiply Q15 numbers with Q31 saturation in bottom and top parts of two registers. | Rt = SAT.Q31(Ra.H0 * Rb.H1) |
| KDMTT Rt, Ra, Rb | Multiply Q15 numbers with Q31 saturation in top parts of two registers. | Rt = SAT.Q31(Ra.H1 * Rb.H1) |
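
As a rough illustration of the `SAT.Q31` rows for KADDW and KSUBW, the sketch below clamps a full-precision result to the signed 32-bit range. It assumes the operands are passed as signed Python integers already within that range; the names `sat_q31`, `kaddw`, and `ksubw` are illustrative, and any effect on the OV flag is left out.

```python
Q31_MAX = (1 << 31) - 1
Q31_MIN = -(1 << 31)


def sat_q31(value):
    """Clamp value to the signed 32-bit (Q31) range."""
    return max(Q31_MIN, min(Q31_MAX, value))


def kaddw(ra, rb):
    # Rt = SAT.Q31(Ra + Rb): the sum is formed exactly, then clamped.
    return sat_q31(ra + rb)


def ksubw(ra, rb):
    # Rt = SAT.Q31(Ra - Rb): the difference is formed exactly, then clamped.
    return sat_q31(ra - rb)
```

With this model, `kaddw(Q31_MAX, 1)` stays at `Q31_MAX` instead of wrapping to a negative value.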

2.3.10. Q15 saturation instructions

The following table lists instructions related to Q15 arithmetic.

Table 18. Q15 saturation ALU Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| KADDH Rt, Ra, Rb | Add with Q15 saturation. | Rt = SAT.Q15(Ra + Rb) |
| KSUBH Rt, Ra, Rb | Subtract with Q15 saturation. | Rt = SAT.Q15(Ra - Rb) |
| KHMBB Rt, Ra, Rb | Multiply Q15 numbers in bottom parts of two registers and extract high part with Q15 saturation. | Rt = SAT.Q15((Ra.H0 * Rb.H0) s>> 15) |
| KHMTB Rt, Ra, Rb | Multiply Q15 numbers in top and bottom parts of two registers and extract high part with Q15 saturation. | Rt = SAT.Q15((Ra.H1 * Rb.H0) s>> 15) |
| KHMBT Rt, Ra, Rb | Multiply Q15 numbers in bottom and top parts of two registers and extract high part with Q15 saturation. | Rt = SAT.Q15((Ra.H0 * Rb.H1) s>> 15) |
| KHMTT Rt, Ra, Rb | Multiply Q15 numbers in top parts of two registers and extract high part with Q15 saturation. | Rt = SAT.Q15((Ra.H1 * Rb.H1) s>> 15) |
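
The KHMxx rows can be sketched as below. This is a minimal model that assumes `H0` is bits 15..0 and `H1` is bits 31..16 of a 32-bit register, each read as a signed 16-bit value; the result is returned as a signed Python integer, and how it is extended into Rt is not modelled. The helper names are hypothetical.

```python
Q15_MAX = (1 << 15) - 1
Q15_MIN = -(1 << 15)


def half(reg, index):
    """Signed 16-bit halfword of a 32-bit register (0 = bottom, 1 = top)."""
    value = (reg >> (16 * index)) & 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sat_q15(value):
    """Clamp value to the signed 16-bit (Q15) range."""
    return max(Q15_MIN, min(Q15_MAX, value))


def khm(ra, rb, a_index, b_index):
    # Rt = SAT.Q15((Ra.Hx * Rb.Hy) s>> 15)
    return sat_q15((half(ra, a_index) * half(rb, b_index)) >> 15)


def khmbb(ra, rb):
    return khm(ra, rb, 0, 0)


def khmtb(ra, rb):
    return khm(ra, rb, 1, 0)


def khmbt(ra, rb):
    return khm(ra, rb, 0, 1)


def khmtt(ra, rb):
    return khm(ra, rb, 1, 1)
```

In this model the only input pair that actually saturates is -1.0 times -1.0: `khmbb(0x8000, 0x8000)` would give 32768 before clamping and returns 32767.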

2.3.11. Overflow status manipulation instructions

The following table lists the user instructions related to Overflow (OV) flag manipulation.

Table 19. OV (Overflow) flag Set/Clear Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| RDOV Rt | Read mxstatus.OV to Rt. | Rt = ZE32(mxstatus.OV) |
| CLROV | Clear mxstatus.OV flag. | mxstatus.OV = 0 |

2.4. 64-bit Instructions

2.4.1. 64-bit Addition & Subtraction Instructions

Table 20. 64-bit Add/Subtract Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| ADD64 rt, ra, rb | 64-bit Addition | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = a64 + b64; r[tU].r[tL] = t64; |
| RADD64 rt, ra, rb | 64-bit Signed Halving Addition | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 + b64) s>>1; r[tU].r[tL] = t64; |
| URADD64 rt, ra, rb | 64-bit Unsigned Halving Addition | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 + b64) u>>1; r[tU].r[tL] = t64; |
| KADD64 rt, ra, rb | 64-bit Signed Saturating Addition | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.Q63(a64 + b64); r[tU].r[tL] = t64; |
| UKADD64 rt, ra, rb | 64-bit Unsigned Saturating Addition | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.U64(a64 + b64); r[tU].r[tL] = t64; |
| SUB64 rt, ra, rb | 64-bit Subtraction | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = a64 - b64; r[tU].r[tL] = t64; |
| RSUB64 rt, ra, rb | 64-bit Signed Halving Subtraction | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 - b64) s>>1; r[tU].r[tL] = t64; |
| URSUB64 rt, ra, rb | 64-bit Unsigned Halving Subtraction | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 - b64) u>>1; r[tU].r[tL] = t64; |
| KSUB64 rt, ra, rb | 64-bit Signed Saturating Subtraction | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.Q63(a64 - b64); r[tU].r[tL] = t64; |
| UKSUB64 rt, ra, rb | 64-bit Unsigned Saturating Subtraction | a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.U64(a64 - b64); r[tU].r[tL] = t64; |
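
The register-pair pattern used throughout this table can be sketched as follows. The model reads `r[xU].r[xL]` as the concatenation of an upper and a lower 32-bit register, represents each pair as an `(upper, lower)` tuple, and assumes the plain ADD64 wraps modulo 2^64. How a register operand maps to its pair is not covered here. All names are illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
Q63_MAX = (1 << 63) - 1
Q63_MIN = -(1 << 63)


def join(upper, lower):
    """a64 = r[aU].r[aL]: concatenate two 32-bit registers."""
    return ((upper & MASK32) << 32) | (lower & MASK32)


def split(value):
    """r[tU].r[tL] = t64: write a 64-bit value back as (upper, lower)."""
    value &= MASK64
    return value >> 32, value & MASK32


def s64(value):
    """Interpret the low 64 bits as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def add64(a, b):
    # Plain addition; wrap-around on overflow is an assumption.
    return split(join(*a) + join(*b))


def kadd64(a, b):
    # t64 = SAT.Q63(a64 + b64)
    total = s64(join(*a)) + s64(join(*b))
    return split(max(Q63_MIN, min(Q63_MAX, total)))


def ukadd64(a, b):
    # t64 = SAT.U64(a64 + b64)
    return split(min(MASK64, join(*a) + join(*b)))


def uksub64(a, b):
    # t64 = SAT.U64(a64 - b64): clamps at zero instead of borrowing.
    return split(max(0, join(*a) - join(*b)))
```

For instance, `add64((0, 0xFFFFFFFF), (0, 1))` carries into the upper register and returns `(1, 0)`.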

2.4.2. 32-bit Multiply with 64-bit Add/Subtract Instructions

Table 21. 32-bit Multiply 64-bit Add/Subtract Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| SMAR64 rt, ra, rb | 32x32 with 64-bit Signed Addition | c64 = r[tU].r[tL]; t64 = c64 + ra*rb; r[tU].r[tL] = t64; // signed |
| SMSR64 rt, ra, rb | 32x32 with 64-bit Signed Subtraction | c64 = r[tU].r[tL]; t64 = c64 - ra*rb; r[tU].r[tL] = t64; // signed |
| UMAR64 rt, ra, rb | 32x32 with 64-bit Unsigned Addition | c64 = r[tU].r[tL]; t64 = c64 + ra*rb; r[tU].r[tL] = t64; // unsigned |
| UMSR64 rt, ra, rb | 32x32 with 64-bit Unsigned Subtraction | c64 = r[tU].r[tL]; t64 = c64 - ra*rb; r[tU].r[tL] = t64; // unsigned |
| KMAR64 rt, ra, rb | 32x32 with Saturating 64-bit Signed Addition | c64 = r[tU].r[tL]; t64 = SAT.Q63(c64 + ra*rb); r[tU].r[tL] = t64; |
| KMSR64 rt, ra, rb | 32x32 with Saturating 64-bit Signed Subtraction | c64 = r[tU].r[tL]; t64 = SAT.Q63(c64 - ra*rb); r[tU].r[tL] = t64; |
| UKMAR64 rt, ra, rb | 32x32 with Saturating 64-bit Unsigned Addition | c64 = r[tU].r[tL]; t64 = SAT.U64(c64 + ra*rb); r[tU].r[tL] = t64; |
| UKMSR64 rt, ra, rb | 32x32 with Saturating 64-bit Unsigned Subtraction | c64 = r[tU].r[tL]; t64 = SAT.U64(c64 - ra*rb); r[tU].r[tL] = t64; |
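
A small model may help show how the signed, unsigned, and saturating multiply-accumulate forms differ. In this sketch the 64-bit accumulator `c64` is passed and returned as a single unsigned 64-bit integer instead of a register pair, and the non-saturating forms are assumed to wrap modulo 2^64. The function names mirror the mnemonics but are otherwise illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
Q63_MAX = (1 << 63) - 1
Q63_MIN = -(1 << 63)


def s32(value):
    """Low 32 bits as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value >> 31 else value


def s64(value):
    """Low 64 bits as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def smar64(c64, ra, rb):
    # Signed 32x32 product added to the accumulator; wrap is an assumption.
    return (s64(c64) + s32(ra) * s32(rb)) & MASK64


def umar64(c64, ra, rb):
    # Unsigned 32x32 product added to the accumulator; wrap is an assumption.
    return ((c64 & MASK64) + (ra & MASK32) * (rb & MASK32)) & MASK64


def kmar64(c64, ra, rb):
    # t64 = SAT.Q63(c64 + ra*rb)
    total = s64(c64) + s32(ra) * s32(rb)
    return max(Q63_MIN, min(Q63_MAX, total)) & MASK64


def kmsr64(c64, ra, rb):
    # t64 = SAT.Q63(c64 - ra*rb)
    total = s64(c64) - s32(ra) * s32(rb)
    return max(Q63_MIN, min(Q63_MAX, total)) & MASK64


def ukmar64(c64, ra, rb):
    # t64 = SAT.U64(c64 + ra*rb)
    return min(MASK64, (c64 & MASK64) + (ra & MASK32) * (rb & MASK32))
```

The same bit pattern gives different results depending on signedness: with `ra = 0xFFFFFFFF` and `rb = 2`, `smar64` adds -2 while `umar64` adds 0x1FFFFFFFE.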

2.4.3. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions

Table 22. Signed 16-bit Multiply 64-bit Add/Subtract Instructions

| Mnemonic | Instruction | Operation |
| --- | --- | --- |
| SMALBB rt, ra, rb | “Bottom 16 x Bottom 16” with 64-bit Signed Addition (64 = 64 + 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.L; r[tU].r[tL] = t64; |
| SMALBT rt, ra, rb | “Bottom 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.H; r[tU].r[tL] = t64; |
| SMALTT rt, ra, rb | “Top 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H; r[tU].r[tL] = t64; |
| SMALDA rt, ra, rb | Two “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H + ra.L*rb.L; r[tU].r[tL] = t64; |
| SMALXDA rt, ra, rb | Two Crossed “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.L + ra.L*rb.H; r[tU].r[tL] = t64; |
| SMALDS rt, ra, rb | Two “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H - ra.L*rb.L; r[tU].r[tL] = t64; |
| SMALDRS rt, ra, rb | Two “16x16” with 64-bit Signed Addition and Reversed Subtraction (64 = 64 + 16x16 - 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.L - ra.H*rb.H; r[tU].r[tL] = t64; |
| SMALXDS rt, ra, rb | Two Crossed “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16) | c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.L - ra.L*rb.H; r[tU].r[tL] = t64; |
| SMSLDA rt, ra, rb | Two “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16) | c64 = r[tU].r[tL]; t64 = c64 - ra.H*rb.H - ra.L*rb.L; r[tU].r[tL] = t64; |
| SMSLXDA rt, ra, rb | Two Crossed “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16) | c64 = r[tU].r[tL]; t64 = c64 - ra.H*rb.L - ra.L*rb.H; r[tU].r[tL] = t64; |
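
The straight, crossed, and subtracting variants above differ only in which halfwords are paired and in the signs of the products. The sketch below models a few of them, assuming `.L` is bits 15..0 and `.H` is bits 31..16, both read as signed 16-bit values, with the accumulator handled as a signed Python integer that wraps to 64 bits. The helper names are hypothetical.

```python
def s16(value):
    """Low 16 bits as a two's-complement signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def lo(reg):
    """reg.L: bottom halfword (assumed bits 15..0)."""
    return s16(reg)


def hi(reg):
    """reg.H: top halfword (assumed bits 31..16)."""
    return s16(reg >> 16)


def wrap64(value):
    """Wrap to signed 64-bit; wrap-around on overflow is an assumption."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >> 63 else value


def smalbb(c64, ra, rb):
    return wrap64(c64 + lo(ra) * lo(rb))


def smalbt(c64, ra, rb):
    return wrap64(c64 + lo(ra) * hi(rb))


def smalda(c64, ra, rb):
    return wrap64(c64 + hi(ra) * hi(rb) + lo(ra) * lo(rb))


def smalxda(c64, ra, rb):
    return wrap64(c64 + hi(ra) * lo(rb) + lo(ra) * hi(rb))


def smalds(c64, ra, rb):
    return wrap64(c64 + hi(ra) * hi(rb) - lo(ra) * lo(rb))


def smaldrs(c64, ra, rb):
    return wrap64(c64 + lo(ra) * lo(rb) - hi(ra) * hi(rb))


def smalxds(c64, ra, rb):
    return wrap64(c64 + hi(ra) * lo(rb) - lo(ra) * hi(rb))


def smslda(c64, ra, rb):
    return wrap64(c64 - hi(ra) * hi(rb) - lo(ra) * lo(rb))
```

With `ra = 0x00020003` (H = 2, L = 3) and `rb = 0x00040005` (H = 4, L = 5), `smalda(10, ra, rb)` accumulates 2*4 + 3*5 onto 10, while `smalxda` pairs the halves crosswise and accumulates 2*5 + 3*4.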

2.5. Zero-Overhead Loop (ZOL) Mechanism Instructions

The following table lists the instructions in the Zero-Overhead Loop Mechanism. Table 23. ZOL Mechanism Instructions Mnemonic MTLBI imm16s MTLEI imm16s

Instruction Move to Loop Begin register Immediate. Move to Loop End register Immediate.

Operation LB = PC + SE32(imm16s