Instruction Summary for "P" ISA Extension Proposal
Document Number
xxxxxxxxx
Date Issued
2017-11-23
Copyright © 2017 Andes Technology Corporation. All rights reserved.
Copyright Notice Copyright © 2017 Andes Technology Corporation. All rights reserved. AndesCore™, AndeShape™, AndeSight™, AndESLive™, AndeSoft™, AndeStar™, AICE™, AICE-MCU™, AICE-MINI™, Andes Custom Extension™, and COPILOT™ are trademarks owned by Andes Technology Corporation. All other trademarks used herein are the property of their respective owners. This document contains confidential information of Andes Technology Corporation. Use of this copyright notice is precautionary and does not imply publication or disclosure. Neither the whole nor part of the information contained herein may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into any language in any form by any means without the written permission of Andes Technology Corporation. The product described herein is subject to continuous development and improvement; information herein is given by Andes in good faith but without warranties. This document is intended only to assist the reader in the use of the product. Andes Technology Corporation shall not be liable for any loss or damage arising from the use of any information in this document, or any incorrect use of the product.
Contact Information Should you have any problems with the information contained herein, please contact Andes Technology Corporation by email
[email protected] or online website https://es.andestech.com/eservice/ for support giving:
the document title
the document number
the page number(s) to which your comments apply
a concise explanation of the problem
General suggestions for improvements are welcome.
Revision History
Rev.    Revision Date    Revised Chapter-Section    Revised Content
0.1     2017/11/20       All                        Initial release
The information contained herein is the exclusive property of Andes Technology Co. and shall not be distributed, reproduced, or disclosed in whole or in part without prior written permission of Andes Technology Corporation. P_DSP_ISA_EXT_Summary.docx
Table of Contents
Copyright Notice
Contact Information
Revision History
Table of Contents
List of Tables
1. Introduction
  1.1. SIMD Instructions
  1.2. Non-SIMD Instructions
  1.3. Zero-overhead Loop Mechanism
2. DSP ISA Extension Instruction Summary
  2.1. Shorthand Definitions
  2.2. SIMD Data Processing Instructions
    2.2.1. 16-bit Addition & Subtraction Instructions
    2.2.2. 8-bit Addition & Subtraction Instructions
    2.2.3. 16-bit Shift Instructions
    2.2.4. 16-bit Compare Instructions
    2.2.5. 8-bit Compare Instructions
    2.2.6. 16-bit Misc Instructions
    2.2.7. 8-bit Misc Instructions
    2.2.8. 8-bit Unpacking Instructions
  2.3. Non-SIMD Data Processing Instructions
    2.3.1. 32-bit Addition/Subtraction Instructions
    2.3.2. 32-bit Shift Instructions
    2.3.3. 16-bit Packing Instructions
    2.3.4. Most Significant Word “32x32” Multiply & Add Instructions
    2.3.5. Most Significant Word “32x16” Multiply & Add Instructions
    2.3.6. Signed 16-bit Multiply with 32-bit Add/Subtract Instructions
    2.3.7. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
    2.3.8. Miscellaneous Instructions
    2.3.9. Q31 saturation Instructions
    2.3.10. Q15 saturation instructions
    2.3.11. Overflow status manipulation instructions
  2.4. 64-bit Instructions
    2.4.1. 64-bit Addition & Subtraction Instructions
    2.4.2. 32-bit Multiply with 64-bit Add/Subtract Instructions
    2.4.3. Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
  2.5. Zero-Overhead Loop (ZOL) Mechanism Instructions
3. User-mode CSR Registers
  3.1. Loop Begin Register
  3.2. Loop End Register
  3.3. Loop Count Register
List of Tables
Table 1. SIMD 16-bit Add/Subtract Instructions
Table 2. SIMD 8-bit Add/Subtract Instructions
Table 3. SIMD 16-bit Shift Instructions
Table 4. SIMD 16-bit Compare Instructions
Table 5. SIMD 8-bit Compare Instructions
Table 6. SIMD 16-bit Miscellaneous Instructions
Table 7. SIMD 8-bit Miscellaneous Instructions
Table 8. 8-bit Unpacking Instructions
Table 9. 32-bit Add/Sub Instructions
Table 10. 32-bit Shift Instructions
Table 11. 16-bit Packing Instructions
Table 12. Signed MSW 32x32 Multiply and Add Instructions
Table 13. Signed MSW 32x16 Multiply and Add Instructions
Table 14. Signed 16-bit Multiply 32-bit Add/Subtract Instructions
Table 15. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Table 16. Miscellaneous Instructions
Table 17. Q31 saturation ALU Instructions
Table 18. Q15 saturation ALU Instructions
Table 19. OV (Overflow) Flag Set/Clear Instructions
Table 20. 64-bit Add/Subtract Instructions
Table 21. 32-bit Multiply 64-bit Add/Subtract Instructions
Table 22. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Table 23. ZOL Mechanism Instructions
Typographical Convention Index
Document Element                                                      Font              Font Style         Size    Color
Normal text                                                           Georgia           Normal             12      Black
Command line, source code or file paths                               Lucida Console    Normal             11      Indigo
VARIABLES OR PARAMETERS IN COMMAND LINE, SOURCE CODE OR FILE PATHS    LUCIDA CONSOLE    BOLD + ALL-CAPS    11      INDIGO
Note or warning                                                       Georgia           Normal             12      Red
Hyperlink                                                             Georgia           Underlined         12      Blue
1. Introduction
Digital Signal Processing (DSP) has emerged as an important technology for modern electronic systems. A wide range of modern applications employ DSP algorithms to solve problems in their particular domains, including sensor fusion, servo motor control, audio decode/encode, speech synthesis and coding, MPEG4 decode, medical imaging, computer vision, embedded control, robotics, and human interface. The AndeStar™ DSP instruction set extension increases the DSP algorithm processing capabilities of the AndesCore™ CPU IP products. With this extension, AndesCore CPUs can run these DSP applications with lower power and higher performance. The extension adds 8-bit and 16-bit SIMD instructions to increase the throughput of 8-bit and 16-bit DSP computations, so more work can be done in a fixed time slot or a task can be completed faster. It also adds enhanced 16-bit, 32-bit, and 64-bit non-SIMD instructions to speed up frequent operations in DSP algorithms. To reduce the looping overhead of repeated performance-critical DSP computations, the extension also includes a hardware zero-overhead loop mechanism.
1.1. SIMD Instructions
Using the AndeStar V5 baseline 32-bit registers, four 8-bit operations or two 16-bit operations can be performed in parallel to maximize the throughput of 8-bit and 16-bit computations. Many DSP applications can benefit from this, so this DSP instruction set extension adds many 8-bit and 16-bit SIMD instructions. The 8-bit SIMD instructions include a variety of signed/unsigned addition and subtraction operations, signed/unsigned comparison operations, signed/unsigned maximum and minimum operations, signed/unsigned unpacking operations, and a signed absolute value operation.
The 16-bit SIMD instructions include a variety of signed/unsigned addition and subtraction operations, different types of shift operations, signed/unsigned comparison operations, signed/unsigned maximum and minimum operations, signed/unsigned multiplication operations, signed/unsigned clipping operations, and a signed absolute value operation.
1.2. Non-SIMD Instructions
The non-SIMD instructions in this DSP extension include 16-bit packing operations; Q15 and Q31 saturating addition, subtraction, and multiplication operations; 32-bit signed/unsigned halving addition and subtraction operations; 32-bit saturating left shift and rounding right shift operations; most significant word “32x32 multiply & add” operations; most significant word “32x16 multiply & add” operations; a variety of 32-bit accumulation or subtraction with 16-bit multiplication operations; and bit reverse, bit-wise selection, byte insertion, and 32-bit word extraction from 64-bit data operations. To speed up 64-bit operations in DSP applications, this extension also includes a variety of 64-bit addition and subtraction operations, signed/unsigned 64-bit accumulation or subtraction with 32-bit multiplication operations, and signed 64-bit accumulation or subtraction with 16-bit multiplication operations.
1.3. Zero-overhead Loop Mechanism
A zero-overhead loop (ZOL) mechanism is provided to reduce the instruction fetch and execution overhead of loop-control instructions. Three user-mode CSR registers support this mechanism.
LB: stores the starting address of a loop. It is 32 bits wide. It can be written with the “MTLBI” instruction.
LE: stores the ending address of a loop. It is 32 bits wide. The value of LE should be greater than or equal to LB; if this rule is violated, the behavior is UNPREDICTABLE. It can be written with the “MTLEI” instruction.
LC: contains the number of times the zero-overhead looping operation will be performed. It is 32 bits wide. When LC is greater than 1, any execution of an instruction at an address that matches the value of LE causes the Program Counter to change to the content of LB and decrements LC by 1. When LC is less than or equal to 1, the zero-overhead loop mechanism is turned off. The mechanism is intended for any loop that needs to be executed at least once. For example:
    do { ......... } until (count > 4);
The zero-overhead looping operation can be summarized as follows:
    if ((LC > 1) && (PC == LE)) {
        LC = LC - 1;
        PC = LB;
    }
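The looping rule can be exercised with a small software model. The following C sketch is illustrative only: the names `zol_state` and `zol_run` are hypothetical, instructions are assumed to be a fixed 4 bytes long, and no other branches are modeled. It counts how many times the instruction at LE executes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of the three ZOL CSRs (not an Andes API). */
typedef struct {
    uint32_t lb; /* loop begin address */
    uint32_t le; /* loop end address   */
    uint32_t lc; /* loop count         */
} zol_state;

/*
 * Step a program counter from `start` until it passes `stop`, applying
 *   if ((LC > 1) && (PC == LE)) { LC = LC - 1; PC = LB; }
 * after each instruction. Returns how often the instruction at LE ran.
 * Assumes fixed 4-byte instructions and straight-line code otherwise.
 */
static uint32_t zol_run(zol_state *z, uint32_t start, uint32_t stop)
{
    uint32_t pc = start;
    uint32_t hits = 0;

    assert(z->le >= z->lb); /* LE < LB is UNPREDICTABLE per the text */

    while (pc <= stop) {
        int at_end = (pc == z->le);
        if (at_end)
            hits++;
        if (at_end && z->lc > 1) {
            z->lc -= 1;
            pc = z->lb;   /* loop back without a branch instruction */
        } else {
            pc += 4;      /* fall through to the next instruction   */
        }
    }
    return hits;
}
```

In this model, starting with LC = 4 makes the loop body run four times and leaves LC at 1, and LC = 1 makes it run once, which matches the "at least once" behaviour described above.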
2. DSP ISA Extension Instruction Summary
2.1. Shorthand Definitions
r.H == r.H1: r[31:16], r.L == r.H0: r[15:0]
r.B3: r[31:24], r.B2: r[23:16], r.B1: r[15:8], r.B0: r[7:0]
r[xU]: the upper 32 bits of a 64-bit number; xU is the number of the GPR that contains this upper 32-bit value.
r[xL]: the lower 32 bits of a 64-bit number; xL is the number of the GPR that contains this lower 32-bit value.
r[xU].r[xL]: a 64-bit number formed from a pair of GPRs.
s>>: signed arithmetic right shift.
u>>: unsigned logical right shift.
SAT.Qn(): saturate to the range [-2^n, 2^n - 1]; if saturation happens, set PSW.OV.
SAT.Um(): saturate to the range [0, 2^m - 1]; if saturation happens, set PSW.OV.
RUND(): indicates “rounding”, i.e., add 1 to the most significant discarded bit for right shift or MSW-type multiplication instructions.
Sign- or zero-extending functions:
SEm(data): sign-extend data to m bits.
ZEm(data): zero-extend data to m bits.
ABS(x): calculate the absolute value of “x”.
CONCAT(x,y): concatenate “x” and “y” to form a value.
u<: unsigned less-than comparison.
u>: unsigned greater-than comparison.
s*: signed multiplication.
u*: unsigned multiplication.
rt is Rd in RISC-V ISA terminology.
ra is Rs1 and rb is Rs2 in RISC-V ISA terminology.
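To make the SAT.Qn(), SAT.Um() and RUND() shorthands concrete, here is a minimal C sketch. The function names and the `psw_ov` variable are hypothetical stand-ins chosen for illustration, not part of the specification, and the rounding helper only covers the right-shift case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the PSW.OV overflow flag. */
static int psw_ov;

/* SAT.Qn(v): clamp v to [-2^n, 2^n - 1]; set OV if clamping happened. */
static int64_t sat_q(int64_t v, unsigned n)
{
    assert(n >= 1 && n <= 62);
    const int64_t max = ((int64_t)1 << n) - 1;
    const int64_t min = -max - 1;
    if (v > max) { psw_ov = 1; return max; }
    if (v < min) { psw_ov = 1; return min; }
    return v;
}

/* SAT.Um(v): clamp v to [0, 2^m - 1]; set OV if clamping happened. */
static int64_t sat_u(int64_t v, unsigned m)
{
    assert(m >= 1 && m <= 62);
    const int64_t max = ((int64_t)1 << m) - 1;
    if (v > max) { psw_ov = 1; return max; }
    if (v < 0)   { psw_ov = 1; return 0; }
    return v;
}

/*
 * RUND(x s>> sh): add 1 at the most significant discarded bit, then shift.
 * Relies on >> of a negative signed value being an arithmetic shift, which
 * is implementation-defined in C but holds on mainstream compilers.
 */
static int32_t rund_sra(int32_t x, unsigned sh)
{
    assert(sh <= 31);
    if (sh == 0)
        return x;
    return (int32_t)(((int64_t)x + ((int64_t)1 << (sh - 1))) >> sh);
}
```

For example, `sat_q(40000, 15)` yields 32767 and sets the flag, while `rund_sra(5, 1)` yields 3 because the discarded bit is rounded up.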
2.2. SIMD Data Processing Instructions
2.2.1. 16-bit Addition & Subtraction Instructions
The SIMD 16-bit add/subtract instructions support 4 types of operations: Addition (two 16-bit additions), Subtraction (two 16-bit subtractions), Crossed Add & Sub, and Crossed Sub & Add. The overflow handling of these instructions can have 5 variations: Wrap-around (dropping overflow), Signed Halving (keeping overflow by dropping 1 LSB bit), Unsigned Halving, Signed Saturation (clipping overflow), and Unsigned Saturation. Together, there are 20 SIMD 16-bit add/subtract instructions.
Table 1. SIMD 16-bit Add/Subtract Instructions
Mnemonic              Instruction                                 Operation
ADD16 rt, ra, rb      16-bit Addition                             rt.Hx = ra.Hx + rb.Hx; (x=1..0)
RADD16 rt, ra, rb     16-bit Signed Halving Addition              rt.Hx = (ra.Hx + rb.Hx) s>> 1; (x=1..0)
URADD16 rt, ra, rb    16-bit Unsigned Halving Addition            rt.Hx = (ra.Hx + rb.Hx) u>> 1; (x=1..0)
KADD16 rt, ra, rb     16-bit Signed Saturating Addition           rt.Hx = SAT.Q15(ra.Hx + rb.Hx); (x=1..0)
UKADD16 rt, ra, rb    16-bit Unsigned Saturating Addition         rt.Hx = SAT.U16(ra.Hx + rb.Hx); (x=1..0)
SUB16 rt, ra, rb      16-bit Subtraction                          rt.Hx = ra.Hx - rb.Hx; (x=1..0)
RSUB16 rt, ra, rb     16-bit Signed Halving Subtraction           rt.Hx = (ra.Hx - rb.Hx) s>> 1; (x=1..0)
URSUB16 rt, ra, rb    16-bit Unsigned Halving Subtraction         rt.Hx = (ra.Hx - rb.Hx) u>> 1; (x=1..0)
KSUB16 rt, ra, rb     16-bit Signed Saturating Subtraction        rt.Hx = SAT.Q15(ra.Hx - rb.Hx); (x=1..0)
UKSUB16 rt, ra, rb    16-bit Unsigned Saturating Subtraction      rt.Hx = SAT.U16(ra.Hx - rb.Hx); (x=1..0)
CRAS16 rt, ra, rb     16-bit Cross Add & Sub                      rt.H = ra.H + rb.L; rt.L = ra.L - rb.H;
RCRAS16 rt, ra, rb    16-bit Signed Halving Cross Add & Sub       rt.H = (ra.H + rb.L) s>> 1; rt.L = (ra.L - rb.H) s>> 1;
URCRAS16 rt, ra, rb   16-bit Unsigned Halving Cross Add & Sub     rt.H = (ra.H + rb.L) u>> 1; rt.L = (ra.L - rb.H) u>> 1;
KCRAS16 rt, ra, rb    16-bit Signed Saturating Cross Add & Sub    rt.H = SAT.Q15(ra.H + rb.L); rt.L = SAT.Q15(ra.L - rb.H);
UKCRAS16 rt, ra, rb   16-bit Unsigned Saturating Cross Add & Sub  rt.H = SAT.U16(ra.H + rb.L); rt.L = SAT.U16(ra.L - rb.H);
CRSA16 rt, ra, rb     16-bit Cross Sub & Add                      rt.H = ra.H - rb.L; rt.L = ra.L + rb.H;
RCRSA16 rt, ra, rb    16-bit Signed Halving Cross Sub & Add       rt.H = (ra.H - rb.L) s>> 1; rt.L = (ra.L + rb.H) s>> 1;
URCRSA16 rt, ra, rb   16-bit Unsigned Halving Cross Sub & Add     rt.H = (ra.H - rb.L) u>> 1; rt.L = (ra.L + rb.H) u>> 1;
KCRSA16 rt, ra, rb    16-bit Signed Saturating Cross Sub & Add    rt.H = SAT.Q15(ra.H - rb.L); rt.L = SAT.Q15(ra.L + rb.H);
UKCRSA16 rt, ra, rb   16-bit Unsigned Saturating Cross Sub & Add  rt.H = SAT.U16(ra.H - rb.L); rt.L = SAT.U16(ra.L + rb.H);
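As an illustration of how the overflow variations in Table 1 differ, the following C sketch models ADD16 (wrap-around), KADD16 (signed saturation) and CRAS16 (crossed add and subtract) on plain 32-bit values. The function names are hypothetical, the OV flag is not modeled, and the narrowing casts assume the usual two's-complement behaviour.

```c
#include <stdint.h>

/* Clamp to the Q15 range [-32768, 32767] (OV flag not modeled here). */
static int32_t sat_q15(int32_t v)
{
    return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

/* ADD16: rt.Hx = ra.Hx + rb.Hx, overflow is dropped per lane. */
static uint32_t add16(uint32_t ra, uint32_t rb)
{
    uint32_t hi = ((ra >> 16) + (rb >> 16)) & 0xFFFFu;
    uint32_t lo = ((ra & 0xFFFFu) + (rb & 0xFFFFu)) & 0xFFFFu;
    return (hi << 16) | lo;
}

/* KADD16: rt.Hx = SAT.Q15(ra.Hx + rb.Hx), lanes treated as signed. */
static uint32_t kadd16(uint32_t ra, uint32_t rb)
{
    int32_t hi = sat_q15((int16_t)(ra >> 16) + (int16_t)(rb >> 16));
    int32_t lo = sat_q15((int16_t)ra + (int16_t)rb);
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

/* CRAS16: rt.H = ra.H + rb.L; rt.L = ra.L - rb.H (wrap-around). */
static uint32_t cras16(uint32_t ra, uint32_t rb)
{
    uint32_t hi = ((ra >> 16) + (rb & 0xFFFFu)) & 0xFFFFu;
    uint32_t lo = ((ra & 0xFFFFu) - (rb >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}
```

With ra = 0x7FFF0001 and rb = 0x00010001, the upper lane of `add16` wraps to 0x8000 while the upper lane of `kadd16` stays clipped at 0x7FFF.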
2.2.2. 8-bit Addition & Subtraction Instructions
The SIMD 8-bit add/subtract instructions support 2 types of operations: Addition (four 8-bit additions) and Subtraction (four 8-bit subtractions). The overflow handling of these instructions can have 5 variations: Wrap-around (dropping overflow), Signed Halving (keeping overflow by dropping 1 LSB bit), Unsigned Halving, Signed Saturation (clipping overflow), and Unsigned Saturation. Together, there are 10 SIMD 8-bit add/subtract instructions.
Table 2. SIMD 8-bit Add/Subtract Instructions
Mnemonic              Instruction                               Operation
ADD8 rt, ra, rb       8-bit Addition                            rt.Bx = ra.Bx + rb.Bx; (x=3..0)
RADD8 rt, ra, rb      8-bit Signed Halving Addition             rt.Bx = (ra.Bx + rb.Bx) s>> 1; (x=3..0)
URADD8 rt, ra, rb     8-bit Unsigned Halving Addition           rt.Bx = (ra.Bx + rb.Bx) u>> 1; (x=3..0)
KADD8 rt, ra, rb      8-bit Signed Saturating Addition          rt.Bx = SAT.Q7(ra.Bx + rb.Bx); (x=3..0)
UKADD8 rt, ra, rb     8-bit Unsigned Saturating Addition        rt.Bx = SAT.U8(ra.Bx + rb.Bx); (x=3..0)
SUB8 rt, ra, rb       8-bit Subtraction                         rt.Bx = ra.Bx - rb.Bx; (x=3..0)
RSUB8 rt, ra, rb      8-bit Signed Halving Subtraction          rt.Bx = (ra.Bx - rb.Bx) s>> 1; (x=3..0)
URSUB8 rt, ra, rb     8-bit Unsigned Halving Subtraction        rt.Bx = (ra.Bx - rb.Bx) u>> 1; (x=3..0)
KSUB8 rt, ra, rb      8-bit Signed Saturating Subtraction       rt.Bx = SAT.Q7(ra.Bx - rb.Bx); (x=3..0)
UKSUB8 rt, ra, rb     8-bit Unsigned Saturating Subtraction     rt.Bx = SAT.U8(ra.Bx - rb.Bx); (x=3..0)
2.2.3. 16-bit Shift Instructions
There are 13 instructions here.
Table 3. SIMD 16-bit Shift Instructions
Mnemonic                 Instruction                                        Operation
SRA16 rt, ra, rb         16-bit Shift Right Arithmetic                      rt.Hx = ra.Hx s>> rb[3:0]; (x=1..0)
SRAI16 rt, ra, im4u      16-bit Shift Right Arithmetic Immediate            rt.Hx = ra.Hx s>> im4u; (x=1..0)
SRA16.u rt, ra, rb       16-bit Rounding Shift Right Arithmetic             rt.Hx = RUND(ra.Hx s>> rb[3:0]); (x=1..0)
SRAI16.u rt, ra, im4u    16-bit Rounding Shift Right Arithmetic Immediate   rt.Hx = RUND(ra.Hx s>> im4u); (x=1..0)
SRL16 rt, ra, rb         16-bit Shift Right Logical                         rt.Hx = ra.Hx u>> rb[3:0]; (x=1..0)
SRLI16 rt, ra, im4u      16-bit Shift Right Logical Immediate               rt.Hx = ra.Hx u>> im4u; (x=1..0)
SRL16.u rt, ra, rb       16-bit Rounding Shift Right Logical                rt.Hx = RUND(ra.Hx u>> rb[3:0]); (x=1..0)
SRLI16.u rt, ra, im4u    16-bit Rounding Shift Right Logical Immediate      rt.Hx = RUND(ra.Hx u>> im4u); (x=1..0)
SLL16 rt, ra, rb         16-bit Shift Left Logical
rt.Hx = ra.Hx 0) rt.Hx = SAT.Q15(ra.Hx > -rb[4:0]);
Saturation & Rounding Shift Right
if (rb[4:0] > 0)
Arithmetic
rt.Hx = SAT.Q15(ra.Hx >
KHMX16 rt, ra, rb
16-bit Crossed Signed Multiply
15); (x,y)=(1,0), (0,1)
SMUL16 rt, ra, rb     16-bit Signed Multiply to 32-bit            r[tU] = ra.H1 s* rb.H1; r[tL] = ra.H0 s* rb.H0;
SMULX16 rt, ra, rb    16-bit Signed Crossed Multiply to 32-bit    r[tU] = ra.H1 s* rb.H0; r[tL] = ra.H0 s* rb.H1;
UMUL16 rt, ra, rb     16-bit Unsigned Multiply to 32-bit          r[tU] = ra.H1 u* rb.H1; r[tL] = ra.H0 u* rb.H0;
UMULX16 rt, ra, rb    16-bit Unsigned Crossed Multiply to 32-bit  r[tU] = ra.H1 u* rb.H0; r[tL] = ra.H0 u* rb.H1;
KABS16 rt, ra         16-bit Absolute Value                       rt.Hx = SAT.Q15(ABS(ra.Hx)); (x=1..0)
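The SMUL16 and KABS16 rows can be sketched in C as follows. This is only an illustration: the function names are hypothetical, the register pair r[tU].r[tL] is modeled as a single 64-bit return value with r[tU] in the upper 32 bits, and the OV flag is not modeled.

```c
#include <stdint.h>

/*
 * SMUL16: r[tU] = ra.H1 s* rb.H1; r[tL] = ra.H0 s* rb.H0.
 * Each 16x16 signed product fits in 32 bits, so no overflow can occur.
 */
static uint64_t smul16(uint32_t ra, uint32_t rb)
{
    int32_t hi = (int32_t)(int16_t)(ra >> 16) * (int16_t)(rb >> 16);
    int32_t lo = (int32_t)(int16_t)ra * (int16_t)rb;
    return ((uint64_t)(uint32_t)hi << 32) | (uint32_t)lo;
}

/* KABS16: rt.Hx = SAT.Q15(ABS(ra.Hx)) for x = 1..0. */
static uint32_t kabs16(uint32_t ra)
{
    uint32_t out = 0;
    for (int x = 1; x >= 0; x--) {
        int32_t v = (int16_t)(ra >> (16 * x));
        if (v < 0)
            v = -v;
        if (v > 32767)
            v = 32767; /* ABS(-32768) does not fit in Q15, so it saturates */
        out |= (uint32_t)v << (16 * x);
    }
    return out;
}
```

The saturation in `kabs16` matters only for the lane value 0x8000, whose absolute value is one past the largest Q15 number.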
2.2.7. 8-bit Misc Instructions
There are 5 instructions here.
Table 7. SIMD 8-bit Miscellaneous Instructions
Mnemonic              Instruction                       Operation
SMIN8 rt, ra, rb      8-bit Signed Minimum              rt.Bx = (ra.Bx < rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
UMIN8 rt, ra, rb      8-bit Unsigned Minimum            rt.Bx = (ra.Bx u< rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
SMAX8 rt, ra, rb      8-bit Signed Maximum              rt.Bx = (ra.Bx > rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
UMAX8 rt, ra, rb      8-bit Unsigned Maximum            rt.Bx = (ra.Bx u> rb.Bx)? ra.Bx : rb.Bx; (x=3..0)
KABS8 rt, ra          8-bit Absolute Value              rt.Bx = SAT.Q7(ABS(ra.Bx)); (x=3..0)
2.2.8. 8-bit Unpacking Instructions
There are 8 instructions here.
Table 8. 8-bit Unpacking Instructions
Mnemonic              Instruction                       Operation
SUNPKD810 rt, ra      Signed Unpacking Bytes 1 & 0      rt.Hx = SE16(ra.By); (x,y) = (1,1), (0,0)
SUNPKD820 rt, ra      Signed Unpacking Bytes 2 & 0      rt.Hx = SE16(ra.By); (x,y) = (1,2), (0,0)
SUNPKD830 rt, ra      Signed Unpacking Bytes 3 & 0      rt.Hx = SE16(ra.By); (x,y) = (1,3), (0,0)
SUNPKD831 rt, ra      Signed Unpacking Bytes 3 & 1      rt.Hx = SE16(ra.By); (x,y) = (1,3), (0,1)
ZUNPKD810 rt, ra      Unsigned Unpacking Bytes 1 & 0    rt.Hx = ZE16(ra.By); (x,y) = (1,1), (0,0)
ZUNPKD820 rt, ra      Unsigned Unpacking Bytes 2 & 0    rt.Hx = ZE16(ra.By); (x,y) = (1,2), (0,0)
ZUNPKD830 rt, ra      Unsigned Unpacking Bytes 3 & 0    rt.Hx = ZE16(ra.By); (x,y) = (1,3), (0,0)
ZUNPKD831 rt, ra      Unsigned Unpacking Bytes 3 & 1    rt.Hx = ZE16(ra.By); (x,y) = (1,3), (0,1)
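The unpacking operations differ only in which two source bytes are selected and how they are extended. A minimal C sketch, with hypothetical helper names and the byte indices passed as parameters rather than encoded in the mnemonic, could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Return byte y of r, i.e. r.By in the shorthand notation. */
static uint8_t byte_of(uint32_t r, unsigned y)
{
    assert(y <= 3);
    return (uint8_t)(r >> (8 * y));
}

/* SUNPKD8<y1><y0>: rt.H1 = SE16(ra.B<y1>); rt.H0 = SE16(ra.B<y0>). */
static uint32_t sunpkd8(uint32_t ra, unsigned y1, unsigned y0)
{
    uint16_t h1 = (uint16_t)(int16_t)(int8_t)byte_of(ra, y1);
    uint16_t h0 = (uint16_t)(int16_t)(int8_t)byte_of(ra, y0);
    return ((uint32_t)h1 << 16) | h0;
}

/* ZUNPKD8<y1><y0>: rt.H1 = ZE16(ra.B<y1>); rt.H0 = ZE16(ra.B<y0>). */
static uint32_t zunpkd8(uint32_t ra, unsigned y1, unsigned y0)
{
    return ((uint32_t)byte_of(ra, y1) << 16) | byte_of(ra, y0);
}
```

Under this sketch, `sunpkd8(ra, 1, 0)` corresponds to SUNPKD810 and `zunpkd8(ra, 3, 1)` to ZUNPKD831.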
2.3. Non-SIMD Data Processing Instructions
2.3.1. 32-bit Addition/Subtraction Instructions
There are 4 instructions here.
Table 9. 32-bit Add/Sub Instructions
Mnemonic              Instruction                             Operation
RADDW rt, ra, rb      32-bit Signed Halving Addition          rt = (ra + rb) s>> 1
URADDW rt, ra, rb     32-bit Unsigned Halving Addition        rt = (ra + rb) u>> 1
RSUBW rt, ra, rb      32-bit Signed Halving Subtraction       rt = (ra - rb) s>> 1
URSUBW rt, ra, rb     32-bit Unsigned Halving Subtraction     rt = (ra - rb) u>> 1
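Halving keeps the carry that a plain 32-bit add would lose, which is why the result can never overflow. The C sketch below models RADDW, URADDW and RSUBW with a 64-bit intermediate; the function names are hypothetical, and URSUBW is left out because the table does not say how a negative difference is treated.

```c
#include <stdint.h>

/*
 * RADDW: rt = (ra + rb) s>> 1, computed in 64 bits so the carry survives.
 * Relies on >> of a negative signed value being an arithmetic shift, which
 * is implementation-defined in C but holds on mainstream compilers.
 */
static int32_t raddw(int32_t ra, int32_t rb)
{
    return (int32_t)(((int64_t)ra + rb) >> 1);
}

/* URADDW: rt = (ra + rb) u>> 1 with a 33-bit intermediate sum. */
static uint32_t uraddw(uint32_t ra, uint32_t rb)
{
    return (uint32_t)(((uint64_t)ra + rb) >> 1);
}

/* RSUBW: rt = (ra - rb) s>> 1, again using a wider intermediate. */
static int32_t rsubw(int32_t ra, int32_t rb)
{
    return (int32_t)(((int64_t)ra - rb) >> 1);
}
```

For instance, averaging INT32_MAX with itself through `raddw` returns INT32_MAX, whereas a 32-bit add followed by a shift would have overflowed first.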
2.3.2. 32-bit Shift Instructions
There are 5 instructions here.
Table 10. 32-bit Shift Instructions
Mnemonic               Instruction                                  Operation
SRA.u rt, ra, rb       Rounding Shift Right Arithmetic              rt = RUND(ra s>> rb[4:0])
SRAI.u rt, ra, imm5u   Rounding Shift Right Arithmetic Immediate    rt = RUND(ra s>> imm5u)
KSLL rt, ra, rb        Saturating Shift Left Logical
KSLLI rt, ra, imm5u    Saturating Shift Left Logical Immediate
rt = SAT.Q31(ra SAT.U5(-rb[7:0]));
Rounding Shift Right Arithmetic
if (rb[7:0] > 0) rt = SAT.Q31(ra 2 imm5u -1) {
UCLIP32 Rd, Rs1, imm5u
Clip Value
Rd=2 imm5u -1; OV=1; } else if (Rs1 < 0) { Rd=0; OV=1; } } else { Rd=Rs1; }
SCLIP32 Rd, Rs1, imm5u
Clip Value Signed
If (Rs1 > 2 imm5u -1) { Rd=2 imm5u -1; OV=1;
The information contained herein is the exclusive property of Andes Technology Co. and shall not be distributed, reproduced, or disclosed in whole or in part without prior written permission of Andes Technology Corporation. P_DSP_ISA_EXT_Summary.docx
Page 20
Instruction Summary for "P" ISA Extension Proposal Mnemonic
Instruction
Operation } else if (Rs1 < -2 imm5u ) { Rd=-2 imm5u; OV=1; } else { Rd=Rs1; } Rd =
CLZ Rd, Rs1
Count leading zero
CLO Rd, Rs1
Count leading one
MAX Rd, Rs1, Rs2
Return the larger signed value
Rd = signed-max(Rs1, Rs2)
MIN Rd, Rs1, Rs2
Return the smaller signed value
Rd = signed-min (Rs1, Rs2)
AVE Rd, Rs1, Rs1
Average two signed integers with rounding
COUNT_ZERO_FROM_MSB(Rs1) Rd = COUNT_ONE_FROM_MSB(Rs1)
Rd = (Rs1 + Rs2 + 1) (arith) >> 1 a = ABS(Rs1(7,0) – Rs2(7,0));
PBSAD Rd, Rs1, Rs1
Parallel Byte Sum of Absolute Difference
b = ABS(Rs1(15,8) – Rs2(15,8)); c = ABS(Rs1(23,16) – Rs2(23,16)); d = ABS(Rs1(31,24) – Rs2(31,24)); Rd = a + b + c + d; a = ABS(Rs1(7,0) – Rs2(7,0));
PBSADA Rd, Rs1, Rs1
Parallel Byte Sum of Absolute Difference Accumulate
b = ABS(Rs1(15,8) – Rs2(15,8)); c = ABS(Rs1(23,16) – Rs2(23,16)); d = ABS(Rs1(31,24) – Rs2(31,24)); Rd = Rd + a + b + c + d;
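The clip and byte sum-of-absolute-difference rows above can be sketched in C as follows. This is a hedged illustration rather than a reference model: the function names and the ov_flag variable are hypothetical, and the sketch assumes that the bytes compared by PBSAD are unsigned, which the table does not state.

```c
#include <stdint.h>

static int ov_flag = 0; /* hypothetical stand-in for the OV status bit */

/* UCLIP32: clamp a signed value into [0, 2^imm5u - 1]. */
static int32_t uclip32(int32_t rs1, unsigned imm5u) {
    int32_t hi = (int32_t)((UINT32_C(1) << (imm5u & 31u)) - 1u);
    if (rs1 > hi) { ov_flag = 1; return hi; }
    if (rs1 < 0)  { ov_flag = 1; return 0; }
    return rs1;
}

/* SCLIP32: clamp a signed value into [-2^imm5u, 2^imm5u - 1]. */
static int32_t sclip32(int32_t rs1, unsigned imm5u) {
    int32_t hi = (int32_t)((UINT32_C(1) << (imm5u & 31u)) - 1u);
    int32_t lo = -hi - 1;
    if (rs1 > hi) { ov_flag = 1; return hi; }
    if (rs1 < lo) { ov_flag = 1; return lo; }
    return rs1;
}

/* PBSAD: sum of |byte difference| over the four byte lanes.
   Assumption: each lane is treated as an unsigned 8-bit value. */
static uint32_t pbsad(uint32_t rs1, uint32_t rs2) {
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        int a = (int)((rs1 >> (8 * i)) & 0xFFu);
        int b = (int)((rs2 >> (8 * i)) & 0xFFu);
        sum += (uint32_t)(a > b ? a - b : b - a);
    }
    return sum;
}

/* PBSADA: same as PBSAD, accumulated into the previous Rd value. */
static uint32_t pbsada(uint32_t rd, uint32_t rs1, uint32_t rs2) {
    return rd + pbsad(rs1, rs2);
}
```

With imm5u = 8, uclip32 limits values to the range 0 to 255, which matches the usual 8-bit pixel clamp.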
2.3.9.
Q31 saturation Instructions
The following table lists instructions related to Q31 arithmetic.
Table 17. Q31 saturation ALU Instructions
Mnemonic
Instruction
Operation
KADDW Rt, Ra, Rb
Add with Q31 saturation.
Rt = SAT.Q31(Ra + Rb)
KSUBW Rt, Ra, Rb
Subtract with Q31 saturation.
Rt = SAT.Q31(Ra - Rb)
KSLRAW Rt, Ra, Rb
Logical left shift or arithmetic right shift with Q31 saturation.
(Rb[7:0] >= 0) ? Rt = SAT.Q31(Ra << Rb[7:0]) : Rt = Ra s>> -Rb[7:0]
KDMBB Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in bottom parts of two registers.
Rt = SAT.Q31(Ra.H0 * Rb.H0)
KDMTB Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in top and bottom parts of two registers.
Rt = SAT.Q31(Ra.H1 * Rb.H0)
KDMBT Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in bottom and top parts of two registers.
Rt = SAT.Q31(Ra.H0 * Rb.H1)
KDMTT Rt, Ra, Rb
Multiply Q15 numbers with Q31 saturation in top parts of two registers.
Rt = SAT.Q31(Ra.H1 * Rb.H1)
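As an illustration of the SAT.Q31 operator used in this table, the following C sketch models KADDW and KSUBW. The helper names are hypothetical, and the sketch assumes SAT.Q31 clamps to the signed 32-bit range [-2^31, 2^31 - 1].

```c
#include <stdint.h>

/* SAT.Q31: clamp a wide intermediate to the signed 32-bit range. */
static int32_t sat_q31(int64_t x) {
    if (x > INT32_MAX) return INT32_MAX;
    if (x < INT32_MIN) return INT32_MIN;
    return (int32_t)x;
}

/* KADDW: Rt = SAT.Q31(Ra + Rb), with the sum formed at 64 bits. */
static int32_t kaddw(int32_t ra, int32_t rb) {
    return sat_q31((int64_t)ra + (int64_t)rb);
}

/* KSUBW: Rt = SAT.Q31(Ra - Rb), with the difference formed at 64 bits. */
static int32_t ksubw(int32_t ra, int32_t rb) {
    return sat_q31((int64_t)ra - (int64_t)rb);
}
```

Unlike a wrapping add, kaddw applied to the largest positive value and 1 stays at the largest positive value instead of flipping sign.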
2.3.10.
Q15 saturation instructions
The following table lists instructions related to Q15 arithmetic.
Table 18. Q15 saturation ALU Instructions
Mnemonic
Instruction
Operation
KADDH Rt, Ra, Rb
Add with Q15 saturation.
Rt = SAT.Q15(Ra + Rb)
KSUBH Rt, Ra, Rb
Subtract with Q15 saturation.
Rt = SAT.Q15(Ra - Rb)
KHMBB Rt, Ra, Rb
Multiply Q15 numbers in bottom parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H0 * Rb.H0) s>> 15)
KHMTB Rt, Ra, Rb
Multiply Q15 numbers in top and bottom parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H1 * Rb.H0) s>> 15)
KHMBT Rt, Ra, Rb
Multiply Q15 numbers in bottom and top parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H0 * Rb.H1) s>> 15)
KHMTT Rt, Ra, Rb
Multiply Q15 numbers in top parts of two registers and extract high part with Q15 saturation.
Rt = SAT.Q15((Ra.H1 * Rb.H1) s>> 15)
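The Q15 operations in Table 18 can be sketched in C as shown below. The names are hypothetical; the sketch assumes that H0 and H1 denote the low and high 16-bit halves of a 32-bit register, read as signed values, and that the saturated result is returned in a 32-bit integer.

```c
#include <stdint.h>

/* SAT.Q15: clamp to the signed 16-bit range. */
static int32_t sat_q15(int64_t x) {
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int32_t)x;
}

/* Signed half-word selectors (assumed meaning of .H0 and .H1).
   The unsigned-to-int16_t conversion is implementation-defined in C;
   mainstream compilers wrap modulo 2^16. */
static int16_t h0(uint32_t r) { return (int16_t)(r & 0xFFFFu); }
static int16_t h1(uint32_t r) { return (int16_t)(r >> 16); }

/* Common core: Q15 x Q15 product, shifted back to Q15 and saturated.
   Assumes >> on a negative value is an arithmetic shift. */
static int32_t khm(int16_t a, int16_t b) {
    return sat_q15(((int32_t)a * (int32_t)b) >> 15);
}

static int32_t khmbb(uint32_t ra, uint32_t rb) { return khm(h0(ra), h0(rb)); }
static int32_t khmtb(uint32_t ra, uint32_t rb) { return khm(h1(ra), h0(rb)); }
static int32_t khmbt(uint32_t ra, uint32_t rb) { return khm(h0(ra), h1(rb)); }
static int32_t khmtt(uint32_t ra, uint32_t rb) { return khm(h1(ra), h1(rb)); }

/* KADDH: Rt = SAT.Q15(Ra + Rb). */
static int32_t kaddh(int32_t ra, int32_t rb) {
    return sat_q15((int64_t)ra + (int64_t)rb);
}
```

The only multiplication that saturates is -1.0 times -1.0 (0x8000 times 0x8000), whose exact result +1.0 is not representable in Q15.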
2.3.11.
Overflow status manipulation instructions
The following table lists the user instructions related to Overflow (OV) flag manipulation.
Table 19. OV (Overflow) flag Set/Clear Instructions
Mnemonic
Instruction
Operation
RDOV Rt
Read mxstatus.OV to Rt.
Rt = ZE32(mxstatus.OV)
CLROV
Clear mxstatus.OV flag.
mxstatus.OV = 0
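A typical use of RDOV and CLROV is to clear the flag, run a sequence of instructions that may set it, and then read it once. The C sketch below models that pattern. It is an assumption-laden illustration: the mxstatus_t type and all function names are hypothetical, and it assumes that OV stays set until CLROV clears it, which the table implies but does not state.

```c
#include <stdint.h>

/* Hypothetical software model of the status register's OV bit. */
typedef struct { unsigned ov; } mxstatus_t;

/* RDOV: Rt = ZE32(mxstatus.OV) */
static uint32_t rdov(const mxstatus_t *s) { return (uint32_t)(s->ov & 1u); }

/* CLROV: mxstatus.OV = 0 */
static void clrov(mxstatus_t *s) { s->ov = 0; }

/* An OV-setting operation, modeled on the unsigned clip instruction. */
static int32_t uclip32_ov(mxstatus_t *s, int32_t rs1, unsigned imm5u) {
    int32_t hi = (int32_t)((UINT32_C(1) << (imm5u & 31u)) - 1u);
    if (rs1 > hi) { s->ov = 1; return hi; }
    if (rs1 < 0)  { s->ov = 1; return 0; }
    return rs1;
}

/* Clip a whole buffer, then report whether any element was clipped. */
static uint32_t clip_buffer(mxstatus_t *s, int32_t *buf, int n, unsigned imm5u) {
    clrov(s);                                  /* start from a known state */
    for (int i = 0; i < n; i++)
        buf[i] = uclip32_ov(s, buf[i], imm5u); /* may set OV */
    return rdov(s);                            /* one read covers the loop */
}
```

Checking the flag once after the loop avoids a per-element branch, which is the usual motivation for a flag of this kind.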
2.4.
64-bit Instructions
2.4.1.
64-bit Addition & Subtraction Instructions
Table 20. 64-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
ADD64 rt, ra, rb
64-bit Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = a64 + b64; r[tU].r[tL] = t64;
RADD64 rt, ra, rb
64-bit Signed Halving Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 + b64) s>>1; r[tU].r[tL] = t64;
URADD64 rt, ra, rb
64-bit Unsigned Halving Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 + b64) u>>1; r[tU].r[tL] = t64;
KADD64 rt, ra, rb
64-bit Signed Saturating Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.Q63(a64 + b64); r[tU].r[tL] = t64;
UKADD64 rt, ra, rb
64-bit Unsigned Saturating Addition
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.U64(a64 + b64); r[tU].r[tL] = t64;
SUB64 rt, ra, rb
64-bit Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = a64 - b64; r[tU].r[tL] = t64;
RSUB64 rt, ra, rb
64-bit Signed Halving Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 - b64) s>>1; r[tU].r[tL] = t64;
URSUB64 rt, ra, rb
64-bit Unsigned Halving Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = (a64 - b64) u>>1; r[tU].r[tL] = t64;
KSUB64 rt, ra, rb
64-bit Signed Saturating Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.Q63(a64 - b64); r[tU].r[tL] = t64;
UKSUB64 rt, ra, rb
64-bit Unsigned Saturating Subtraction
a64 = r[aU].r[aL]; b64 = r[bU].r[bL]; t64 = SAT.U64(a64 - b64); r[tU].r[tL] = t64;
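The 64-bit operations above act on a value held in a pair of 32-bit registers, written r[xU].r[xL]. The following C sketch shows one way to model the pairing, the saturating forms, and the halving addition without a type wider than 64 bits. All names are hypothetical. It assumes that SAT.U64 clamps a negative difference to 0 and that the halving forms keep the carry bit of the sum before shifting.

```c
#include <stdint.h>

/* a64 = r[aU].r[aL]: concatenate the upper and lower registers. */
static uint64_t pair_to_u64(uint32_t upper, uint32_t lower) {
    return ((uint64_t)upper << 32) | (uint64_t)lower;
}

/* KADD64: SAT.Q63(a64 + b64). Overflow is tested before adding,
   because signed overflow is undefined behavior in C. */
static int64_t kadd64(int64_t a, int64_t b) {
    if (b > 0 && a > INT64_MAX - b) return INT64_MAX;
    if (b < 0 && a < INT64_MIN - b) return INT64_MIN;
    return a + b;
}

/* KSUB64: SAT.Q63(a64 - b64). */
static int64_t ksub64(int64_t a, int64_t b) {
    if (b < 0 && a > INT64_MAX + b) return INT64_MAX;
    if (b > 0 && a < INT64_MIN + b) return INT64_MIN;
    return a - b;
}

/* UKADD64: unsigned wrap-around shows up as a sum smaller than an input. */
static uint64_t ukadd64(uint64_t a, uint64_t b) {
    uint64_t s = a + b;
    return (s < a) ? UINT64_MAX : s;
}

/* UKSUB64: assumption, a negative difference saturates to 0. */
static uint64_t uksub64(uint64_t a, uint64_t b) {
    return (a < b) ? 0 : a - b;
}

/* RADD64: (a + b) s>> 1 computed without a 65-bit intermediate.
   Assumes >> on a negative value is an arithmetic shift. */
static int64_t radd64(int64_t a, int64_t b) {
    return (a >> 1) + (b >> 1) + (a & b & 1);
}

/* URADD64: the same identity with logical shifts. */
static uint64_t uradd64(uint64_t a, uint64_t b) {
    return (a >> 1) + (b >> 1) + (a & b & 1u);
}
```

The halving identity works because each operand contributes its halved value, and the two discarded low bits add one to the result only when both are set.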
2.4.2.
32-bit Multiply with 64-bit Add/Subtract Instructions
Table 21. 32-bit Multiply 64-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
SMAR64 rt, ra, rb
32x32 with 64-bit Signed Addition
c64 = r[tU].r[tL]; t64 = c64 + ra*rb; // signed
r[tU].r[tL] = t64;
SMSR64 rt, ra, rb
32x32 with 64-bit Signed Subtraction
c64 = r[tU].r[tL]; t64 = c64 - ra*rb; // signed
r[tU].r[tL] = t64;
UMAR64 rt, ra, rb
32x32 with 64-bit Unsigned Addition
c64 = r[tU].r[tL]; t64 = c64 + ra*rb; // unsigned
r[tU].r[tL] = t64;
UMSR64 rt, ra, rb
32x32 with 64-bit Unsigned Subtraction
c64 = r[tU].r[tL]; t64 = c64 - ra*rb; // unsigned
r[tU].r[tL] = t64;
KMAR64 rt, ra, rb
32x32 with Saturating 64-bit Signed Addition
c64 = r[tU].r[tL]; t64 = SAT.Q63(c64 + ra*rb); r[tU].r[tL] = t64;
KMSR64 rt, ra, rb
32x32 with Saturating 64-bit Signed Subtraction
c64 = r[tU].r[tL]; t64 = SAT.Q63(c64 - ra*rb); r[tU].r[tL] = t64;
UKMAR64 rt, ra, rb
32x32 with Saturating 64-bit Unsigned Addition
c64 = r[tU].r[tL]; t64 = SAT.U64(c64 + ra*rb); r[tU].r[tL] = t64;
UKMSR64 rt, ra, rb
32x32 with Saturating 64-bit Unsigned Subtraction
c64 = r[tU].r[tL]; t64 = SAT.U64(c64 - ra*rb); r[tU].r[tL] = t64;
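The multiply-accumulate forms in Table 21 differ only in signedness and in whether the 64-bit accumulation saturates. The C sketch below models four of them. The function names are hypothetical, and the sketch assumes that the non-saturating forms wrap modulo 2^64, which the table does not state.

```c
#include <stdint.h>

/* SMAR64: t64 = c64 + ra*rb (signed). The add is done in unsigned
   arithmetic so that wrap-around is well defined in C; the final
   conversion back to int64_t is implementation-defined and wraps
   on mainstream compilers. */
static int64_t smar64(int64_t c64, int32_t ra, int32_t rb) {
    int64_t p = (int64_t)ra * (int64_t)rb;   /* |p| <= 2^62, always fits */
    return (int64_t)((uint64_t)c64 + (uint64_t)p);
}

/* KMAR64: t64 = SAT.Q63(c64 + ra*rb). */
static int64_t kmar64(int64_t c64, int32_t ra, int32_t rb) {
    int64_t p = (int64_t)ra * (int64_t)rb;
    if (p > 0 && c64 > INT64_MAX - p) return INT64_MAX;
    if (p < 0 && c64 < INT64_MIN - p) return INT64_MIN;
    return c64 + p;
}

/* UMAR64: t64 = c64 + ra*rb (unsigned, wrapping). */
static uint64_t umar64(uint64_t c64, uint32_t ra, uint32_t rb) {
    return c64 + (uint64_t)ra * (uint64_t)rb;
}

/* UKMAR64: t64 = SAT.U64(c64 + ra*rb). */
static uint64_t ukmar64(uint64_t c64, uint32_t ra, uint32_t rb) {
    uint64_t s = c64 + (uint64_t)ra * (uint64_t)rb;
    return (s < c64) ? UINT64_MAX : s;
}
```

Each operand must be widened before the multiply; multiplying two 32-bit values first and widening afterwards would discard the upper half of the product.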
2.4.3.
Signed 16-bit Multiply with 64-bit Add/Subtract Instructions
Table 22. Signed 16-bit Multiply 64-bit Add/Subtract Instructions
Mnemonic
Instruction
Operation
SMALBB rt, ra, rb
“Bottom 16 x Bottom 16” with 64-bit Signed Addition (64 = 64 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.L; r[tU].r[tL] = t64;
SMALBT rt, ra, rb
“Bottom 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.H; r[tU].r[tL] = t64;
SMALTT rt, ra, rb
“Top 16 x Top 16” with 64-bit Signed Addition (64 = 64 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H; r[tU].r[tL] = t64;
SMALDA rt, ra, rb
Two “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H + ra.L*rb.L; r[tU].r[tL] = t64;
SMALXDA rt, ra, rb
Two Crossed “16x16” with 64-bit Signed Double Addition (64 = 64 + 16x16 + 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.L + ra.L*rb.H; r[tU].r[tL] = t64;
SMALDS rt, ra, rb
Two “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.H - ra.L*rb.L; r[tU].r[tL] = t64;
SMALDRS rt, ra, rb
Two “16x16” with 64-bit Signed Addition and Reversed Subtraction (64 = 64 + 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.L*rb.L - ra.H*rb.H; r[tU].r[tL] = t64;
SMALXDS rt, ra, rb
Two Crossed “16x16” with 64-bit Signed Addition and Subtraction (64 = 64 + 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 + ra.H*rb.L - ra.L*rb.H; r[tU].r[tL] = t64;
SMSLDA rt, ra, rb
Two “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 - ra.H*rb.H - ra.L*rb.L; r[tU].r[tL] = t64;
SMSLXDA rt, ra, rb
Two Crossed “16x16” with 64-bit Signed Double Subtraction (64 = 64 - 16x16 - 16x16)
c64 = r[tU].r[tL]; t64 = c64 - ra.H*rb.L - ra.L*rb.H; r[tU].r[tL] = t64;
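The dual 16x16 forms in Table 22 map naturally onto a dot product over packed 16-bit samples. The C sketch below models SMALDA, SMALXDA, and SMALDS and uses SMALDA in a loop. All names, including pack16 and dot16, are hypothetical, and the sketch assumes that .H and .L denote the signed upper and lower 16-bit halves of a 32-bit register.

```c
#include <stdint.h>

/* Signed half-word selectors (assumed meaning of .L and .H).
   The unsigned-to-int16_t conversion is implementation-defined in C;
   mainstream compilers wrap modulo 2^16. */
static int16_t lo16(uint32_t r) { return (int16_t)(r & 0xFFFFu); }
static int16_t hi16(uint32_t r) { return (int16_t)(r >> 16); }

/* Pack two signed 16-bit values into one 32-bit register image. */
static uint32_t pack16(int16_t hi, int16_t lo) {
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* SMALDA: t64 = c64 + ra.H*rb.H + ra.L*rb.L */
static int64_t smalda(int64_t c64, uint32_t ra, uint32_t rb) {
    int64_t hh = (int64_t)hi16(ra) * hi16(rb);
    int64_t ll = (int64_t)lo16(ra) * lo16(rb);
    return c64 + hh + ll;
}

/* SMALXDA: t64 = c64 + ra.H*rb.L + ra.L*rb.H */
static int64_t smalxda(int64_t c64, uint32_t ra, uint32_t rb) {
    int64_t hl = (int64_t)hi16(ra) * lo16(rb);
    int64_t lh = (int64_t)lo16(ra) * hi16(rb);
    return c64 + hl + lh;
}

/* SMALDS: t64 = c64 + ra.H*rb.H - ra.L*rb.L */
static int64_t smalds(int64_t c64, uint32_t ra, uint32_t rb) {
    int64_t hh = (int64_t)hi16(ra) * hi16(rb);
    int64_t ll = (int64_t)lo16(ra) * lo16(rb);
    return c64 + hh - ll;
}

/* Dot product of 2*n 16-bit samples, two samples per register. */
static int64_t dot16(const uint32_t *x, const uint32_t *y, int n) {
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = smalda(acc, x[i], y[i]);
    return acc;
}
```

If each register instead holds a complex sample with the real part in the upper half and the imaginary part in the lower half, the SMALDS pattern yields the real part of a complex product and the SMALXDA pattern yields the imaginary part.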
2.5.
Zero-Overhead Loop (ZOL) Mechanism Instructions
The following table lists the instructions in the Zero-Overhead Loop Mechanism. Table 23. ZOL Mechanism Instructions Mnemonic MTLBI imm16s MTLEI imm16s
Instruction Move to Loop Begin register Immediate. Move to Loop End register Immediate.
Operation LB = PC + SE32(imm16s