US00RE43729E

(19) United States
(12) Reissued Patent
     Morikawa et al.

(10) Patent Number: US RE43,729 E
(45) Date of Reissued Patent: Oct. 9, 2012

(54) PROCESSOR WHICH CAN FAVORABLY EXECUTE A ROUNDING PROCESS COMPOSED OF POSITIVE CONVERSION AND SATURATED CALCULATION PROCESSING

(75) Inventors: Toru Morikawa, Kadoma (JP); Nobuo Higaki, Kadoma (JP); Akira Miyoshi, Kadoma (JP); Keizo Sumida, Kadoma (JP)

(73) Assignee: Panasonic Corporation, Osaka (JP)

(21) Appl. No.: 13/092,453

(22) Filed: Apr. 22, 2011

Related U.S. Patent Documents

Reissue of:
(64) Patent No.: 6,237,084
     Issued: May 22, 2001
     Appl. No.: 09/399,577
     Filed: Sep. 20, 1999

U.S. Applications:
(62) Division of application No. 11/016,920, filed on Dec. 21, 2004, now Pat. No. Re. 43,145, which is a division of application No. 10/366,502, filed on Feb. 13, 2003, now Pat. No. Re. 39,121, which is a division of application No. 08/980,676, filed on Dec. 1, 1997, now Pat. No. 5,974,540.

(30) Foreign Application Priority Data
     Nov. 29, 1996 (JP) 8-320423

(51) Int. Cl. G06F 9/302 (2006.01)
(52) U.S. Cl. 708/551; 708/552
(58) Field of Classification Search 708/550, 708/551, 552, 203, 204, 208
     See application file for complete search history.

Primary Examiner: Richard Ellis
(74) Attorney, Agent, or Firm: McDermott Will & Emery LLP

(57) ABSTRACT

A processor which executes positive conversion processing, which converts coded data into uncoded data, and saturation calculation processing, which rounds a value to an appropriate number of bits, at high speed. When a positive conversion saturation calculation instruction "MCSST D1" is decoded, the sum-product result register 6 outputs its held value to the path P1. The comparator 22 compares the magnitude of the held value of the sum-product result register 6 with the coded 32-bit integer "0x0000_00FF". The polarity judging unit 23 judges whether the eighth bit of the value held by the sum-product result register 6 is "ON". The multiplexer 24 outputs one of the maximum value "0x0000_00FF" generated by the constant generator 21, the zero value "0x0000_0000" generated by the zero generator 25, and the held value of the sum-product result register 6 to the data bus 18.

2 Claims, 17 Drawing Sheets

[Front-page drawing: POSITIVE CONVERSION SATURATION CALCULATION CIRCUIT, with REGISTER FILE, MCR, COMPARATOR 22, ZERO GENERATOR 25, CONSTANT GENERATOR 21, and MULTIPLEXER 24 connected to data bus 18]


U.S. Patent    Oct. 9, 2012    Sheet 1 of 17    US RE43,729 E

FIG. 1 (PRIOR ART): ARITHMETIC LOGIC UNIT 61; SUM-PRODUCT RESULT REGISTER 62


FIG. 6 MACCB INSTRUCTION

MULTIPLIER READ ADDRESS INDICATION
    11 .... MCR
    00 .... REGISTER D0
    01 .... REGISTER D1
    10 .... REGISTER D2

MULTIPLICAND READ ADDRESS INDICATION
    11 .... MCR
    00 .... REGISTER D0
    01 .... REGISTER D1
    10 .... REGISTER D2

INDICATION OF CONTENT OF ELEMENTAL OPERATION
    1 .... MULTIPLICATION
    0 .... NONE

INDICATION OF CALCULATED CONTENT OF ALGEBRAIC SUM
    1 .... ADDITION
    0 .... NONE

INDICATION OF STORAGE ADDRESS FOR SUM-PRODUCT RESULT
    1 .... MCR
    0 .... NONE

FIG. 7 MCSST INSTRUCTION

POSITIVE CONVERSION SATURATION CALCULATION WIDTH INDICATION
    00 .... 24bit POSITIVE CONVERSION
    01 .... 16bit POSITIVE CONVERSION
    11 .... 8bit POSITIVE CONVERSION

STORAGE ADDRESS INDICATION
    00 .... REGISTER D0
    01 .... REGISTER D1
    10 .... REGISTER D2
    11 .... REGISTER D3
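The width field of FIG. 7 can be read as a small lookup. The following is a minimal sketch in C, not the patented decoder: the function names `mcsst_width_bits` and `max_for_width` are hypothetical, and the assumption that the saturation maximum for a width of n bits is 2^n - 1 is confirmed by the source only for the 8-bit case ("0x0000_00FF").

```c
#include <stdint.h>

/* Hypothetical helper: map the 2-bit width field of FIG. 7 to a bit count.
 * Returns 0 for the encoding 10, which the figure does not list. */
unsigned mcsst_width_bits(unsigned field)
{
    switch (field & 0x3u) {
    case 0x0u: return 24;   /* 00: 24-bit positive conversion */
    case 0x1u: return 16;   /* 01: 16-bit positive conversion */
    case 0x3u: return 8;    /* 11: 8-bit positive conversion  */
    default:   return 0;    /* 10: not listed in the figure   */
    }
}

/* Assumed saturation maximum for a given width: all `bits` low bits set.
 * Valid for bits in the range 1..31. */
uint32_t max_for_width(unsigned bits)
{
    return (UINT32_C(1) << bits) - 1u;
}
```

For the 8-bit encoding this yields 0x000000FF, which matches the constant named in the abstract; the 16-bit and 24-bit maxima are extrapolations.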

FIG. 8A [bit-layout diagram; legible labels: CODE BIT, MULTIPLIER, MULTIPLICAND]

FIG. 9

LOGIC VALUE X   LOGIC VALUE Y   SELECTED INPUT VALUE
1               0               0x0000_00FF
1               1               0x0000_0000
0               1               0x0000_0000
0               0               STORED VALUE OF SUM-PRODUCT RESULT REGISTER
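The selection in FIG. 9 can be sketched as a small C function. This is an illustrative model under stated assumptions, not the circuit itself: X is taken to be the comparator output (held value exceeds 0x0000_00FF, compared as an unsigned magnitude), Y is taken to be the polarity output (most significant bit of the 32-bit held value set), and the names `mux_select` and `positive_saturate8` are hypothetical.

```c
#include <stdint.h>

/* Select the multiplexer output from the two control signals of FIG. 9.
 * x: assumed comparator result (held value exceeds 0x0000_00FF).
 * y: assumed polarity result (held value is negative). */
uint32_t mux_select(int x, int y, uint32_t held)
{
    if (y) return UINT32_C(0x00000000);  /* rows (X,Y) = (1,1) and (0,1) */
    if (x) return UINT32_C(0x000000FF);  /* row  (X,Y) = (1,0)           */
    return held;                         /* row  (X,Y) = (0,0)           */
}

/* Derive X and Y from a 32-bit held value and apply the selection. */
uint32_t positive_saturate8(uint32_t held)
{
    int y = (int)((held >> 31) & 1u);    /* assumption: polarity is bit 31 */
    int x = held > UINT32_C(0xFF);       /* assumption: unsigned compare   */
    return mux_select(x, y, held);
}
```

Under these assumptions a negative held value produces X = 1 and Y = 1, which the table maps to zero, so the polarity signal takes priority over the comparator.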

FIG. 10 EXAMPLE OPERATION: D0 × D1 (0x7f × 0x70)

MEMORY STORED VALUE: 0x7f, 0x70
REGISTER STORED VALUE: D0, D1 (32 bits each)
CODE EXTENSION CIRCUITS 4 and 5: 0x0000007f, 0x00000070
64-bit product: 0x0000000000003790
MCR 6, OUTPUT OF LOWER-ORDER 32 BITS: 0x00003790
POSITIVE CONVERSION SATURATION CALCULATION CIRCUIT: MSB = 0; 0x00003790 > 0x000000ff -> 0x000000ff
REGISTER STORED VALUE: D1 0x000000ff
MEMORY STORED VALUE: 0xff

FIG. 11 EXAMPLE OPERATION: D0 × D1 (0x7f × 0x80)

MEMORY STORED VALUE: 0x7f, 0x80
REGISTER STORED VALUE: D0, D1 (32 bits each)
CODE EXTENSION CIRCUITS 4 and 5: 0x0000007f, 0xffffff80
64-bit product: 0xffffffffffffc080
MCR 6, OUTPUT OF LOWER-ORDER 32 BITS: 0xffffc080
POSITIVE CONVERSION SATURATION CALCULATION CIRCUIT: MSB = 1 -> 0x00000000
REGISTER STORED VALUE: D1 0x00000000
MEMORY STORED VALUE: 0x00
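The two example operations of FIG. 10 and FIG. 11 can be reproduced numerically. The sketch below assumes that "code extension" means sign extension of an 8-bit memory value to 32 bits and that the result is clamped to the 8-bit unsigned range; the function names are hypothetical and do not appear in the patent.

```c
#include <stdint.h>

/* Sign-extend an 8-bit memory value to 32 bits (assumed meaning of the
 * "code extension" step in the figures). */
int32_t sign_extend8(uint8_t v)
{
    return (v & 0x80u) ? (int32_t)v - 256 : (int32_t)v;
}

/* Multiply two sign-extended operands into a 64-bit product and keep the
 * lower-order 32 bits, as the figures show for the MCR output. */
uint32_t product_low32(uint8_t a, uint8_t b)
{
    int64_t p = (int64_t)sign_extend8(a) * (int64_t)sign_extend8(b);
    return (uint32_t)(uint64_t)p;
}

/* Positive conversion followed by 8-bit saturation: negative values become
 * zero, values above 0xFF become 0xFF, all others pass through. */
uint8_t to_unsigned8(uint32_t low)
{
    if (low & UINT32_C(0x80000000)) return 0x00;
    if (low > UINT32_C(0xFF))       return 0xFF;
    return (uint8_t)low;
}
```

With these definitions, 0x7f × 0x70 gives a lower word of 0x00003790 and a stored byte of 0xff, and 0x7f × 0x80 gives 0xffffc080 and a stored byte of 0x00, in line with the two figures.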

FIG. 13 MCSST INSTRUCTION

POSITIVE CONVERSION SATURATION CALCULATION WIDTH INDICATION
    00 .... 24bit POSITIVE CONVERSION
    01 .... 16bit POSITIVE CONVERSION
    11 .... 8bit POSITIVE CONVERSION

STORAGE ADDRESS INDICATION
    11 .... MCR
    00 .... REGISTER D0
    01 .... REGISTER D1
    10 .... REGISTER D2

READ ADDRESS INDICATION
    11 .... MCR
    00 .... REGISTER D0
    01 .... REGISTER D1
    10 .... REGISTER D2

FIG. 16 MULBSST INSTRUCTION

MULTIPLIER READ ADDRESS INDICATION
    11 .... MCR
    00 .... REGISTER D0
    01 .... REGISTER D1
    10 .... REGISTER D2

MULTIPLICAND READ ADDRESS INDICATION
    11 .... MCR
    00 .... REGISTER D0
    01 .... REGISTER D1
    10 .... REGISTER D2

POSITIVE CONVERSION SATURATION CALCULATION WIDTH INDICATION
    01 .... 24bit POSITIVE VALUE
    10 .... 16bit POSITIVE VALUE
    11 .... 8bit POSITIVE VALUE

CALCULATION CONTENT INDICATION
    1 .... MULTIPLICATION
    0 .... NONE

PROCESSOR WHICH CAN FAVORABLY EXECUTE A ROUNDING PROCESS COMPOSED OF POSITIVE CONVERSION AND SATURATED CALCULATION PROCESSING

Matter enclosed in heavy brackets [ ] appears in the original patent but forms no part of this reissue specification; matter printed in italics indicates the additions made by reissue.

[This is a divisional application of U.S. Ser. No. 08/980,676 now U.S. Pat. No. 5,974,540 filed Dec. 1, 1997.] More than one reissue application has been filed for the reissue of U.S. Pat. No. 6,237,084. The reissue applications are the present application and application Ser. Nos. 10/366,502 (reissued as Re. 39,121 on Jun. 6, 2006) and 11/016,920 (reissued as Re. 43,145 on Jan. 24, 2012), all of which are divisional reissues of U.S. Pat. No. 6,237,084. This application is a divisional reissue of application Ser. No. 11/016,920, filed on Dec. 21, 2004, now U.S. Pat. No. Re. 43,145, which is a divisional reissue of application Ser. No. 10/366,502 filed Feb. 13, 2003, now U.S. Pat. No. Re. 39,121, which is a reissue of 09/399,577 filed on Sep. 20, 1999, now U.S. Pat. No. 6,237,084, which is a divisional of application Ser. No. 08/980,676 filed Dec. 1, 1997, now U.S. Pat. No. 5,974,540.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a processor that performs processing according to instruction sequences that are stored in a ROM or the like.

2. Background of the Invention

In recent years, there has been a visible increase in the use of application software that can interactively reproduce various kinds of data, such as video data, still image data, and audio data, that have been compressed according to techniques such as frame encoding, field encoding, or motion compensation. As such software has been developed, there has been increasing demand for multimedia-oriented processors that can efficiently execute the software. These multimedia-oriented processors are processors designed with a special architecture to facilitate programming, such as the compression and decompression of video and audio data. The high-speed processing required for handling video data is the matrix multiplication of compressed data that has N*N matrix elements with coefficient data that also has N*N matrix elements. Representative examples of compressed data that has N*N matrix elements are the luminescence block composed of 16*16 luminescence elements, the blue color difference block (Cb block) composed of 8*8 color difference elements, and the red color difference block (Cr block) composed of 8*8 color difference elements used in MPEG (Moving Pictures Experts Group) techniques. The matrix multiplication for compressed data referred to here is performed very frequently when executing the approximation calculations for an inverse DCT (Discrete Cosine Transform) in image compression methods such as MPEG and JPEG (Joint Photographic Experts Group).

The following is a description of conventional multimedia-oriented processors that can perform high-speed matrix multiplication. The basic architecture of conventional multimedia-oriented processors is provided with a sum-product result register (hereinafter simply referred to as an MCR register) as hardware, and is provided with an instruction set that includes a "MOV MCR, **" transfer instruction for transferring a sum-product value.

An example of the hardware construction of a conventional multimedia-oriented processor is shown in FIG. 1. As shown in FIG. 1, the arithmetic logic unit (hereinafter, "ALU") 61 performs the multiplication of an element Fij that forms part of the compressed data and an element Gji that forms part of the coefficient matrix in accordance with a multiplication instruction. The ALU 61 also reads the sum-product value stored in the sum-product result register 62, adds the multiplication result of Gji*Fij to the read sum-product value, and has the result of this addition stored in the sum-product result register 62. By repeating the above calculation, a sum-product value is accumulated in the sum-product result register 62. Once the multiplication has been performed a predetermined number of times, the programmer issues a sum-product value transfer instruction. By issuing a transfer instruction, the accumulated value in the sum-product result register 62 is transferred to the general registers, and is used as the matrix multiplication result for one row and one column. By performing N*N iterations of the above processing, the matrix multiplication of N*N compressed data and an N*N coefficient matrix can be completed.

When a conventional multimedia-oriented processor is used, however, positive correction saturation operations for amending the sum-product value pose many difficulties for programmers.

Positive conversion processing refers to the conversion of a sum-product value that is a negative value into either zero or a positive value. Normally, compressed data is expressed as a coded relative value that reflects the relation of the present value to the preceding and succeeding values. As a result, there are many cases when the sum of products for each element in the compressed data and the corresponding coefficients is a negative value. Most reproduction-related hardware, such as displays and speakers, however is only able to process uncoded data, so that when the sum-product values are to be reproduced, it is first necessary to perform positive conversion processing.

Saturation calculation processing refers to processing that sets all values that exceed a given range (or, in other words, which are "saturated") at a predetermined value. This is to say, when an element that includes an erroneous bit generated during transfer is used in a sum-product calculation as part of the sum-product processing for compressed data, there is an increase in the probability of the sum-product value exceeding a value that can be expressed by the stated number of bits. Since most reproduction-related hardware is only physically capable of reproducing uncoded data with a fixed valid number of bits, such as eight bits, saturation processing is required to convert the sum-product value into a value that can be expressed using the valid number of bits.

It has been conventional practice to perform this kind of positive value conversion processing and saturation calculation processing by converting the sum-product value using a subroutine that corrects the sum-product value. An example of a subroutine that corrects the sum-product value is explained below. In this example, the register width and the calculation width of the calculation unit are 32 bits, with the width of the MCR being 32 bits, and the sum-product value being expressed as a coded 16-bit integer. The data that can be handled by the reproduction-related hardware needs to be expressed using uncoded 8-bit integers. This subroutine is set as using the data register D0 for storing the calculation result. Each instruction is expressed using two operands, with the left and right operands being respectively called the first and