LOCAL LINEAR TRANSFORMATION FOR VOICE CONVERSION

Victor Popa1, Hanna Silen1, Jani Nurminen2 and Moncef Gabbouj1

1Department of Signal Processing, Tampere University of Technology, Tampere, Finland
2Nokia, Tampere, Finland

978-1-4673-0046-9/12/$26.00 ©2012 IEEE

ABSTRACT

Many popular approaches to spectral conversion involve linear transformations determined for particular acoustic classes, and compute the converted result as a linear combination of different local transformations in an attempt to ensure a continuous conversion. These methods often produce over-smoothed spectra and parameter tracks. The proposed method computes an individual linear transformation for every feature vector based on a small neighborhood in the acoustic space, thus preserving local details. The method effectively reduces over-smoothing by eliminating undesired contributions from acoustically remote regions. The method is evaluated in listening tests against the well-known Gaussian Mixture Model based conversion, representative of the class of methods involving linear transformations. Perceptual results indicate a clear preference for the proposed scheme.

Index Terms: Gaussian Mixture Model (GMM), Line Spectral Frequencies (LSF), Local Linear Transformation (LLT)

INTRODUCTION

Voice conversion is defined as a modification of the speech signal in which the perceived speaker identity is changed while the content and quality are preserved. Such a conversion involves the modification of prosodic properties as well as a transformation of the spectral features. As a core part of voice conversion, spectral transformation has received the most interest in this research area and is also the scope of this article.

A remarkable number of techniques for spectral conversion have been proposed in the literature. Codebook mapping [], GMM [], frequency warping [], artificial neural networks [], linear transformations [], bilinear models [], eigenvoices [], and maximum likelihood estimation of spectral parameter trajectory [] are some of the most representative. While dealing better with one or another issue associated with the conversion, all the above methods have their own limitations.

Many existing approaches involve linear conversion functions and typically suffer from two important drawbacks. One of them is related to the frame-based operation, in which the temporal continuity of the spectral features is ignored. The second issue is the so-called over-smoothing, characterized by an undesired smoothing of the parameter tracks and converted spectra. The combined effect of these drawbacks is poor speech quality.

The GMM-based approach is very popular and representative of the class of methods based on linear transformations. In GMM-based conversion, a linear transformation is trained for each Gaussian component and the result is computed as a weighted sum of local regression functions in an attempt to avoid sudden changes of the conversion function. In reality, a frame's decomposition is dominated by only one mixture component [], making the method susceptible to discontinuities. In addition, the GMM technique is also affected by over-smoothing.

In this paper we propose a spectral conversion scheme which trains an individual linear transformation for each feature vector. The method uses an underlying codebook trained from aligned data of the two speakers, and the linear transformation is computed on a selected set of codebook centers situated in the proximity of the input spectral vector in the acoustic space. By focusing on the local properties of the acoustic space, the proposed method is shown to effectively reduce over-smoothing. Our listening tests suggest that the proposed scheme is probably affected to a lesser degree by discontinuity artifacts than the GMM approach. While suffering serious limitations as a conversion method in itself, the codebook has the favorable property of good detail preservation, which benefits the proposed algorithm, where such limitations are avoided.

The article continues with a technical description of the proposed method in the section on the local linear transformation. Subjective listening test results and other experiments are presented and discussed in the experiments section. The article ends with conclusions and directions for future research.

LOCAL LINEAR TRANSFORMATION

The use of linear transformation for spectral conversion is not new. A considerable number of solutions based on linear transformation have been proposed in the literature.

ICASSP 2012

In [], the aligned spectral vectors of source and target speakers are first divided into a number of classes and a linear transformation is trained for each class. All the linear transformations contribute to the conversion of each source vector in the form of a weighted sum, where the weights represent the probabilities that the source vector belongs to the corresponding class. The GMM-based solution [] works in a similar way, using one linear transformation for each mixture component.

By analogy with [], which argues that linear combinations over large sets of curves are bound to produce averaged results and destroy characteristic details, we believe that allowing all the linear transformations to contribute to the conversion is likely to produce a similar averaging effect, equivalent to over-smoothing. Similar to Freeman et al., we believe it would be beneficial to restrict the number of linear transformations involved in conversion to only a few, corresponding to the most similar speech classes. In this paper we take this idea forward and propose a local regression approach where each source vector is converted with an individual linear transformation trained locally within the neighborhood of the input vector. This method can be seen in some sense as a trade-off between the mapping codebooks and, for instance, the traditional GMM approach.

Assume that our training set consists of two time-aligned sequences of source and target spectral vectors, denoted X and Y:

$$X = [x_1 \; x_2 \; \dots \; x_N], \qquad Y = [y_1 \; y_2 \; \dots \; y_N]$$

A codebook M with Q centers is trained on the combined vectors $z_n = [x_n^T \; y_n^T]^T$:

$$M = \left[ \begin{bmatrix} m_1^x \\ m_1^y \end{bmatrix} \; \begin{bmatrix} m_2^x \\ m_2^y \end{bmatrix} \; \dots \; \begin{bmatrix} m_Q^x \\ m_Q^y \end{bmatrix} \right]$$

The idea of local regression is to fit local models to nearby data. The conversion of a source vector x requires, in a first phase, the selection of a so-called neighborhood of x, that is, a set of codebook centers situated in the proximity of x. The simplest way to determine the neighborhood of x is to consider its K nearest neighbors that minimize the distance

$$d_E(x, m_q^x) = \left\| x - m_q^x \right\|$$

The neighborhood can be expressed formally as

$$N(x) = \{ m_{q_1}, m_{q_2}, \dots, m_{q_K} \}$$

where $q_k$ are the codebook indices of the selected centers and

$$m_{q_k} = \begin{bmatrix} m_{q_k}^x \\ m_{q_k}^y \end{bmatrix}$$

In a second phase, the proposed method determines a linear transformation for each neighborhood using a least squares criterion. Local modeling favors simple models and a simple training criterion. The linear regression model is

$$m_{q_k}^{y\,T} = m_{q_k}^{x\,T} \cdot W, \qquad k = 1, \dots, K$$

The linear transformation W is obtained by solving

$$N_y = N_x \cdot W$$

where $N_x = [m_{q_1}^x \; m_{q_2}^x \; \dots \; m_{q_K}^x]^T$ and $N_y = [m_{q_1}^y \; m_{q_2}^y \; \dots \; m_{q_K}^y]^T$. The least squares solution minimizes the criterion

$$C = \sum_{k=1}^{K} \left\| m_{q_k}^{x\,T} \cdot W - m_{q_k}^{y\,T} \right\|^2$$

which gives

$$W = (N_x^T N_x)^{-1} N_x^T N_y$$

Finally, the converted result for x is computed as

$$y_{conv}^T = x^T \cdot W$$

The conversion of an entire sequence of source vectors can be obtained by repeating the procedure described above for each vector.

In practice, it was noticed that the quality of the conversion is sensitive to the selected neighborhood and to the type of linear transformation used. Firstly, it was found to be beneficial to estimate band-diagonal matrices instead of full ones, given that the correlation is highest between neighboring elements of an LSF vector. Secondly, it was found beneficial to use $y_{conv}$ for a new selection of neighbors minimizing

$$d_E\!\left( \begin{bmatrix} x \\ y_{conv} \end{bmatrix}, m_q \right) = \left\| \begin{bmatrix} x \\ y_{conv} \end{bmatrix} - m_q \right\|$$

and to iterate the same steps until the neighborhoods determined in consecutive steps become virtually identical or sufficiently similar. This is equivalent to a convergence of $y_{conv}$. The process was found to be pseudo-convergent and can be stopped with an arbitrary threshold criterion. Figure 1 illustrates this pseudo-convergence.
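The two phases described above (neighborhood selection, then a local least squares fit), followed by the iterative re-selection of neighbors, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `max_iter` and `tol` parameters, and the specific stopping rule are hypothetical, and a full matrix W is fitted here rather than a band-diagonal one.

```python
import numpy as np

def select_neighbors(query, centers, k):
    """Return indices of the k centers closest (Euclidean) to `query`."""
    dists = np.linalg.norm(centers - query, axis=1)
    return np.argsort(dists)[:k]

def fit_local_transform(nx, ny):
    """Least squares W minimizing ||nx @ W - ny||^2 (full matrix, an assumption)."""
    w, *_ = np.linalg.lstsq(nx, ny, rcond=None)
    return w

def llt_convert(x, mx, my, k, max_iter=20, tol=1e-6):
    """Convert one source vector x with a locally trained linear transformation.

    mx, my : (Q, d) arrays holding the source and target parts of the
             codebook centers (row q is m_q^x and m_q^y respectively).
    max_iter, tol : hypothetical stopping parameters for the re-selection loop.
    """
    # First pass: neighbors are chosen using the source part only.
    idx = select_neighbors(x, mx, k)
    y_conv = x @ fit_local_transform(mx[idx], my[idx])

    # Later passes: neighbors are chosen in the joint space [x; y_conv].
    joint_centers = np.hstack([mx, my])
    for _ in range(max_iter):
        joint = np.concatenate([x, y_conv])
        new_idx = select_neighbors(joint, joint_centers, k)
        y_new = x @ fit_local_transform(mx[new_idx], my[new_idx])
        same_set = set(new_idx.tolist()) == set(idx.tolist())
        small_step = np.linalg.norm(y_new - y_conv) < tol
        idx, y_conv = new_idx, y_new
        if same_set or small_step:
            break
    return y_conv
```

A whole utterance would be converted by calling `llt_convert` once per frame. When K is smaller than the vector dimension, `np.linalg.lstsq` returns the minimum-norm solution, which is one reason a restricted (for example band-diagonal) W may be preferable.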

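[Figure 1: two panels plotted against iteration number (0 to 20). Top panel: Euclidean distance between consecutive y_conv estimates. Bottom panel: number of new neighbors at the current iteration.]

Figure 1. Pseudo-convergence of neighborhood selection.

We observe that the algorithm could have been applied directly on the aligned training data instead of the codebook.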
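The band-diagonal restriction on W mentioned earlier can be imposed by solving a small least squares problem per output dimension, using only the input dimensions inside the band. The sketch below is one possible way to do this, assuming a square W and the convention y^T = x^T W; the function name and the `bandwidth` parameter are hypothetical (`bandwidth=1` gives a tridiagonal matrix).

```python
import numpy as np

def fit_banded_transform(nx, ny, bandwidth=1):
    """Least squares W with W[i, j] = 0 whenever |i - j| > bandwidth.

    nx, ny : (K, d) arrays of neighborhood source and target vectors (rows).
    Column j of W is fitted independently from input dimensions
    j - bandwidth .. j + bandwidth only.
    """
    d = nx.shape[1]
    w = np.zeros((d, ny.shape[1]))
    for j in range(ny.shape[1]):
        lo = max(0, j - bandwidth)
        hi = min(d, j + bandwidth + 1)
        coef, *_ = np.linalg.lstsq(nx[:, lo:hi], ny[:, j], rcond=None)
        w[lo:hi, j] = coef
    return w
```

Each column then has at most `2 * bandwidth + 1` unknowns, so the fit stays well determined even for small neighborhoods. The same routine applies unchanged if rows of aligned training data are passed in place of codebook centers.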

EXPERIMENTS

The algorithm for spectral conversion presented in the previous section has been applied to [...]-dimensional line spectral frequency (LSF) vectors, and the results are demonstrated in this section with two cross-gender examples. The section presents a comparison with the popular GMM-based approach, providing objective and subjective results.

Acoustic Data

The CMU Arctic database (http://festvox.org/cmu_arctic/) is a publicly available corpus of parallel speech sampled at [...] kHz. We used the CLB (female) and RMS (male) speakers from the CMU Arctic database to test conversion in both directions, from male voice to female and from female voice to male. A parallel set of [...] sentences was used as training data, amounting to approximately [...] pairs of source and target LSF vectors after time alignment. Another [...] sentences were used for testing.

Model Settings

GMM

Too few components, although reliably estimated, give an inaccurate approximation of the training data, while the estimation of too many components is unreliable, causing overfitting. In choosing a reference GMM for the comparison with the proposed approach, such problems are avoided as follows. The performance of GMM models with different numbers of components was evaluated over the test set and the model with the lowest error was selected.

As illustrated in Figure 2, the female-to-male direction requires 8 components, while 16 components are needed to convert the male into the female voice. The mean squared error (MSE) figures are based on the definition given in [].

Even though the GMM was tuned directly on the test set, a similar tuning could be performed by cross-validation using only the training set.

[Figure 2: two panels plotting MSE against the number of GMM components (1, 2, 4, 8, 16, 32, 64, 128). Top panel, "MSE over different GMM components - female to male": minimum MSE at 8 components. Bottom panel, "MSE over different GMM components - male to female": minimum MSE at 16 components.]

Figure 2. Mean squared error of GMMs for different numbers of components, measured over the test set.

Proposed Method (Local Linear Transformation)

The tuning of the proposed method is mainly based on perceptual evaluation. A codebook size of [...] was used, while the neighborhood sizes were tuned separately for each direction, leading to values of [...] (female to male) and [...] (male to female). The linear transformations were restricted to tridiagonal matrices.

The neighborhood size was found to act as a trade-off, producing unstable results when the neighborhood is too small and excessively averaged (over-smoothed) results when neighborhoods are large.

Subjective Listening Test

The speech samples evaluated in the listening tests are based on target speaker versions of the test utterances, in whose parametric representations only the LSFs have been replaced with converted ones. This mimics the case when all other features are ideally converted, focusing the evaluation on the actual spectral conversion.

For each conversion direction, a modified MOS test was carried out by ten listeners on ten test sentences. The proposed method (LLT) and the GMM-based approach were compared in terms of speech quality and success of identity mapping. These criteria are evaluated with scores between [...] and [...], with [...] indicating that “GMM/LLT performs much better”, [...] for “GMM/LLT performs better” and [...] indicating perceptually identical performance. The results of the listening test are given in Table 1.

Table 1. Subjective listening test scores with [...]% confidence intervals.

                    Quality          Identity
Female-to-male      [...] ± [...]    [...] ± [...]
Male-to-female      [...] ± [...]    [...] ± [...]

A possible explanation for the male-to-female result is that the high-pitched female voice seems to mask the quality problems, making the two methods sound more similar.

The subjective scores indicate a general preference for the proposed approach over the GMM-based system.

Over-Smoothing Reduction

The converted spectra and LSF tracks illustrated in Figure 3 indicate a reduction of the over-smoothing in the case of the proposed approach in comparison to GMM.
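Over-smoothing appears as reduced variation in the converted parameters, so a standard deviation measure is a natural way to quantify it. The sketch below shows one way such measures could be computed, assuming LSF tracks and spectral envelopes are stored as NumPy arrays with one row per frame; the function names, the array layout, and the choice to average the per-track and per-frame deviations are assumptions, not details taken from the paper.

```python
import numpy as np

def mean_track_std(lsf_tracks):
    """Average, over LSF dimensions, of each track's standard deviation in time.

    lsf_tracks : (frames, dims) array, one column per LSF track (Hz).
    """
    return float(np.mean(np.std(lsf_tracks, axis=0)))

def mean_spectrum_std(envelopes_db):
    """Average, over frames, of each envelope's standard deviation in frequency.

    envelopes_db : (frames, bins) array of spectral magnitudes in dB.
    """
    return float(np.mean(np.std(envelopes_db, axis=1)))

def moving_average(tracks, width=5):
    """Smooth each column in time; used here only to mimic over-smoothing."""
    kernel = np.ones(width) / width
    return np.column_stack(
        [np.convolve(tracks[:, j], kernel, mode="same") for j in range(tracks.shape[1])]
    )
```

Under this measure, a smoothed copy of a set of tracks (for example `moving_average(tracks)`) scores lower than the original, which is the direction of change that over-smoothing is expected to produce.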

[Figure 3: two panels comparing Proposed (LLT) and GMM. Top panel, "Converted spectral envelopes": magnitude (dB) against frequency (Hz), 0 to 8000 Hz. Bottom panel, "Converted LSF tracks 9 to 14 for a speech segment": frequency (Hz) against frame time index.]

Figure 3. Over-smoothing reduction for spectral envelopes (top) and LSF tracks (bottom).

Standard deviation measurements of converted and original target spectra (in frequency) and LSF tracks (in time) are calculated over the entire test set and summarized in Table 2, confirming the over-smoothing reduction.

Table 2. Average standard deviation of spectral magnitude (in dB) and LSF tracks (in Hz).

                    Magnitude (dB)               LSF tracks (Hz)
                    Proposed   GMM     Tgt       Proposed   GMM     Tgt
Female-to-male      [...]      [...]   [...]     [...]      [...]   [...]
Male-to-female      [...]      [...]   [...]     [...]      [...]   [...]

Local modeling, rather than the interpolation of local models from acoustically remote regions, makes the proposed approach better able to capture details and reduce the averaging effect.

CONCLUSIONS

This article introduced a new method for spectral conversion based on local regression. Linear transformation models are fitted to local data for each source vector, as opposed to the typical interpolation of linear models. The method was shown to effectively reduce over-smoothing and obtained favorable preference scores in a subjective evaluation against the popular GMM-based approach.

On the downside, the proposed method requires heavier computation for conversion, as the linear transformations depend on the input vector and have to be estimated at run time.

Interesting directions for future work would be to study alternative ways of neighborhood selection and alternative local models.

ACKNOWLEDGEMENT

This work was supported by the Academy of Finland (application number [...], Finnish Programme for Centres of Excellence in Research [...]).

REFERENCES

[] L. Arslan and D. Talkin, “Voice Conversion by Codebook Mapping of Line Spectral Frequencies and Excitation Spectrum”, in Proceedings of European Conference on Speech Communication and Technology, Rhodes, Greece, [...].

[] A. Kain and M.W. Macon, “Spectral Voice Conversion for Text-to-Speech Synthesis”, in Proceedings of International Conference on Acoustics, Speech and Signal Processing, Seattle, USA, Vol. [...], pp. [...].

[] Z. Shuang, R. Bakis, and [...]

[] M. Narendranath, H. Murthy, S. Rajendran and N. [...]

[] [...]

[] V. Popa, J. Nurminen and M. Gabbouj, “A Study of Bilinear Models in Voice Conversion”, Journal of Signal and Information Processing, vol. [...], no. [...], May [...].

[] T. Toda, [...]

[] T. Toda, A.W. Black and K. Tokuda, “Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory”, IEEE Transactions on Audio, Speech and Language Processing, Volume [...], Issue [...], pp. [...], Nov. [...].

[] E. Helander, T. Virtanen, J. Nurminen and M. Gabbouj, “Voice Conversion Using Partial Least Squares Regression”, IEEE Trans. on Speech and Audio Processing, vol. [...], no. [...], pp. [...], July [...].

[] H[...]