LOCAL LINEAR TRANSFORMATION FOR VOICE CONVERSION

Victor Popa (1), Hanna Silen (1), Jani Nurminen (2) and Moncef Gabbouj (1)

(1) Department of Signal Processing, Tampere University of Technology, Tampere, Finland
(2) Nokia, Tampere, Finland

978-1-4673-0046-9/12/$26.00 ©2012 IEEE

ABSTRACT

Many popular approaches to spectral conversion involve linear transformations determined for particular acoustic classes and compute the converted result as a linear combination between different local transformations in an attempt to ensure a continuous conversion. These methods often produce over-smoothed spectra and parameter tracks. The proposed method computes an individual linear transformation for every feature vector based on a small neighborhood in the acoustic space, thus preserving local details. The method effectively reduces the over-smoothing by eliminating undesired contributions from acoustically remote regions. The method is evaluated in listening tests against the well-known Gaussian Mixture Model based conversion, representative of the class of methods involving linear transformations. Perceptual results indicate a clear preference for the proposed scheme.

Index Terms: Gaussian Mixture Model (GMM), Line Spectral Frequencies (LSF), Local Linear Transformation (LLT)

1. INTRODUCTION

Voice conversion is defined to be a modification of the speech signal in which the perceived speaker identity is changed while preserving the content and quality. Such a conversion involves the modification of prosodic properties as well as a transformation of the spectral features. As a core part of voice conversion, spectral transformation has received most interest in this research area and is also the scope of this article.

A remarkable number of techniques for spectral conversion have been proposed in the literature. Codebook mapping [ ], GMM [ ], frequency warping [ ], artificial neural networks [ ], linear transformations [ ], bilinear models [ ], eigenvoices [ ] and maximum likelihood estimation of spectral parameter trajectory [ ] are some of the most representative. While dealing better with one or another issue associated with the conversion, all the above methods have their own limitations.

Many existing approaches involve linear conversion functions and typically suffer from two important drawbacks. One of them is related to the frame-based operation, in which the temporal continuity of the spectral features is ignored. The second issue is the so-called over-smoothing, characterized by an undesired smoothing of the parameter tracks and converted spectra. The combined effect of these drawbacks is a poor speech quality.

The GMM-based approach is very popular and representative of the class of methods based on linear transformations. In GMM-based conversion a linear transformation is trained for each Gaussian component and the result is computed as a weighted sum of local regression functions in an attempt to avoid sudden changes of the conversion function. In reality, a frame's decomposition is dominated by only one mixture component [ ], making the method susceptible to discontinuities. In addition, the GMM technique is also affected by over-smoothing.

In this paper we propose a spectral conversion scheme which trains an individual linear transformation for each feature vector. The method uses an underlying codebook trained from aligned data of the two speakers, and the linear transformation is computed on a selected set of codebook centers situated in the proximity of the input spectral vector in the acoustic space. By focusing on the local properties of the acoustic space, the proposed method is shown to effectively reduce the over-smoothing. Our listening tests suggest that the proposed scheme is probably affected to a lesser degree by discontinuity artifacts than the GMM approach. While suffering serious limitations as a conversion method in itself, the codebook has the favorable property of good detail preservation, which benefits the proposed algorithm, where such limitations are avoided.

The article continues in Section 2 with a technical description of the proposed method. Subjective listening test results and other experiments are presented and discussed in Section 3. The article ends with conclusions and directions for future research presented in Section 4.

2. LOCAL LINEAR TRANSFORMATION

The use of linear transformation for spectral conversion is not new. An important number of solutions based on linear transformation have been proposed in the literature.
ICASSP 2012
In [ ], the aligned spectral vectors of source and target speakers are first divided into a number of classes and a linear transformation is trained for each class. All the linear transformations contribute to the conversion of each source vector in the form of a weighted sum, where the weights represent probabilities that the source vector belongs to the corresponding class. The GMM-based solution [ ] works in a similar way, using one linear transformation for each mixture component.

By analogy with [ ], which argues that linear combinations over large sets of curves are bound to produce averaged results and destroy characteristic details, we believe that allowing all the linear transformations to contribute to the conversion is likely to produce a similar averaging effect, equivalent to over-smoothing. Similar to Freeman et al., we believe it would be beneficial to restrict the number of linear transformations involved in conversion to only a few, corresponding to the most similar speech classes. In this paper we take this idea forward and propose a local regression approach where each source vector is converted with an individual linear transformation trained locally within the neighborhood of the input vector. This method can be seen in some sense as a trade-off between the mapping codebooks and, for instance, the traditional GMM approach.

Assume that our training set consists of two time-aligned sequences of source and target spectral vectors, denoted $X$ and $Y$:

$$X = [x_1 \; x_2 \; \dots \; x_N], \qquad Y = [y_1 \; y_2 \; \dots \; y_N].$$

A codebook $M$ with $Q$ centers is trained on the combined vectors $z_n = [x_n^T \; y_n^T]^T$:

$$M = \begin{bmatrix} \begin{bmatrix} m_1^x \\ m_1^y \end{bmatrix} & \begin{bmatrix} m_2^x \\ m_2^y \end{bmatrix} & \cdots & \begin{bmatrix} m_Q^x \\ m_Q^y \end{bmatrix} \end{bmatrix}.$$

The idea of local regression is to fit local models to nearby data. The conversion of a source vector $x$ requires, in a first phase, the selection of a so-called neighborhood of $x$, or set of codebook centers situated in the proximity of $x$. The simplest way to determine the neighborhood of $x$ is to consider its $K$ nearest neighbors that minimize the distance

$$d_E(x, m_q^x) = \lVert x - m_q^x \rVert.$$

The neighborhood can be expressed formally as

$$N(x) = \{ m_{q_1}, m_{q_2}, \dots, m_{q_K} \},$$

where $q_k$ are codebook indices of the selected centers and

$$m_{q_k} = \begin{bmatrix} m_{q_k}^x \\ m_{q_k}^y \end{bmatrix}.$$

In a second phase, the proposed method determines a linear transformation for each neighborhood using a least squares criterion. Local modeling favors simple models and a simple training criterion. The linear regression model is

$$(m_{q_k}^y)^T = (m_{q_k}^x)^T W.$$

The linear transformation $W$ is obtained by solving

$$N^y = N^x W,$$

where $N^x = [m_{q_1}^x \; m_{q_2}^x \; \dots \; m_{q_K}^x]^T$ and $N^y = [m_{q_1}^y \; m_{q_2}^y \; \dots \; m_{q_K}^y]^T$, which has the least squares solution

$$W = \left( (N^x)^T N^x \right)^{-1} (N^x)^T N^y.$$

The least squares solution minimizes the criterion

$$C = \sum_{k=1}^{K} \left\lVert (m_{q_k}^y)^T - (m_{q_k}^x)^T W \right\rVert^2.$$

Finally, the converted result for $x$ is computed as

$$y_{conv}^T = x^T W.$$

The conversion of an entire sequence of source vectors can be obtained by repeating, for each vector, the procedure described above.

In practice it was noticed that the quality of the conversion is sensitive to the selected neighborhood and the type of linear transformation used. Firstly, it was found to be beneficial to estimate band-diagonal matrices instead of full ones, given that the correlation is highest between neighbor elements of an LSF vector. Secondly, it was found beneficial to use $y_{conv}$ for a new selection of neighbors minimizing

$$d_E\!\left( \begin{bmatrix} x \\ y_{conv} \end{bmatrix}, m_q \right) = \left\lVert \begin{bmatrix} x \\ y_{conv} \end{bmatrix} - m_q \right\rVert$$

and iterate the same steps until the neighborhoods determined in consecutive steps become virtually identical or sufficiently similar. This is equivalent to a convergence of $y_{conv}$. The process was found to be pseudo-convergent and can be stopped with an arbitrary threshold criterion. Figure 1 illustrates this pseudo-convergence.

(Figure 1: two panels plotted against the iteration number: the Euclidean distance between consecutive $y_{conv}$ estimates, and the number of new neighbors at the current iteration.)

Figure 1. Pseudo-convergence of neighborhood selection.

We observe that the algorithm could have been applied directly on the aligned training data instead of the codebook.
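The procedure of this section can be sketched in a few lines of NumPy. This is only an illustrative reading of the description, not the authors' implementation: the function names, the `bandwidth` parameter, the `max_iter` and `tol` stopping rule, and the choice to stop when the neighbor set no longer changes are all assumptions. The codebook is assumed to be given as two arrays `Mx` and `My` holding the source and target halves of the centers row by row, and source and target vectors are assumed to have the same dimension.

```python
import numpy as np


def nearest_centers(query, centers, k):
    """Indices of the k rows of `centers` closest to `query` (Euclidean)."""
    dists = np.linalg.norm(centers - query, axis=1)
    return np.argsort(dists)[:k]


def fit_local_transform(Nx, Ny, bandwidth=None):
    """Least-squares W with Ny ~= Nx @ W.

    If `bandwidth` is given, W is band-diagonal: output dimension j may
    only depend on input dimensions i with |i - j| <= bandwidth
    (bandwidth=1 gives a tridiagonal matrix).
    """
    if bandwidth is None:
        W, *_ = np.linalg.lstsq(Nx, Ny, rcond=None)
        return W
    d = Nx.shape[1]
    W = np.zeros((d, Ny.shape[1]))
    for j in range(Ny.shape[1]):
        lo, hi = max(0, j - bandwidth), min(d, j + bandwidth + 1)
        # Regress output dimension j on the allowed input dimensions only.
        w, *_ = np.linalg.lstsq(Nx[:, lo:hi], Ny[:, j], rcond=None)
        W[lo:hi, j] = w
    return W


def llt_convert(x, Mx, My, k, bandwidth=None, max_iter=20, tol=1e-6):
    """Convert one source vector x with a locally fitted linear transform."""
    M = np.hstack([Mx, My])            # joint centers [m^x ; m^y], one per row
    idx = nearest_centers(x, Mx, k)    # first selection: source part only
    y = None
    for _ in range(max_iter):          # assumed cap on the number of iterations
        W = fit_local_transform(Mx[idx], My[idx], bandwidth)
        y_new = x @ W                  # y_conv^T = x^T W
        if y is not None and np.linalg.norm(y_new - y) < tol:
            y = y_new
            break
        y = y_new
        # Re-select neighbors using the joint vector [x ; y_conv].
        new_idx = nearest_centers(np.concatenate([x, y]), M, k)
        if set(new_idx) == set(idx):   # neighborhood unchanged: stop
            break
        idx = new_idx
    return y
```

A whole utterance would be converted by calling `llt_convert` once per frame, which is where the run-time cost of the method comes from: one small least-squares problem (or several, with re-selection) is solved for every input vector.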
3. EXPERIMENTS

The algorithm for spectral conversion presented in the previous section has been applied to [...]-dimensional line spectral frequency (LSF) vectors, and the results are demonstrated in this section with two cross-gender examples. The section presents a comparison with the popular GMM-based approach, providing objective and subjective results.

3.1. Acoustic Data

The CMU Arctic database (http://festvox.org/cmu_arctic) is a publicly available corpus of parallel speech sampled at 16 kHz. We used the CLB (female) and RMS (male) speakers from the CMU Arctic database to test conversion in both directions: from male voice to female and from female voice to male.

A parallel set of [...] sentences was used as training data, amounting to approximately [...] pairs of source and target LSF vectors after time alignment. Another [...] sentences were used for testing.

3.2. Model Settings

3.2.1. GMM

Too few components, although reliably estimated, give an inaccurate approximation of the training data, while the estimation of too many components is unreliable, causing overfitting. In choosing a reference GMM for the comparison with the proposed approach, such problems are avoided as follows. The performance of GMM models with different numbers of components was evaluated over the test set, and the model with the lowest error was selected.

As illustrated in Figure 2, the female-to-male direction requires 8 components, while 16 components are needed to convert the male into the female voice. The mean squared error (MSE) figures are based on the definition given in [ ].

Even though the GMM was tuned directly on the test set, a similar tuning could be performed by cross-validation using only the training set.

3.2.2. Proposed Method (Local Linear Transformation)

The tuning of the proposed method is mainly based on perceptual evaluation. A codebook size of [...] was used, while the neighborhood sizes were tuned separately for each direction, leading to values of [...] (female to male) and [...] (male to female). The linear transformations were restricted to tridiagonal matrices.

The neighborhood size was found to act as a trade-off, producing unstable results when the neighborhood is too small and excessively averaged (over-smoothed) results when neighborhoods are large.
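Both tuning problems in this section, the number of GMM components and the neighborhood size, come down to picking the setting with the lowest held-out error. The sketch below shows one way this could be done by k-fold cross-validation on the training pairs only, as the text suggests for the GMM. It is a generic illustration: `convert` is a hypothetical user-supplied callable, the fold count and seed are arbitrary, and the plain per-element MSE used here is an assumption that may differ from the MSE definition the paper cites. For the proposed method the paper tuned mainly by listening, so an objective score like this would be a complement rather than a replacement.

```python
import numpy as np


def mean_squared_error(Y_true, Y_pred):
    """Plain per-element MSE (assumed; the cited definition may differ)."""
    return float(np.mean((np.asarray(Y_true) - np.asarray(Y_pred)) ** 2))


def select_by_cross_validation(settings, convert, X, Y, n_folds=5, seed=0):
    """Return (best_setting, scores) using k-fold CV on the training pairs.

    `convert(X_train, Y_train, X_val, setting)` is a hypothetical interface:
    it trains on the aligned training pairs with the given setting (e.g. a
    number of GMM components or a neighborhood size) and returns the
    converted vectors for X_val.
    """
    order = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(order, n_folds)
    scores = {}
    for setting in settings:
        errs = []
        for i, val_idx in enumerate(folds):
            train_idx = np.concatenate(
                [f for j, f in enumerate(folds) if j != i])
            Y_pred = convert(X[train_idx], Y[train_idx], X[val_idx], setting)
            errs.append(mean_squared_error(Y[val_idx], Y_pred))
        scores[setting] = float(np.mean(errs))  # average held-out error
    best = min(scores, key=scores.get)
    return best, scores
```

With a converter wrapped to match the `convert` interface, the call would look like `best_k, scores = select_by_cross_validation([8, 16, 32], convert, X_train, Y_train)`, where the candidate values are placeholders.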
3.3. Subjective Listening Test

The speech samples evaluated in the listening tests are based on target speaker versions of the test utterances in whose parametric representations only LSFs have been replaced with converted ones. This mimics the case when all other features are ideally converted, focusing the evaluation on the actual spectral conversion.

(Figure 2: two panels showing the MSE against the number of GMM components, from 1 to 128: female to male, with the minimum MSE at 8 components, and male to female, with the minimum MSE at 16 components.)

Figure 2. Mean squared error of GMMs for different numbers of components, measured over the test set.

For each conversion direction, a modified MOS test was carried out by ten listeners on ten test sentences. The proposed method (LLT) and the GMM-based approach were compared in terms of speech quality and success of identity mapping. These criteria are evaluated with scores between [...] and [...], with [...] indicating that "GMM/LLT performs much better", [...] for "GMM/LLT performs better" and [...] indicating perceptually identical performance. The results of the listening test are illustrated in Table 1.

Table 1. Subjective listening test scores with [...]% confidence intervals.

                  Quality   Identity
Female to male    [...]     [...]
Male to female    [...]     [...]

A possible explanation for the male-to-female result is that the high-pitched female voice seems to mask the quality problems, making the two methods sound more similar.

The subjective scores indicate the general preference of the proposed approach over the GMM-based system.

3.4. Over-Smoothing Reduction

The converted spectra and LSF tracks illustrated in Figure 3 indicate a reduction of the over-smoothing in the case of the proposed approach in comparison to GMM.
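Over-smoothing of this kind can be quantified by the spread of the features: an over-smoothed spectral envelope varies less across frequency, and an over-smoothed LSF track varies less over time. The following is a minimal sketch of such a measurement, not the authors' evaluation code. The function names and array layouts are assumptions, as is the way the values are averaged (per-frame standard deviation across frequency bins averaged over frames, and per-track standard deviation over time averaged over tracks).

```python
import numpy as np


def mean_std_over_frequency(spectra_db):
    """Average spread of spectral magnitude across frequency.

    `spectra_db` has shape (frames, frequency_bins), in dB. The standard
    deviation is taken across bins for each frame, then averaged.
    """
    return float(np.mean(np.std(spectra_db, axis=1)))


def mean_std_over_time(lsf_tracks_hz):
    """Average spread of LSF tracks over time.

    `lsf_tracks_hz` has shape (frames, lsf_order), in Hz. The standard
    deviation is taken over time for each track, then averaged.
    """
    return float(np.mean(np.std(lsf_tracks_hz, axis=0)))
```

Under these assumptions, a converted signal whose values fall clearly below those of the original target speech would be showing the averaging effect discussed here, while values closer to the target indicate better preserved detail.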
(Figure 3: two panels comparing the proposed method (LLT) and GMM: converted spectral envelopes, magnitude (dB) against frequency (Hz), and converted LSF tracks 9 to 14 for a speech segment, frequency (Hz) against frame time index.)

Figure 3. Over-smoothing reduction for spectral envelopes (top) and LSF tracks (bottom).

Standard deviation measurements of converted and original target spectra in frequency and LSF tracks in time are calculated over the entire test set and summarized in Table 2, confirming the over-smoothing reduction.

Table 2. Average standard deviation of spectral magnitude (in dB) and LSF tracks (in Hz).

                  Magnitude (dB)              LSF tracks (Hz)
                  Proposed  GMM     Tgt       Proposed  GMM     Tgt
Female to male    [...]     [...]   [...]     [...]     [...]   [...]
Male to female    [...]     [...]   [...]     [...]     [...]   [...]

Local modeling, rather than the interpolation of local models from acoustically remote regions, makes the proposed approach capable of capturing details better and reducing the averaging effect.

4. CONCLUSIONS

This article introduced a new method for spectral conversion based on local regression. Linear transformation models are fitted to local data anew for each source vector, as opposed to the typical interpolation of linear models. The method was shown to effectively reduce over-smoothing and obtained favorable preference scores in a subjective evaluation against the popular GMM-based approach.

On the downside, the proposed method uses heavier computation for conversion, as the linear transformations depend on the input vector and have to be estimated at run time.

Interesting directions for future work would be to study alternative ways for neighborhood selection and alternative local models.

5. ACKNOWLEDGEMENT

This work was supported by the Academy of Finland (application number [...], Finnish Programme for Centres of Excellence in Research).

6. REFERENCES

[ ] L. Arslan and D. Talkin, "Voice Conversion by Codebook Mapping of Line Spectral Frequencies and Excitation Spectrum", in Proceedings of the [...]th European Conference on Speech Communication and Technology, Rhodes, Greece, [...].

[ ] A. Kain and M.W. Macon, "Spectral Voice Conversion for Text-to-Speech Synthesis", in Proceedings of International Conference on Acoustics, Speech and Signal Processing, Seattle, USA, Vol. [...], pp. [...].

[ ] Z. Shuang, R. Bakis and [...]

[ ] M. Narendranath, H. Murthy, S. Rajendran and N. [...]

[ ] [...]

[ ] V. Popa, J. Nurminen and M. Gabbouj, "A Study of Bilinear Models in Voice Conversion", Journal of Signal and Information Processing, vol. [...], no. [...], May [...].

[ ] T. Toda [...]

[ ] T. Toda, A.W. Black and K. Tokuda, "Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory", IEEE Transactions on Audio, Speech and Language Processing, Volume [...], Issue [...], pp. [...], Nov. [...].

[ ] E. Helander, T. Virtanen, J. Nurminen and M. Gabbouj, "Voice Conversion Using Partial Least Squares Regression", IEEE Trans. on Speech and Audio Processing, vol. [...], no. [...], pp. [...], July [...].

[ ] H[...]