Conditional Independence Based Learning of Bayesian Classifiers Guided by a Variable Ordering Genetic Search

Edimilson Batista dos Santos, Estevam Rafael Hruschka Junior, Maria do Carmo Nicoletti
DC/UFSCar, Federal University of São Carlos, Brazil
{edimilson_santos, estevam, carmo}@dc.ufscar.br

Abstract
This work proposes a genetic strategy for learning a Bayesian classifier using an algorithm based on conditional independence and the information given by a variable ordering. The strategy has been implemented as the system VOGAC-PC. The paper presents and analyses the results of experiments in various domains using VOGAC-PC, as well as a previous system named VOGAK2, based on algorithm K2.
1. Introduction
In the last decade many algorithms for learning Bayesian Networks (BNs) and Bayesian Classifiers (BCs) from data have been proposed in the literature []. It is well known that the search space for learning a BN grows exponentially with the number of variables involved. Therefore the search for a BN structure that best represents the dependences among the variables is not trivial, since it is an NP-complete task []. In order to reduce the search space, a few restrictions can be imposed, which allow some algorithms to reach good results with acceptable computational effort.

Algorithms that learn BNs from data can be grouped into those that conduct a heuristic search and those based on variable conditional independence (CI). Algorithms that conduct a heuristic search commonly require a variable ordering (VO), i.e., a relevance-based ordered list of variables.

Although a VO list is not required by CI-based algorithms, results in the literature show that the information given by a VO can help to improve the performance of a CI learning algorithm (see [], []).

In general, genetic algorithms (GAs) are capable of identifying and exploring aspects of the search space of a problem and of converging globally to a good solution. Therefore genetic algorithms are considered an efficient search and optimization mechanism, suitable for solving many different problems []. In this paper we propose a strategy that uses a genetic algorithm to optimize the induction of BCs from data, using a hybrid algorithm based on both conditional independence and variable ordering.

Works that propose hybrid GA-Bayes methods generally focus on the use of a GA to optimize the BN learning process. In [] a GA is used to search for the best variable ordering when learning a BN, using as fitness measure the g function of the K2 algorithm []. The approach proposed by Hsu [] implements a GA for the problem of permutation of variables in BN learning and inference. Campos and Huete [] propose a method that considers a subgroup of the set of dependence/independence relations in order to identify the variable ordering and subsequently use it to learn the network structure. Other proposals implementing a GA-Bayes collaboration can be found in the literature, such as the ones described in [] and [] and the references therein; most of them, however, aim at learning an unrestricted BN based on a heuristic search. The GA-Bayes hybrid approach proposed and described in this paper, however, is devoted to learning Bayesian Classifiers from data using CI learning algorithms. In this sense, the class variable plays an important role in the variable ordering definition, and the fitness function may not follow a structure score-based approach.

The idea of focusing on the class variable to define an adequate VO was already employed in [], where a heuristic search learning algorithm (K2) was used. However, since the learning algorithm here is based on CI tests, the GA fitness function cannot be defined as in []. Thus, in the work herein described, we use the correct classification rate (CCR) as the fitness function.

The remainder of this paper is organized as follows. Section 2 briefly recalls some concepts related to Bayesian Networks and Bayesian Classifiers. In Section 3 we describe the basics of the PC algorithm, a variable conditional independence based algorithm suitable for learning a BN. Section 4 describes the system VOGAC-PC, which implements a genetic algorithm search for identifying an optimal VO and the corresponding learned BC. The GA fitness function that implements the VO evaluation process is based on the performance of the BC learned using the PC algorithm, having as input a training set and the information given by the VO being evaluated. Section 5 shows some preliminary experiments with the proposed method and compares its results with those obtained with a previous proposal named VOGAK2, presented in []. Finally, Section 6 presents some concluding remarks and points out some future work.

1-4244-1340-0/07/$25.00 © 2007 IEEE
2. Bayesian Networks and Bayesian Classifiers

A Bayesian Network (BN) [] G has a directed acyclic graph (DAG) structure. Each node in the graph corresponds to a discrete random variable in the domain. An edge […]

[…] The PC algorithm [] is based upon statistical conditional independence tests. It works by looking for a Bayesian Network that represents the independence relationships among variables in a dataset. This is done based on the conditional independence criterion I(Xi, Xj | A) defined in [], where A is a subset of variables and Xi and Xj are variables. If I(Xi, Xj | A) is true, variable Xi is conditionally independent of Xj given A (d-separation criterion). To verify whether X and […]

The main steps of the PC algorithm can be summarized as in Figure 1.

1. For each pair of variables, test for their conditional independence.
2. Based on the conditional independence results, construct the skeleton (S) of the graph.
3. Identify the orientation of the edges in S.

Fig. 1. A high level description of the PC algorithm

Having as input a list with all the independencies I(Xi, Xj | A) and the adjacencies of each node, ADJ(Xi), PC first finds the graph skeleton (undirected graph) that best represents the d-separations expressed by I(Xi, Xj | A). Afterwards it starts establishing the orientation of the edges.

As stated in [], "if the population from which the sample input was drawn perfectly fits a DAG C all of whose variables have been measured, and the population distribution P contains no conditional independence except those entailed by the factorization of P according to C, then in the large sample limit the PC algorithm produces the true pattern" present in the data.

The attribute pre-order assumption can be used in the PC edge orientation step (step 3 of Fig. 1). An ordered list containing all the attributes (including the class) asserts that only the attributes positioned before a given attribute X may be parents of X. Therefore the use of a predefined VO makes it possible to skip the search for an edge orientation conducted in step 3 of Fig. 1.
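The effect of a predefined VO on the edge orientation step can be illustrated with a minimal Python sketch. It assumes the skeleton is already available as a list of undirected edges; the function name `orient_with_ordering` and the toy variable names are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple


def orient_with_ordering(
    skeleton: Iterable[Tuple[str, str]], ordering: List[str]
) -> Dict[str, Set[str]]:
    """Orient every undirected skeleton edge from the variable that comes
    earlier in the ordering to the one that comes later."""
    position = {var: i for i, var in enumerate(ordering)}
    parents: Dict[str, Set[str]] = {var: set() for var in ordering}
    for a, b in skeleton:
        # Only a variable positioned before X in the VO may be a parent of X.
        parent, child = (a, b) if position[a] < position[b] else (b, a)
        parents[child].add(parent)
    return parents


# Toy skeleton (assumed for illustration); the class comes first in the VO.
skeleton = [("Class", "V1"), ("V1", "V2"), ("Class", "V3")]
parents = orient_with_ordering(skeleton, ["Class", "V2", "V1", "V3"])
```

Because the ordering is a total order, the resulting directed graph is acyclic by construction, which is why no separate orientation search is needed in this setting.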
4. The VOGAC-PC System

The VOGAC-PC approach combines in one system the search for a 'good' variable ordering (VO) and the corresponding BC, learned using the PC algorithm and the VO.

The process of searching for a VO and the resulting BC is implemented as a genetic algorithm (GA), which searches the variable space trying to find a 'good' ordering among the variables; this process is identified by the acronym VOGA. Each individual of the GA population represents a possible ordering. By default, as the class is always the first variable in the VO, there is no need to represent it as part of an individual. The variable identification (ID) is coded as an integer number. Therefore each chromosome has (n - 1) genes, where n is the total number of variables (including the class variable), and each gene is represented by an integer that corresponds to a variable ID. Thus any permutation of 1 to (n - 1) is a potential chromosome. For example, consider a domain with 3 variables (V1, V2 and V3) and a class. In this case there are 6 possible variable orderings and therefore 6 possible chromosomes, as shown in Figure 2.

2007 IEEE Congress on Evolutionary Computation (CEC 2007)
[Figure 2: two panels, "VOs" and "Chromosomes", listing the possible orderings of V1, V2 and V3 and their integer encodings]
Fig. 2. Possible VOs and how they are represented as chromosomes for a domain with 3 variables (V1, V2 and V3).
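The encoding in Fig. 2 can be reproduced with a short Python sketch. The helper names (`all_chromosomes`, `chromosome_to_vo`) and the label "Class" are illustrative assumptions; the only facts taken from the text are that the class is fixed in first position and that each gene is an integer variable ID.

```python
from itertools import permutations
from typing import List


def all_chromosomes(n_total: int) -> List[List[int]]:
    """Enumerate every chromosome for a domain with n_total variables,
    class included. The class is not encoded, so genes are IDs 1..n_total-1."""
    return [list(p) for p in permutations(range(1, n_total))]


def chromosome_to_vo(chromosome: List[int], class_name: str = "Class") -> List[str]:
    """Expand a chromosome into a full VO, with the class always first."""
    return [class_name] + [f"V{gene}" for gene in chromosome]


# Three variables plus a class: n_total = 4, hence 3! = 6 chromosomes.
chromosomes = all_chromosomes(4)
example_vo = chromosome_to_vo([2, 3, 1])
```

Exhaustive enumeration is only practical for very small domains, since the number of orderings grows factorially; this is the motivation for a genetic search.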
As can be seen in the flowchart in Fig. 3, VOGAC-PC expects as input a Training set (Tr) and a Testing set (Te). Before starting the search for the best VO, VOGAC-PC executes the PC algorithm only once, to obtain the BC skeleton (undirected graph). Subsequently, the GA starts the search process by generating a random initial population (Pop). Considering |VO| the number of possible Variable Orderings and |V| the number of variables in the domain, the size of the initial population |Pop| is given by rule (1), which was empirically constructed.

if |V| […] then size(Pop) = […]
else if |VO| > […] then size(Pop) = […]
else size(Pop) = |VO|                    (1)
INPUT: Tr (training set), Te (testing set)

I = […]; Pop_I = {VO_1, …, VO_N}
learn_using_PC(Tr, Skeleton)
for K = 1 to N do evaluate(VO_K, Skeleton, Te, Eval_K, BC_K)
    evaluate(VO, Skeleton, Te, Eval, BC):
        create_BC(VO, Skeleton, BC)
        check_performance(BC, Te, Eval)
Pop_I = select{(VO_K, Eval_K), …, (VO_K, Eval_K)}
Pop_I = crossover(Pop_I)
Pop_I = mutation(Pop_I)
for K = 1 to N do evaluate(VO_K, Skeleton, Te, Eval_K, BC_K)
Stopping condition satisfied?
    N: I = I + 1, return to the selection step
    Y: Best_VO = max_Eval{(VO_1, Eval_1, BC_1), …, (VO_N, Eval_N, BC_N)}
       Best_BC = max_Eval{(VO_1, Eval_1, BC_1), …, (VO_N, Eval_N, BC_N)}
       Return (Best_VO, Best_BC)

Fig 3. VOGAC-PC flowchart: the GA searches for a 'good' VO and the corresponding learned BC
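The loop in the flowchart can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the text names tournament selection, crossover and mutation but does not specify the operators, so order crossover and swap mutation are assumed here, and `pop_size`, `patience` and `mutation_rate` are hypothetical parameters. The `fitness` callable stands in for building a BC from the skeleton and the VO and measuring its classification rate.

```python
import random
from typing import Callable, List, Sequence, Tuple

Chromosome = Tuple[int, ...]


def order_crossover(p1: Sequence[int], p2: Sequence[int], rng: random.Random) -> Chromosome:
    """Keep a slice of p1 in place and fill the other slots with p2's genes in order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n + 1), 2))
    kept = set(p1[i:j])
    filler = [g for g in p2 if g not in kept]
    return tuple(filler[:i] + list(p1[i:j]) + filler[i:])


def swap_mutation(c: Chromosome, rate: float, rng: random.Random) -> Chromosome:
    """With probability `rate`, swap two genes so the result is still a permutation."""
    genes = list(c)
    if len(genes) >= 2 and rng.random() < rate:
        a, b = rng.sample(range(len(genes)), 2)
        genes[a], genes[b] = genes[b], genes[a]
    return tuple(genes)


def tournament(pop: List[Chromosome], scores: List[float], k: int, rng: random.Random) -> Chromosome:
    """Return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda idx: scores[idx])]


def search_ordering(
    fitness: Callable[[Chromosome], float],
    n_genes: int,
    pop_size: int = 20,       # assumed value
    patience: int = 10,       # assumed: generations without improvement before stopping
    mutation_rate: float = 0.2,  # assumed value
    seed: int = 0,
) -> Tuple[Chromosome, float]:
    rng = random.Random(seed)
    base = list(range(1, n_genes + 1))
    pop = [tuple(rng.sample(base, n_genes)) for _ in range(pop_size)]
    best, best_score, stale = pop[0], float("-inf"), 0
    while stale < patience:
        scores = [fitness(c) for c in pop]
        top = max(range(pop_size), key=lambda idx: scores[idx])
        if scores[top] > best_score:
            best, best_score, stale = pop[top], scores[top], 0
        else:
            stale += 1
        children = [best]  # the best ordering is passed on to the next generation
        while len(children) < pop_size:
            p1 = tournament(pop, scores, 2, rng)
            p2 = tournament(pop, scores, 2, rng)
            children.append(swap_mutation(order_crossover(p1, p2, rng), mutation_rate, rng))
        pop = children
    return best, best_score


def toy_fitness(c: Chromosome) -> float:
    """Stand-in for the classification rate: fraction of genes in sorted position."""
    return sum(g == i + 1 for i, g in enumerate(c)) / len(c)


best_vo, best_eval = search_ordering(toy_fitness, n_genes=5)
```

Both operators preserve the permutation property, so every chromosome remains a valid ordering without any repair step.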
Each chromosome, i.e., each VO, is used in conjunction with the BC skeleton to create a complete BC (skeleton + edge directions + CPT), which is evaluated by a fitness function that returns the average performance (Eval) of the resulting BC in a 10-fold cross-validation strategy. The best chromosomes are then selected (tournament selection) and, using crossover and mutation operators, the next generation is built. The process is repeated, and in each generation the best ordering is kept and passed on to the next. If there is no improvement after a given number of generations, the algorithm ends and returns the best ordering (Best_VO) found, as well as the corresponding learned BC (Best_BC).

In addition to the VOGAC-PC algorithm, a slightly different version, namely VOGAC+-PC, was also implemented. In VOGAC+-PC the initial population is not randomly generated, and more information about the class variable is used in an attempt to optimize the initial population.

In VOGAC+-PC, in order to define the VO of the initial population chromosomes, the χ² (chi-squared) statistical test is performed using each variable jointly with the class variable (for this reason, VOGAC+-PC can only be applied in a classification context, where there is a distinguished variable, namely the class variable). Thus, the strength of the dependence relationship between each variable and the class can be measured. Subsequently, the variables are ordered decreasingly according to their χ² scores. The first variable in the ordered list has the highest χ² score, i.e., it is the most dependent upon the class. Obviously, the relation between the χ² statistical test and the best VO may not hold strictly; previous works [] and [], as well as the experiments described in Section 5 of this paper, however, show that good results can be achieved using this heuristic.

In VOGAC+-PC, having defined the VO given by the χ² statistical test, all initial population chromosomes are defined using this VO (all chromosomes in the initial population are identical).
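The χ²-based seeding of the initial population can be sketched in Python as below. The sketch assumes discrete variables stored as plain Python sequences and ranks variables by the raw Pearson χ² statistic against the class; the function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def chi_squared(feature: Sequence, labels: Sequence) -> float:
    """Pearson chi-squared statistic of a discrete feature against the class."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    class_totals = Counter(labels)
    stat = 0.0
    for f, nf in feature_totals.items():
        for c, nc in class_totals.items():
            expected = nf * nc / n  # count expected under independence
            stat += (joint[(f, c)] - expected) ** 2 / expected
    return stat


def chi_squared_ordering(columns: Dict[str, Sequence], labels: Sequence) -> List[str]:
    """Order variables by decreasing chi-squared score with respect to the class."""
    scores = {name: chi_squared(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (assumed): V2 determines the class, V1 is independent of it.
labels = ["a", "a", "b", "b"]
columns = {"V1": [0, 1, 0, 1], "V2": [0, 0, 1, 1]}
ordering = chi_squared_ordering(columns, labels)

# Every chromosome of the initial population starts from the same ordering.
initial_population = [list(ordering) for _ in range(4)]
```

Since all individuals start identical, diversity in later generations comes from the mutation operator (crossover between identical parents returns the same ordering).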
5. Experiments and Results

This section initially describes the knowledge domains and the general structure of the system used in the experiments, as well as the experimental methodology adopted. The results from the experiments are then presented and analyzed.

Seven domains were used in the experiments: two well-known Bayesian Network domains (Engine Fuel System [] and Asia []), three benchmark problems (Balance, Congressional Voting Records (Voting) and Iris) from the UCI Repository [], and two artificial domains (Synth 1 and Synth 2) created for the experiments described in this section.

The artificial domains Synth 1 and Synth 2 were created using a sampling strategy applied to the Bayesian Classifiers described in Figure 4. Synth 1 describes a domain in which only one variable directly influences the class variable. Synth 2, however, describes a domain in which more than one variable directly influences the class variable. Table 1 summarizes the characteristics of the domains.

[Figure 4: two panels, "Synth 1" and "Synth 2"]
Fig 4. Bayesian Networks representing Synth 1 and Synth 2 domains. The graphical representations were created using GeNie Software [7].

Table 1. Domain description where: NA (Number of Attributes plus Class), NI (Number of Instances) and NC (Number of Classes).
[Columns: Asia, Engine, Iris, Balance, Voting, Synth 1, Synth 2; rows: NA, NI, NC; numeric entries not recoverable]

Intending to perform a more robust comparative analysis, besides presenting results obtained using VOGAC-PC and VOGAC+-PC, this section shows the performance of the traditional PC and K2 algorithms, as well as the VOGAK2 and VOGA+K2 methods [], using the seven domains.

The experiments were conducted according to the following steps:

1. Initially, using a 10-fold cross-validation strategy, both algorithms (K2 and PC) were executed using the original VO of each dataset, and the obtained Average Correct Classification Rates (ACCRs) were stored.
2. The same datasets used in step 1 were used with VOGAK2, VOGAC-PC, VOGA+K2 and VOGAC+-PC. The number of generations necessary to reach the solution was stored.

The results achieved by the four VOGA-based systems, as well as those by PC and K2, are presented in Table 2. Table 3 shows the number of generations used to reach the solution by each of the four VOGA systems.

Analyzing the results in Table 2, it is possible to infer that, as far as the ACCR values are concerned, VOGAC-PC and VOGAC+-PC obtained the best results in […] out of the 7 domains. It is interesting to notice, however, that in most of the domains (Asia, Engine, Voting and one of the Synth domains) all algorithms have very close ACCR values, making it difficult to identify the best one.
Table 2. Average Correct Classification Rates (ACCRs) (10-fold cross-validation). The best results for each domain are in bold face.
[Columns: Domain, K2, PC, VOGAK2, VOGAC-PC, VOGA+K2, VOGAC+-PC; rows: Asia, Engine, Iris, Balance, Voting, Synth 1, Synth 2; numeric entries not recoverable]

[Figure 5: two panels, "VOGAK2 structure" and "VOGAC-PC structure"]
Fig 5. Bayesian Classifier structure induced by VOGAK2 and VOGAC-PC using Iris domain. The graphical representations were created using GeNie Software [7].

Table 3. Number of generations needed until convergence.
[Columns: Domain, VOGAK2, VOGAC-PC, VOGA+K2, VOGAC+-PC; rows: Asia, Engine, Iris, Balance, Voting, Synth 1, Synth 2; numeric entries not recoverable]

Another aspect that should be mentioned is that VOGAC-PC and VOGAC+-PC had exactly the same performance in […] out of the 7 domains. Since in these domains the VOs found using VOGAC-PC and VOGAC+-PC were not the same, it is possible to conclude that different VOs can lead to the same ACCRs.

In addition, Table 2 reveals that, in the Iris and Balance domains, VOGAC-PC and VOGAC+-PC performed noticeably better than all the others (in the Balance domain this difference is even higher). In the Iris domain it is possible to observe that all PC-based algorithms performed better than their counterparts based on K2. To understand this behavior it is interesting to examine the network structures depicted in Figure 5. Observing Figure 5, it is clear that the structures found by VOGAC-PC and VOGAK2 are very different. Based on this fact it is possible to conclude that the PC algorithm performed better than K2 in this dataset (the skeleton defined by the PC algorithm probably had a great influence on the results).

Focusing on the Balance domain, VOGAC-PC and VOGAC+-PC outperformed their K2 counterparts as well as the traditional PC. Therefore it is possible to conclude that the VOs defined by these two algorithms were responsible for their better performance.

As far as the number of generations is concerned, it is possible to observe that VOGAC+-PC and VOGA+K2 did not reduce the effort necessary to find the best VOs. It is important to state, however, that in general the use of an initial population guided by the chi-square statistical test did not harm the ACCRs in any of the performed experiments. In one of the Synth domains, VOGAC+-PC improved the obtained ACCRs.
6. Conclusions and Future Work

This paper proposes to learn Bayesian Classifiers using the concept of conditional independence as well as the information given by a variable ordering. The idea has been implemented as two systems, VOGAC-PC and VOGAC+-PC. Both systems take advantage of knowledge about the class variable to optimize the GA-Bayes collaboration.

Experiments conducted in seven knowledge domains revealed that the proposed methods are promising. In all performed experiments VOGAC-PC and VOGAC+-PC produced more consistent results when compared with traditional BN learning algorithms. Based on this fact it is possible to state that the GA-Bayes collaboration is feasible even when using CI learning algorithms.

We intend to proceed along this line of investigation by experimenting with different GA parameters, different feature ranking metrics and data domains with more attributes.

Acknowledgements

The authors acknowledge the Brazilian research agencies FAPESP and CNPq for their financial support.

References
[1] Blake, C. L. and Merz, C. J. UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Department of Information and Computer Science.
[2] Campos, L. M. and Huete, J. F. Approximating causal orderings for Bayesian networks using genetic algorithms and simulated annealing. Proc. of the Eighth IPMU Conf., pp. […].
[3] Chickering, D. M. and Meek, C. On the incompatibility of faithfulness and monotone DAG faithfulness. Artificial Intelligence, pp. […].
[4] Chickering, D. M., Geiger, D., Heckerman, D. E. Learning Bayesian networks is NP-hard. Microsoft Res. Tech. Rep. MSR-TR-[…].
[5] Cooper, G. and Herskovits, E. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, pp. […].
[6] dos Santos, E. B. and Hruschka Jr., E. R. VOGA: variable ordering genetic algorithm for learning Bayesian classifiers. Sixth International Conference on Hybrid Intelligent Systems (HIS), pp. […].
[7] Druzdzel, M. J. SMILE: Structural Modeling, Inference, and Learning Engine and GeNIe: A development environment for graphical decision-theoretic models (Intelligent Systems Demonstration). In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI), pp. […], AAAI Press/The MIT Press, Menlo Park, CA.
[8] Goldberg, D. E. Genetic algorithms in search, optimization, and machine learning. Addison-Wesley.
[9] Hruschka Jr., E. R. and Ebecken, N. F. F. Towards efficient variables ordering for Bayesian networks classifier. Data and Knowledge Engineering, doi: […]/j.datak.[…].
[10] Hruschka Jr., E. R., Ebecken, N. F. F. Variable ordering for Bayesian networks learning from data. In Proceedings of the International Conference on Computational Intelligence for Modeling, Control and Automation (CIMCA), Vienna.
[11] Hsu, W. H. and Joehanes, R. Permutation genetic algorithms for score-based Bayesian network structure learning. In Proceedings of the International Conference on Computing, Communications and Control Technologies (CCCT), Austin, TX.
[12] Hsu, W. H. Genetic wrappers for feature selection in decision tree induction and variable ordering in Bayesian network structure learning. Information Sciences, pp. […].
[13] Larranaga, P., Kuijpers, C. M., Murga, R. H. and […]
[] Spirtes, P. and Meek, C. Learning Bayesian networks with discrete variables from data. KDD, pp. […].
[] Spirtes, P., Glymour, C. and Scheines, R. Causation, prediction and search. Springer-Verlag, Berlin.