VoIP Classification - IJIET

International Journal of Innovations in Engineering and Technology (IJIET)

VoIP Classification

Irengbam Tilokchan Singh
Department of Computer Science, Manipur University, Imphal, Manipur, India

Tejmani Sinam
Department of Computer Science, Manipur University, Imphal, Manipur, India

Thounaojam Rupachandra Singh
Department of Computer Science, Manipur University, Imphal, Manipur, India

Abstract - Human communication technology has changed dynamically with time. With emerging trends and technologies, users are shifting from traditional phone calls to VoIP (Voice over IP) applications such as Skype, Google Talk, Google+ Hangout and Asterisk. These VoIP applications generate a huge amount of network traffic, and this traffic needs proper classification: by Government agencies for security reasons, by ISPs or network operators for billing application-specific traffic, or by the network administrator of an institution to implement QoS or to monitor the network. Network traffic classification thus plays an important role in network security, network monitoring, QoS and traffic engineering. In this paper, we propose a novel approach to identify VoIP network traffic in the first few seconds of the initial state of communication. The proposed classifier works with Machine Learning techniques based on statistical features. The experimental results show that the proposed method can achieve over [...] accuracy for all testing datasets.

Keywords - VoIP Traffic Identification, Network Traffic Classification, Machine Learning

I. INTRODUCTION

In general, network traffic classification is a fundamental process for classifying network traffic and identifying the corresponding applications in modern network security systems, network monitoring, QoS and traffic engineering. Traditional traffic classification is based on application port mapping as assigned by IANA (Internet Assigned Numbers Authority), protocol format analysis and payload-based matching. Today, however, emerging applications use ephemeral, dynamic and random ports and encrypted payloads for obfuscation. The traditional methods of traffic classification (port based prediction and payload based deep inspection method) [ ]–[ ] are therefore no longer effective or efficient. Most researchers are moving away from these older techniques and adopting statistics-based classification techniques.

Several significant studies have previously been carried out on traffic classification based on Machine Learning (ML) [ ]–[ ]. Some focus on clustering techniques, which are unsupervised Machine Learning algorithms, and some are based on supervised Machine Learning methods, which train the classifier with known datasets. The method proposed in this paper is a hybrid approach that combines unsupervised and supervised methods.

With emerging trends and technologies, users are shifting from traditional phone calls to VoIP applications. These applications are mostly encrypted; some, like Skype, use P2P architectures and are able to traverse any network conditions. There is therefore a lot of interest among the research community, network operators and even Government agencies in identifying these applications. The VoIP applications considered in our study are Skype, Gtalk, Asterisk and Google+ Hangouts.

In the proposed method, media traffic flows of a particular application are gathered first. These flows are further split into sub-flows using sliding windows. The term sub-flow is defined as a subset of a flow having the same tuple (src ip, dst ip, src port, dst port and protocol) within a time-based window (of size [...] and [...] seconds) and is obtained by sliding windows. These windows overlap. Consider how sub-flows are obtained for a window size of 5 seconds: the 1st window runs from 0 seconds to 5 seconds, the 2nd window from 1 second to 6

Volume 7 Issue 2 August 2016






ISSN: 2319 – 1058


seconds, and the kth window from k-1 to k+4 seconds; the windows thus slide by 1 second and overlap by 4 seconds. The reasons for considering sub-flows are 1) for early classification of VoIP media traffic and 2) to enable the classifier to work in real time or on-the-fly. The security, monitoring and management systems need the information in advance, not as a post-mortem report.

From these sub-flows, we extract the statistical flow features. These flow features, consisting of number of packets, packet size, minimum packet size, maximum packet size, and the first and second order statistics over packet size and packet inter-arrival time (minimum, maximum, average and standard deviation), are obtained using overlapping sliding windows.

The traces used in our study were captured on the client side and at the edge of our University Network during [...]. The ground truth traces are categorized into two sets (training and testing). From these two sets, we extracted the feature datasets for each application. k-means clustering was then performed on each application dataset to group the training set for each application. The value of k is determined from the result of DBSCAN clustering. Lastly, we balance the training and testing sets by preserving the data proportionality of each cluster of each application.

In our study, we use four supervised Machine Learning classification algorithms, viz. Decision Tree (C4.5), Naive Bayes, Bayesian Belief and SVM [ ]–[ ]. First, we build different classifier models based on sub-flow statistics of [...] attributes (features) derived by varying the window size over [...] and [...] seconds. After analysing the classifier models with differing windows, we select C4.5 with the [...]-second window for further analysis, as this model gives better performance than the other models. Single class classifier models are also analysed with the [...]-second window based C4.5 classifier model. The results show that the attribute set has the potential to capture the characteristics of the applications. We also apply the selected attribute model to test Tstat traces [ ], [ ].
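The overlapping sub-flow windows described at the start of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `split_into_subflows`, the packet record layout, and the choice of half-open intervals are all hypothetical. The default 5-second window with a 1-second slide is taken from the worked example in the text.

```python
from collections import defaultdict


def split_into_subflows(packets, window=5.0, step=1.0):
    """Group packets by flow tuple, then cut each flow into overlapping windows.

    packets: iterable of
        (timestamp, src_ip, dst_ip, src_port, dst_port, proto, size)
    Returns {flow_key: [window_0, window_1, ...]}, where each window is a
    list of (timestamp, size) pairs. The record layout is an assumption.
    """
    flows = defaultdict(list)
    for ts, src_ip, dst_ip, src_port, dst_port, proto, size in packets:
        flows[(src_ip, dst_ip, src_port, dst_port, proto)].append((ts, size))

    subflows = {}
    for key, pkts in flows.items():
        pkts.sort()
        t0, t_end = pkts[0][0], pkts[-1][0]
        windows = []
        k = 0
        # Window k (0-based) covers [t0 + k*step, t0 + k*step + window).
        # Always emit at least one window, even for very short flows.
        while k == 0 or t0 + k * step + window <= t_end:
            lo = t0 + k * step
            hi = lo + window
            windows.append([(ts, sz) for ts, sz in pkts if lo <= ts < hi])
            k += 1
        subflows[key] = windows
    return subflows
```

With a 1-second slide and a 5-second window, consecutive windows share 4 seconds of traffic, which matches the overlap stated in the text. Emitting the first window as soon as 5 seconds of media traffic are available is what makes early classification possible.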



Figure [...]. Data Collection Architecture

The rest of this paper is organized as follows. Section [...] covers the related work, with more emphasis on statistics-based Internet traffic classification approaches. Section [...] outlines the data used and how they were collected. Section [...] describes the proposed methods. Section [...] provides a detailed analysis of our classification approach and its performance evaluation on the experimental traffic, and discusses the performance measures. Section [...] concludes the paper with some final remarks and suggestions for possible future work.

II. RELATED WORK

In the early days, network traffic was classified using knowledge of the IANA-assigned reserved port numbers. The deliberate use of ephemeral ports by newer applications, such as those using P2P, to bypass traffic classification has rendered port-based classification ineffective. Karagiannis et al. [ ] confirmed this by identifying P2P traffic with handcrafted signatures. Haffner et al. [ ] automated the construction of application signatures on trained sets by employing supervised machine learning techniques. Jeffrey Erman et al. [ ] showed that k-means clustering performs better than the DBSCAN and EM clustering algorithms. Jeffrey Erman et al. [ ] proposed a semi-supervised learning method utilizing k-means clusters. This study was based on flow statistics. k-means clustering has the drawback that the number of clusters must be assigned in advance, and the number of clusters to be formed cannot be predicted. So [...] [ ] employed X-means clustering in their work. Although X-means is basically equivalent to k-means, it does not require the number of clusters to be assigned in advance. Xiang Li et al. [ ] applied Support Vector Machine learning based on flow statistics to identify and classify network applications. Nowadays, researchers are increasingly attracted to the statistics-based approach as it does not involve packet payload data. Many researchers have attempted to build statistical classifier models based on full flow features. However, full flow feature approaches exhibit slower performance. Recently, researchers have begun to use statistical approaches based on sub-flow feature sets [ ], [ ]–[ ]. Packet size and inter-arrival time are the more effective measurable features for early classification of network flows, as shown in [ ], [ ], [ ].

Figure [...]. Testbed Laboratory

III. DATA COLLECTION

Network traces are collected from our testbed at the edge of our University Network (Figure [...]) and from publicly available traces of Tstat [ ], [ ]. The testbed is set up at the Network Security Lab, MU (Manipur University) (Figure [...]), where various VoIP application traces are generated. Using gt's [ ] method, we collect ground truth application traces. A Napatech data capture card (NT[...]E[...]STD) [ ] was used to capture traces on our log server at the edge of our University Network (Figure 1). A VoIP server is also running in the public domain, where traces of Asterisk-based VoIP applications are collected. We collected various types of Skype and non-Skype traces, such as voice, video, silence call, and calls within LAN and WAN.

Data were generated using VoIP clients such as Skype (Beta [...] version [...]), linphone [...] (Windows [...]), linphone [...] (Linux mint [...]), sipdroid [...] beta, Ekiga Softphone [...], CXPhone [...] (Windows), Gtalk in Google Chrome v[...], Gtalk in Google Chrome v[...], Gtalk with Empathy [...], Google+ Hangout and Asterisk [...] beta, using Android [...] (ICS), Android [...] (Jelly Bean), Windows [...], linux mint [...] and Ubuntu [...].

For the experiment, we were able to collect approximately [...] GB of VoIP traces, including [...] GB of Skype traces, spread over [...] months. Another [...] GB of anonymized Skype traces was obtained from Tstat. We only downloaded end-to-end Skype UDP traffic from Tstat. From these anonymized traces, we extracted approximately [...] GB of packet size data as derived from the header IP lengths.



Figure [...]. Application Feature Datasets extraction to gather training and testing datasets (proportionality sampling by using clustering)
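The proportionality sampling named in the caption above can be illustrated with a short sketch. It assumes cluster labels are already available for each data point (for example from k-means) and draws a sample in which each cluster keeps roughly its share of the data. The paper performs this step with a WEKA sampling procedure; the function `proportional_sample` below is a hypothetical plain-Python stand-in, and its rounding rule and the one-point minimum per cluster are assumptions.

```python
import random
from collections import defaultdict


def proportional_sample(points, labels, n_samples, seed=0):
    """Draw about n_samples points so each cluster keeps its share of the data.

    points: list of data points (any objects)
    labels: cluster label of each point, e.g. as produced by k-means
    """
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = len(points)
    sample = []
    for label in sorted(clusters):
        members = clusters[label]
        # Quota proportional to cluster size; keep at least one point so
        # that small clusters (e.g. a rarely used codec) are not dropped.
        quota = max(1, round(n_samples * len(members) / total))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample
```

Because of rounding, the returned sample can differ from `n_samples` by a few points when cluster shares are not exact multiples. Sampling per cluster rather than over the whole dataset keeps minority traffic patterns represented in the training set.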


IV. METHODOLOGY

[...] [ ]–[ ] even when a few beginning packets are lost. Our model aims to classify the traffic based on a few packets. We made use of media traffic generated by the VoIP applications Skype, Gtalk, Hangout and Asterisk. We extracted the statistical sub-flow features of the media traffic using overlapping sliding windows, as shown in Figure [...]. From the training datasets of Figure [...], we build the supervised Machine Learning classifier model. Figures [...] and [...] show the proposed machine learning traffic classification model building, the classifier for offline datasets, and the real-time traffic classifier with supervised Machine Learning. Our system goes through the following stages: 1) Training and testing datasets preparation, 2) Model building and 3) Experiments with Various Traffic Classifier.





Figure [...]. Training the Machine Learning traffic classifier and classification of offline datasets

Figure [...]. Real Time Traffic classifier with supervised Machine Learning



4.1. Training and Testing Dataset Preparation

We consider only the ground truth VoIP media flow traffic. Application-based feature dataset extraction is performed to gather the statistical information for the training and testing datasets. Our proposed system extracts the sub-flow statistical features of media traffic. The sub-flow information is extracted from the media flow using overlapping sliding windows. The statistical features considered in our study are given in Table I. The statistical sub-flow features of a sliding window consist of packet count, packet size, packet inter-arrival time and their derivative features: first order statistics (minimum, maximum and average) and second order statistic (standard deviation). Based on packet size, the attributes are further categorized into low and high to represent voice and video traffic respectively. We take these extracted and derived statistical features of the media traffic sub-flow and build the classifier signature model from them. Figure [...] shows the overall procedure for extracting the feature datasets for training and testing. Using k-means clustering, we group similar data points into clusters for each application, so that sampling of the training data points incorporates the data proportionality of the applications' divergent characteristics induced by the use of different codecs for different applications. The number of clusters (k) to be formed is obtained from the result of DBSCAN clustering. Sampling of the training data points is then carried out through the WEKA sampling procedure from the clustered data points of the application, incorporating the data proportionality of the application. For each application, a sample of [...] tuples is randomly selected from the original data. Altogether we use [...] tuples to build the model. The training dataset is shown in Table II. Similarly, testing data points are selected from the separate testing trace. Altogether the testing data consist of [...] data points contributed by the three applications.

4.2. Model Building

4.2.1. Determination of best classifier model and ?-second windows:

We build various classifiers based on the following machine learning algorithms: NB (Naive Bayes), BBN (Bayesian Belief Network), C4.5 and SVM (Support Vector Machine). All of these machine learning algorithms are implemented using WEKA [ ]. The sub-flows are overlapping windows sliding through the media flow. Sub-flow performance is compared across different sliding window sizes. Table III shows the results of testing the classifier models with window sizes of [...] and [...] seconds. From Table III, we can see that the C4.5 classifier is the best among the ML classifiers used in our experiments. The C4.5 classifier model on the [...]-second sliding window achieved the highest result of [...]. The C4.5 classifier algorithm with the [...]-second window is therefore determined to be the best classifier model and chosen for further studies. The precision and recall values for the [...]-second window based C4.5 classifier are shown in Table IV.

TABLE I. DESCRIPTION OF STATISTICAL FEATURE DATASET

No   Feature Description                                                        Abbreviation
1    Total number of packets in a window                                        P_num
2    Total number of bytes in a window                                          Total_P_size
3    Minimum packet size in a window                                            min_P_size
4    Maximum packet size in a window                                            max_P_size
5    Average packet size in a window                                            Ave_P_size
6    Standard deviation of packet size in a window                              std_P_size
7    Total number of packets in the low category                                lowP_num
8    Total number of bytes in the low category packets                          low_Total_P_size
9    Minimum packet size in a window in the low category packets                low_min_P_size
10   Maximum packet size in a window in the low category packets                Low_max_P_size
11   Average packet size in a window in the low category packets                low_Ave_P_size
12   Standard deviation of packet size in a window in the low category packets  low_std_P_size
13   Total number of packets in the high category                               high_P_num
14   Total number of bytes in the high category packets                         high_Total_P_size
15   Minimum packet size in a window in the high category packets               high_min_P_size
16   Maximum packet size in a window in the high category packets               high_max_P_size
17   Average packet size in a window in the high category packets               high_Ave_P_size
18   Standard deviation of packet size in a window in the high category packets high_std_P_size
19   Minimum of the inter-arrival time in a window                              min_time
20   Maximum of the inter-arrival time in a window                              max_time
21   Average of the inter-arrival time in a window                              Ave_time
22   Standard deviation of the inter-arrival time in a window                   std_time
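As an illustration of how the 22 attributes of Table I could be derived for a single sliding window, the sketch below computes them from a list of (timestamp, size) pairs. Several details are assumptions rather than facts from the paper: the low/high packet-size cut-off `SIZE_THRESHOLD` is a hypothetical value (the paper does not state one here), the standard deviation is taken as the population form, empty categories are reported as zeros, and the `low_` prefix is normalised where Table I writes `lowP_num` and `Low_max_P_size`.

```python
from statistics import mean, pstdev

# Hypothetical low/high cut-off in bytes; the paper does not give the value.
SIZE_THRESHOLD = 200

SIZE_NAMES = ("P_num", "Total_P_size", "min_P_size",
              "max_P_size", "Ave_P_size", "std_P_size")


def _size_stats(sizes, prefix=""):
    """Count, total, min, max, mean and std of a list of packet sizes."""
    if sizes:
        values = (len(sizes), sum(sizes), min(sizes), max(sizes),
                  mean(sizes), pstdev(sizes))
    else:
        values = (0, 0, 0, 0, 0.0, 0.0)  # assumption: empty category -> zeros
    return {prefix + name: value for name, value in zip(SIZE_NAMES, values)}


def window_features(window):
    """window: list of (timestamp, size) pairs sorted by timestamp."""
    sizes = [size for _, size in window]
    feats = _size_stats(sizes)
    feats.update(_size_stats([s for s in sizes if s < SIZE_THRESHOLD], "low_"))
    feats.update(_size_stats([s for s in sizes if s >= SIZE_THRESHOLD], "high_"))

    # Inter-arrival times between consecutive packets in the window.
    gaps = [b[0] - a[0] for a, b in zip(window, window[1:])]
    if gaps:
        feats.update(min_time=min(gaps), max_time=max(gaps),
                     Ave_time=mean(gaps), std_time=pstdev(gaps))
    else:
        feats.update(min_time=0.0, max_time=0.0, Ave_time=0.0, std_time=0.0)
    return feats
```

Each call yields one feature vector per window, so a media flow cut into overlapping windows produces one training or testing tuple per window position.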

   


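TABLE II. NUMBER OF DATASET SAMPLE POINTS

Model                                  Data sample points
Skype vs (Gtalk/Hangouts, Asterisk)    Skype [...], Other [...]
Asterisk vs (Gtalk/Hangouts, Skype)    Asterisk [...], Other [...]
Gtalk/Hangout vs (Skype, Asterisk)     Gtalk/Hangout [...], Other [...]

TABLE III. COMPARISON OF PERFORMANCE MEASUREMENT WITH DIFFERENT ML ALGORITHMS BASED ON [...] ATTRIBUTES WITH DIFFERENT WINDOWS SIZE (MULTI-CLASS CLASSIFIERS)

Window          C4.5     BBN      NB       SVM
[...] second    [...]    [...]    [...]    [...]
[...] second    [...]    [...]    [...]    [...]
[...] second    [...]    [...]    [...]    [...]
[...] second    [...]    [...]    [...]    [...]
[...] second    [...]    [...]    [...]    [...]

TABLE IV. PRECISION AND RECALL VALUE FOR [...]-SECOND WINDOW BASED C4.5 CLASSIFIER (MULTI-CLASS CLASSIFIERS)

Class       Precision    Recall
Asterisk    [...]        [...]
Skype       [...]        [...]
Gtalk       [...]        [...]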
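For reference, the per-class precision and recall reported in Table IV follow the standard definitions: precision is the fraction of windows predicted as a class that truly belong to it, and recall is the fraction of windows of a class that are predicted as such. The sketch below computes both from true and predicted labels. The function name `precision_recall` is hypothetical, and the paper obtains these measures from WEKA rather than from code like this.

```python
from collections import Counter


def precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} for a multi-class prediction."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    result = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        p_den = tp[cls] + fp[cls]
        r_den = tp[cls] + fn[cls]
        result[cls] = (tp[cls] / p_den if p_den else 0.0,
                       tp[cls] / r_den if r_den else 0.0)
    return result
```

A small usage example with made-up labels (not data from the paper): calling `precision_recall(["Skype", "Skype", "Gtalk"], ["Skype", "Gtalk", "Gtalk"])` gives Skype a precision of 1.0 and a recall of 0.5, and Gtalk a precision of 0.5 and a recall of 1.0.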





4.3 Experiments with Various Traffic Classifier.

4.3.1. Analysis of Single Class Classifier:

Experiments were carried out with a single class classifier for each of the applications (Skype, Gtalk, Hangouts and Asterisk). The model is trained with [...]-fold cross validation. To perform the experiments, we used the training dataset sample points given in Table II. This training dataset consists of [...] attributes obtained using [...]-second sliding windows. From Table V, we conclude that the C4.5 classifier performs best, giving more than [...] accuracy in all the single class classifiers. The experiment carried out in this approach uses all the [...] features, which comprise time-relevant and time-irrelevant features. In the next experiment, we discuss the performance measured with time-relevant and time-irrelevant attributes (features) and with different feature selection algorithms.

TABLE V. PERFORMANCE ACCURACY [...]

[...] [ ] and CON (CONsistency based feature selection) [ ] are both subset selection procedures based on best-first search methods, whereas CHI-SQUARE feature selection is based on ranking methods [ ].

TABLE VI. ATTRIBUTES SELECTED BY [...]

[...]

[ ] S. Sen, O. Spatscheck and D. Wang, “Accurate, scalable in-network identification of p2p traffic using application signatures,” in Proceedings of the [...]th International Conference on World Wide Web. New Y[...]
[ ] T. Sinam, I. T. Singh, P. Lamabam and N. N. Devi, “An efficient technique for detecting skype flows in udp media streams,” in Advanced Networks and Telecommunications Systems (ANTS), [...] IEEE International Conference, Dec. [...], pp. [...]–[...].
[ ] T. Sinam, I. T. Singh, P. Lamabam, N. N. Devi and S. Nandi, “A technique for classification of voip flows in udp media streams using voip signalling traffic,” in Advance Computing Conference (IACC), [...] IEEE International, Feb. [...], pp. [...]–[...].
[ ] T. Sinam, N. N. Devi, P. Lamabam, I. T. Singh and S. Nandi, “Early Detection of VoIP Network Flows based on Sub-Flow Statistical Characteristics of Flows using Machine Learning Techniques,” in Advanced Networks and Telecommunications Systems (ANTS), [...] IEEE International Conference, Dec. [...].
[ ] L. Grimaudo, M. Mellia, E. Baralis and R. Keralapura, “Select: Self-learning classifier for internet traffic,” IEEE Transactions on Network and Service Management, vol. [...], no. [...], pp. [...]–[...].
[ ] T. T. Nguyen and G. Armitage, “A survey of techniques for internet traffic classification using machine learning,” Commun. Surveys Tuts., vol. [...], no. [...], pp. [...]–[...], Oct. [...].
[ ] J. Chandrakant and D. Lokhande Shashikant, “Analysis of early traffic processing and comparison of machine learning algorithms for real time internet traffic identification using statistical approach,” in Advanced Computing, Networking and Informatics, Volume [...], ser. Smart Innovation, Systems and Technologies, M. Kumar Kundu, D. P. Mohapatra, A. Konar and A. Chakraborty, Eds. Springer International Publishing, vol. [...], pp. [...]–[...].
[ ] R[...]
[ ] J. M. Reddy and C. Hota, “P2p traffic classification using ensemble learning,” in Proceedings of the [...]th IBM Collaborative Academia Research Exchange Workshop, ser. I-CARE ’[...]. New Y[...]
[ ] “Napatech,” http://www.napatech.com
[ ] “Weka3.6.2,” http://www.cs.waikato.ac.nz/ml/weka
[ ] M. A. Hall, “Correlation-based feature selection for machine learning,” Department of Computer Science, The University of Waikato, Hamilton, New Zealand, Tech. Rep. [...].
[ ] M. Dash and H. Liu, “Consistency-based search in feature selection,” Artif. Intell., vol. [...], no. [...], pp. [...]–[...].
[ ] K. Pearson, “On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling,” Philosophical Magazine Series [...], pp. [...]–[...].
