CERIAS Tech Report 2004-12 DETECTION OF SETS OF EPISODES IN EVENT SEQUENCES: ALGORITHMS, ANALYSIS AND EXPERIMENTS by Robert Gwadera, Mikhail Atallah, Wojciech Szpankowski Center for Education and Research in Information Assurance and Security, Purdue University, West Lafayette, IN 47907-2086
Detection of Sets of Episodes in Event Sequences: Algorithms, Analysis and Experiments Mikhail Atallah
Robert Gwadera
Wojciech Szpankowski
Department of Computer Sciences Purdue University West Lafayette, Indiana
Department of Computer Sciences Purdue University West Lafayette, Indiana
Department of Computer Sciences Purdue University West Lafayette, Indiana
[email protected]
[email protected]
[email protected]
ABSTRACT
= > ; - ?
! " Æ
# $ % Æ & '!
( ! !
( ) "
Categories and Subject Descriptors * " + , - ../0121 3/0 1021 3/ .14/ 3/0 015 3 / 2 2 6 3 7 8 6/// 2/ /042 9Æ 6 : 8 , : - 3 * ; * "< 8
General Terms -
1. INTRODUCTION ; @ 5 2A ( ' @ A " -
"
% >
-
3 B " " 3
%
C
%
" 637 + 88:/ /=5/. " 637 + 88:/ /=5/. 6 :/ +)/4=.1./ -
7
"
%
" C D
@
A
3
"
@ C
A
' % >
@A
- &
@A
@
A
'! &
@ A "
- ! '
@ A
'
@
F
A %
> @A G @A
! %
(
& C
@ A
'
4 ) ' E
@A % @A
'
)
'
F
>
F
4
@ A F % '
%
% "
'!
C
H '
% '!
(
C
(
@BA
&
) "
' %
9 ' I @
@
I @
"
A
B
A "
%
A % "
@
! ) "
A
!
%
A
' I @
A
9
( ! I @
C I @
A
@
%
A
%
@A
@
!
A )
% " 4 % # $ % Æ ! ' > , Æ
@ A
%
@A
" @
A
( C
%
@ A +
4 ) '
4
@ A
" 4
C
F
F
F
G
+
0
J 9
C
( ! ! ( ( ) ... /// ( ' ! ! % ' 7 7 ! 7 4 C E
7
2 1 =
7
@ A
!
7 E
= B ! " # $ % !"#% ' C
F
* !
/ 0 2
A
' ' I @
% C
3 0 ! *
2.
A /
F
C
F
-
"
&'
( I @
%
A I @ A
I @ @
A (
'
@ A '!
( !
"
A "
"
A
I @
'
'
@A
@
%
F / A '
A @A
I @
- C (
@A
@BA
@ A F
( C I @
( '
A
I @
A
I @
'
B
C
'
A
( I @
8
I @ A
G
A ' A
( C I @
>
F F
MAIN RESULTS
+
%
9 "
.
K 3 &
!
5
B I @
% .
% '
@A
F
@A
C
@ A
,
7 "
F
% '
C
@ A @ A @ A @
A % @
2.1 Analysis of @A
@ A
D
@ A
@ A
@ A
F
@ A
F
@/ /A
%
A L F L
A @ A @ L A L !
@
@ L
A L F L
L A @ A @ L A L !
@
@
@ A
' B
@ @ A A
@ A L F L @ L L A @ A @ L L A L @F L A !
- ! @
@ A
@ A @ A (
@ A F '
A / '
@
@ @ A A
/
'
% !
% >
@A
@ AA
@ A F @ A
@ A ! @ A 7
@ A
@ A>
A
@A
F
@ A F
@ A @ AL
@ A @ AL @ @ A @ AA @ A / /
L
F
!
F @A L
% ! '
@ A
F
@ A F
@ A @ AL @ @ AA @ A / /
% ! @/ /A C
% ! @
@ / /A F
@/ A F /
@ /A F
@ / A F
@/ / /A F
F @ A
C
@A @A ! @A ! @ D
@ A
A
A
F > @ A @
A @ A @ A @ @ A
@ AA @ A F 7
@A
F @ A
@
A
L
%
!
/ / / /
G
!
@A
@ A G
@A
C
D
D
!
@ A !
! D
@ @AA
F
@ @ AA
6
!
@ A %
M %
@ A ,!
Æ
@A >
F
@ A 7 7
@ A D
@ A
@ A @ NA
@ A
%
!
& 7
@ A @ L A "
& 6
C @ ' A !
@ A
F @ A
F !"
F / , ! # @ A F !
F /
C
E
" @#@ AA L
" @# @/ / /AA F /
%
! !
%
" @#@ AA
L
" @# @ AA F # @ A
! !
@ A
F !
/ 6 ! !
, ! " ! ! %
( @ A F )
@ @AA F
@ @ "
AA
F @
D
*
@ A F
@
@ L L AA
! Æ %
0 % "
@A F @ A
A
I @
(
F @
A
A
! ! ! D
!
@ @AA
@ A @ A @ A ,! @ A
! !
7 0 7 2 % ! %
( @ A
@ A F
@ A
@ A
$ @ A F
F
L
F !
F ! "
F
F
*
2.2 Efficient algorithm for detection of a parallel episode
F ! "
F !!
@ A
@ A F
@ A
D
F F
& D ! F / ( 7 D
1
" "
, >
"
" ' > ! Æ %
@A
( ) Æ
@ @AA
(
@ A
6
> 7
$ @ A
@ A
%
$ @ A
>
@A
( ) %
@ @AA
>
@ A
0
"
%
>
"
" >
"
%
3 C
0 *
" "
D
!
L F /
F !
3
%
@ A
A F @ A @ @
A
9
"I @
% >
A '
' / A I @ A
I @
>
'
'
"
9 @
A
!>
L
% "
% "
@ A
A @ A
% ! ' I @
@ A
%
C
F L
!
%" @A %
0
%
2 * I @ A ( ) * $ F @ A
A I @ A "I @ A
I @
F
(
A @A
M ' ' I @
@A @A ) @ A
!
@ A % 51 I @ A
F F F
) @ A
@A L "
@A F ) @ A 9
%
@ /A '
%
3. EXPERIMENTS
2.3 Computation of threshold @A
% !
6
E
I @
% ( )
A
9
A F
I @
9 ;
&
8 3 * M
... /// 01 (
& F
01 ! .44
/
'
@ F A %
&
F
"&
F
@ A
@A @ @AA
I @
A
@ A
'
&
%
( )
&
F % !
F
"& L #$& & F "& L
@ L A
@ A @ @AA L L @ A
@ A @ @AA
& !
G
!" # !" $ %& # '" %( # !" )& & & *" + ) ",
"I @ A F
% C
F 01 (
%
9
"
∃
C
P (w) for a parallel episode S 1
" ;
0.9
! %
!
A @A F
@ A & ( " @ A @ A " F //O
@ A
I @
C
0
@A
0.8
0
@A " ( )
0.7
∃
probability of existance: P (w)
0.6 ∃
Pe(w) (estimated)
0.5
∃
P (w) (computed)
0.4
0.3
0.2
0.1
-
0
0
20
40
60
80 100 window size: w
! ' .14 // - 8LL D!
3.1 Estimation of @A and I @ A
!
F /
F / 1 / =/ A@ / A @ / A % @ A I @ A@ / A
@ A @ A %
% % 7 4 ! ' &
"
O
9
-
! ) "
( I @
I @
A 7 5
A
F
! F F
% H
!
9
A % @A A @A @ A % ' I@ / I@ /
'
@A @A " 0O " F OA %
@
( I @
I @
A 7 .
A
3.1.3 Comparison of the three cases: parallel, serial and serial !
@A ! @A > @
A 7 '
! 7
7
@ A
=/ A@ / A
F / 1 /
! I @
E
" % C %
@A
7 /
3.1.2 The case of a set of two serial episodes
' I @
@A
180
% @ A F @ A && & ' (&)*
F
160
!
@A
140
3.1.1 The case a parallel episode
120
% 7 =
7
% '
! C
B
@A F *@+ A
+
3.2 Threshold @A % !
@A
@A
(
F
(
F 1////
Ω∃(105, w) and E[Ω∃(105, w)] for a parallel episode S
4
x 10
P∃(w) for a set of two serial episodes S , S 1
0.9
8
0.8
7
0.7
probability of existance: P∃(w)
9
∃
6 ∃
5
Ω (10 , w) (actual)
5
E[Ω∃(105, w)] (computed)
4
P∃(w) (estimated)
0.2
1
0.1
20
40
60
80 100 window size: w
120
140
160
180
@A F @A L
A
"I @
'
"
!
"I @ A
F
@I @ A
I @ AA
/
/ (
"
0
20
40
60
80 100 window size: w
120
140
160
180
, @ A F @ A F & ' (&)* - ! " @ A @ A , @ A D @ A @ A F %
>
7 (
" @A
@ A
F
@ @A A @ @A @A A
@ @A @A A
(
! ( C
0
' ) "
1//// F 2/ 0/ 1
P (w) (computed)
0.4
2
0
e ∃
0.5
0.3
+ I @ A I @ A && & ' (&)*
0.6
3
0
2
1
5
number of windows: Ω (10 , w)
10
@A @A ! @ @A @A A
C
) "
" " 7
@ A
@ A
5. REFERENCES ) ; - (
-
@A
!
!
@A @A
; ) 6 7!
( # ///
- - ) - 8 C
4.
CONCLUSIONS AND EXTENSIONS
( ! !
@A
0 * B
7 !
Æ
2 D + ; + + ; : 7 P KQ
""Q
,
# ..5
( (
@ @AA !