Detection of Sets of Episodes in Event Sequences: Algorithms

Table walmart.Item-Scan: Visit-Nbr (Integer), Item-Nbr (Integer), Item-Quantity (Decimal(9,2)), Total-Scan-Amount (Decimal(9,2)), transaction-Date (Date Format).
CERIAS Tech Report 2004-12 DETECTION OF SETS OF EPISODES IN EVENT SEQUENCES: ALGORITHMS, ANALYSIS AND EXPERIMENTS by Robert Gwadera, Mikhail Atallah, Wojciech Szpankowski Center for Education and Research in Information Assurance and Security, Purdue University, West Lafayette, IN 47907-2086

Detection of Sets of Episodes in Event Sequences: Algorithms, Analysis and Experiments

Mikhail Atallah, Robert Gwadera, Wojciech Szpankowski

Department of Computer Sciences, Purdue University, West Lafayette, Indiana

ABSTRACT



 =    > ;    -  ?

            

 

                        

       





   

  !  "    Æ  

                                                    #         $ % Æ                                                &                  '!             

(     !  ! 

        ( )             "          



Categories and Subject Descriptors *   "    +  , - ../0121 3/0 1021 3/ .14/ 3/0 015  3 / 2 2   6   3 7  8  6/// 2/  /042   9Æ  6   :        8  ,   :       -    3   * ;  * "<   8

General Terms -

1. INTRODUCTION ;                                @   5  2A (          '    @          A    "       -                       

              "        

                                     %      >

-              

 

             

    3                                   B "    "                                     3  

%

              C   

%

           

"        637 +  88:/ /=5/. "        637 +  88:/ /=5/.  6 :/ +)/4=.1./  -           

     



    

7       

            "        

%  

           

      "   C       D

      @ 

                              A

3          

     "    

     

      @ C 

A



      ' %       >

@A

         

  -         &  

            

          

 @A

     @    

 A  



    '!         &

  @   A  "

           

   -  !      ' 

           

           



    @       A

          

 '           

@           

       

            

 

           



F

A % 

               

            

             

  > @A     G @A       

!   %      

   (            

  

 &           C

            

 @ A   

    '      



               4 )        '         E    

            @A %            @A

         '                

)       

   

 '  

      



F



     





 

             

   

             

           

      >



 F

             4

 @    A          F                      % '     

      %   

          

        

             

          

          

% "  

            

     



   

 

   '!    

       C   

        

 H          ' 



%           '!

(     

C

      (      

                  

     

          

        @BA

         &

 ) "          

'      %  

   9     ' I @

             

       

            

         



       @    



                

I @

            

"    

   



 A

  

    



 B        

 A   "       

                      

%                

     A %  "  

            @ 

   !       ) "

       A     

              

                

         ! 

         %  

 A

'            I @ 

 A

9              

(   !    I @

            

        C I @



    

 A 

          

            

    @

%        

  A

%      



 @A

@ 

!             

    A )     

              

  

% "               4        %                          #          $ %    Æ   !      '                   > ,          Æ                          

 



  @ A

 

%     





     @A      

       "           @        

A

(         C





            

 %            

@   A       +

 4 )  '  

  



      

          

  

4      

@   A    

  



     

"  4         

    C                     

    





F





 F

    





F

 G

+

0

  

      J 9    

     C

    

  

   

         

(          !    !       (      (  )         ...  /// ( '             !   !     %          '               7  7     !                7                     4    C               E          

7

                 2 1 =

7      

       

@  A 

       ! 

       

                   

                

         

       

  



 

   

7 E

 



 = B    ! "   #   $ % !"#% '    C    

 



    



F

    

            *      ! 

  

    / 0 2



 A

  '           ' I @

         %     C            

     3 0   !              *                   

2.



 

       A            /  

F

   

 C



F 



  

 

-       

     

    

  

   "   

 



&'

 (   I @



%       

 

          



   A    I @ A



    I @    @







  A (      

  '           

        @    A        '!     

              (        !

    "  

 A   "   

    "     

 A

I @

  

   '       

           ' 

    @A 







   

@

% 

F / A  ' 

    

 A   @A 

I @





 

-                    C              (    

 

 @A

    @BA              

@ A     F       

 



        



(         C I @

     (   '       

 A



  I @



 A

I @

       

        '   

 



       



     

B   

     C



  

 '             

       



    

 A

(          I @

8 

    

I @  A 

 G

 A    '      A

(    C I @

      >

         



          

  

                 F          F       

               

        

   



2. MAIN RESULTS

+   

  

%           

9  "          

 



    

.        

 





      

K   3 &

       

 !

  

               

5            



          



            





B         I @

%                   .

% '        

 @A

F

  @A

        C



           

 @          A 

               

            

         

  ,    



 

  7  "        



         F

      %    ' 



 

   



           

  



          

 C         



         

 @    A             @   A    @   A        @ 

       A %            @        

2.1 Analysis of P∃(w)

           



@    A 

D

  

 



      

    

 @    A 

 

@ A         

         

 @    A



F 

@ A



F

 







@/ /A  

%    



  A         L  F   L  

  A            @   A  @ L   A   L  !    

 @

@ L

  A          L  F   L  

  L A           @   A  @   L A     L  !    

 @

@

 @    A

 '           B      

@  @    A A 



 @   A             L  F   L   @ L   L A           @   A  @ L   L A   L  @F   L A !    

-   ! @

           

 @    A     

 @    A @   A (  

 @    A     F      ' 

    

 

                                                              

  A /      '

        @

  

@  @    A A

        

 



    

 /

 



 

      '  

% ! 

%           >

 @A

 @    AA

@ A F @  A

 @    A      !        @    A 7

    @    A



@    A> 



  A

     @A

  F     

 @    A F

@  A  @     AL

@   A  @     AL  @ @  A @   AA @     A    /     /





  L 

  F 







   !   

 F   @A  L     

%                     !      ' 

@ A



  F     

 @    A F

@  A  @     AL  @ @  AA @     A    /     /

%   ! @/ /A  C

%      ! @

 @ / /A F

 @/    A F /

 @    /A F

 @  /  A F

 @/ / /A F

  F @  A



   

     C



 @A   @A        !    @A       !   @  D

  @ A    



      A

  A

    

                F          >                @  A @

    A                  @   A @     A            @ @  A

@   AA  @     A    F                 7 

 

 @A



  



  F @   A

@



  A

   

      L 

  %       

       



       ! 

/     /  /   /

  

   G       

      !

  @A   

 @   A   G   

   

           

 

   

  @A

    C

D 

D



 

 !

      



   @   A    ! 

 ! D

@ @AA

F

@ @ AA     

  

           



6 



        

      !   



@           A % 

   

 

M  %         

 @    A ,!  

     Æ      

 @A >   

F

         

@ A    7   7       



   







                

   

    

 @    A D



@ A

@ A   @ NA   

    

  @  A       







   

   

      %      





   

     !

    

 







 & 7

 @     A   @ L         A  "                  











         &   6





   

   

    C @ ' A  !  

 



    

 

@ A  

  F @ A

           

 





      









          

 F  !"      





F /    ,  !     # @         A   F   !  

         F /                            

C



    

  









 

            E 



 

 



 





  "  @#@         AA   L

 

"  @# @/ /     /AA F /





%

 !    ! 



%

  " @#@   AA

 L

  

" @# @      AA F # @      A    

!   ! 

   

@ A

 F  !      

/ 6   ! !

,  !       "     ! ! %    



   

(       @ A              F     )    



@ @AA F

@ @ "

AA

  F  @



D

   

*          

 @    A F



@





 

   

@ L      L AA

  !     Æ         %  

 

0       %  "  

 @A F     @ A  

            

 A

    I @

(  

  F @ 

A

         





             

  

             



   

   



A 

! !    ! D

  





          !  



@ @AA



 

    





    @  A   @  A         @    A ,!     @  A   

 !   !  

7 0   7 2 % !      %      

  (       @  A                      

@ A F





@ A

     



     

    

   

   







    

 

  @  A



   





$ @    A F

  

 



  









  



 

     

F 



     



  L  

        



 



 



F !





  F   ! "

F

 





F

  

*    

              

   



        

 



   

2.2 Efficient algorithm for detection of a parallel episode

  



  



    



   

   

  

  







  









  







  F   ! "





F !!









  @  A

 



 

  

  





   

 

  





              

          

 @ A F

@ A



 





 





        



 D

 



   

F                F                  

 &      D !  F /                    (           7 D

  

  

 

1   



                "            "





,              >

   

   "  

  

      

"   '  >   !    Æ         %  

 @A

 ( )          Æ

@ @AA

                  

        

     (       

 @ A

         

         6  



 > 7     



$  @    A

 @    A

%  



  

$  @    A



  

        

  >



 @A

   ( )      %    

@ @AA



>

@  A  

  0   

  "        



        % 

 

         

    

  >

   "   

" >



 "      

   



   

   



%  

     

3  C           

 0 *                         



      "  "      





 D





 

        !  

                                      L  F /



 F     ! 

3        

      



  

         





%   







 @  A    

 A   F                         @  A       @ @

A 



         9  

     " I @

%       >

 A  '

'  /  A    I @  A    

             I @

 >

  



  



              '    



    '  

        "  



 

       



 

          

9                                            @

A 

!  >



   L  

     

     

%    "

        

   %    "



@ A

 A  @  A 

%  !  ' I @



    

@ A

      

%          

        C   





    F  L  

         !  

 

%" @A        %        

  

  

0

%



 2 *   I @ A    (   )  *                $       F @ A  

   



 A I @ A   " I @  A

I @



F







(









  A  @A  

M    '   '  I @          

   

    @A     @A     ) @ A

  

       

        !  

   @   A     % 51             I @  A

  

   

F F F

) @ A



 @A L         "  



     @A    F ) @  A 9         

%   



  

@    /A            '

     

%   

         



3. EXPERIMENTS

2.3 Computation of threshold  @A

%     !      

6            

           E   

I @

     %   ( ) 



 A

9   

 A F



I @

 



          9     ;

&

   8 3  * M 

...  ///  01   (      



  & F  

         01  !  .44

        

        

/



 





          '  

 @ F A %      

& 

F

" & 

F

 

 @ A 

 @A @  @AA  

        I @

 A   





 @  A

'              

&

%

         ( )        









    

 

&

F  %       !  

F 

" &   L    #$ &   &   F " & L    

@  L A

   @ A @  @AA L       L @ A

   @ A @  @AA    

     

&  !

   

G         

    

 

 

 

     !"  #    !"     $  %& #    '" %( #    !" )&  & &  *" +    )  ",

" I @  A F



%    C



F 01 (   

         %     

                       

9

"



         C    

Figure: P∃(w) for a parallel episode S. Horizontal axis: window size w. Vertical axis: probability of existence P∃(w). Curves: P∃(w) (estimated) and P∃(w) (computed).
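The plot labels here distinguish an estimated from a computed P∃(w) for a parallel episode. As a rough illustration of what a computed value can look like, the sketch below assumes a memoryless (i.i.d.) source, an episode made of distinct symbols, and that a window "contains" the parallel episode when every episode symbol appears in it in any order. Under those assumptions inclusion-exclusion gives a closed form. The function names, the symbol probabilities, and the closed form itself are illustrative assumptions, not the derivation used in the paper, which is not recoverable from this text.

```python
from itertools import combinations, product


def parallel_episode_probability(probs, episode, w):
    """Probability that a length-w window drawn from an i.i.d. source
    contains every symbol of `episode` at least once (any order).

    probs   -- dict mapping each alphabet symbol to its probability (assumed)
    episode -- iterable of distinct symbols (assumed distinct)
    """
    symbols = list(episode)
    total = 0.0
    # Inclusion-exclusion over the subset of episode symbols that is absent.
    for k in range(len(symbols) + 1):
        for missing in combinations(symbols, k):
            p_missing = sum(probs[s] for s in missing)
            total += (-1) ** k * (1.0 - p_missing) ** w
    return total


def brute_force_probability(probs, episode, w):
    """Same quantity by enumerating every window; only usable for tiny w."""
    wanted = set(episode)
    total = 0.0
    for window in product(probs, repeat=w):
        if wanted <= set(window):
            p = 1.0
            for s in window:
                p *= probs[s]
            total += p
    return total
```

The brute-force version plays the role of a check on the closed form; for realistic window sizes a simulation over sampled windows would replace it.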

 !     '  .14        //       -        8LL    D! 

  

3.1 Estimation of P∃(w) and Ω∃(n, w)



  !          

 F /

  

 

 F / 1 /    =/           A@ /  A @                / A %   @ A    I @  A@ /   A 

   @ A       @ A   % 

      %  %      7 4     !    '   &

"

  O

9                                      



-  

                    !     ) " 

 (       I @ 

 I @

 A  7 5

 A

  

F

  !   F F



                                               

     

 %   H 

              

                   !

9          



 A    %   @A  A    @A        @ A        %    ' I@ /    I@ /

' 

 @A   @A  "   0O   " F OA %    

          @





            

  



(       I @



  I @

 A  7 .

 A 

3.1.3 Comparison of the three cases: parallel,     serial and serial   !         

@A   !  @A        >        @  

   A      7  '         

           !  7                     



   7 

 @ A

      =/  A@ /  A  

F / 1 / 

   !    I @

    

             E 

    "    %   C          %   

 @A



  7 /        

3.1.2 The case of a set of two serial episodes     

 

 

        ' I @

 @A 


 %  @ A F   @ A   && &    '   (&)* 



F                        




          

  !  

 @A


               

3.1.1 The case of a parallel episode






%      7 =      

  7 

% '       

           



     

        !     C  







B          

 @A F *@+ A  

  

+  

3.2 Threshold  @A % !           

 @A

      

    

 @A







   

   

     ( 

F                         

              





(    

F 1////    

 

Figure: Ω∃(10^5, w) and E[Ω∃(10^5, w)] for a parallel episode S. Horizontal axis: window size w. Vertical axis: number of windows Ω∃(10^5, w), in units of 10^4. Curves: Ω∃(10^5, w) (actual) and E[Ω∃(10^5, w)] (computed).

Figure: P∃(w) for a set of two serial episodes. Horizontal axis: window size w. Vertical axis: probability of existence P∃(w). Curves: P∃(w) (estimated) and P∃(w) (computed).
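For the set of two serial episodes named in the plot title, a computed P∃(w) can be sketched with a small dynamic program. The sketch assumes a memoryless (i.i.d.) source and that a window "contains" a serial episode when the episode appears in it as a subsequence (symbols in order, gaps allowed). It tracks, for each episode, how long a prefix has been matched greedily, and accumulates the probability that at least one episode completes within w symbols. All names and the example probabilities are hypothetical, and this is not claimed to be the paper's own formula.

```python
from itertools import product


def contains_serial(window, episode):
    """True if `episode` occurs in `window` as a subsequence."""
    it = iter(window)
    return all(sym in it for sym in episode)


def prob_any_serial_episode(probs, episodes, w):
    """Probability that a length-w i.i.d. window contains at least one
    of the (non-empty) serial `episodes` as a subsequence."""
    start = tuple(0 for _ in episodes)
    dist = {start: 1.0}   # mass over tuples of matched-prefix lengths
    done = 0.0            # mass of windows that already contain an episode
    for _ in range(w):
        nxt = {}
        for state, mass in dist.items():
            for sym, p in probs.items():
                # Advance each episode's greedy match if the symbol fits.
                new = tuple(j + 1 if ep[j] == sym else j
                            for ep, j in zip(episodes, state))
                if any(j == len(ep) for ep, j in zip(episodes, new)):
                    done += mass * p
                else:
                    nxt[new] = nxt.get(new, 0.0) + mass * p
        dist = nxt
    return done


def brute_force_any_serial(probs, episodes, w):
    """Same quantity by enumerating all windows; only for tiny w."""
    total = 0.0
    for window in product(probs, repeat=w):
        if any(contains_serial(window, ep) for ep in episodes):
            p = 1.0
            for s in window:
                p *= probs[s]
            total += p
    return total
```

Greedy prefix matching is sufficient here because a sequence contains an episode as a subsequence exactly when the leftmost embedding succeeds, so one counter per episode captures all the state that matters.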

 @A F  @A L

        



   





 A         

    " I @

  '      

"

!

" I @  A 

F

   @I @  A

I @ AA

/



           /    (            

 

 "      



       

  




 

 ,  @ A F   @ A    F       &   '   (&)*  -  !    "         @  A   @  A                    , @ A D @ A           @ A    F         %     

          >

    

7      (

        "  @A   

    

 

    





 @ A

F



@  @A  A @  @A    @A  A   

@  @A    @A   A

(   

   !    (  C      

0

 '   ) "       

 1////          F 2/ 0/ 1     


 + I @  A  I @  A   && &   '   (&)* 



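The experiments compare the actual number of windows Ω∃(n, w) with its computed expectation. A minimal sketch of the counting side follows, assuming Ω∃(n, w) is the number of length-w windows of a length-n sequence that contain the parallel episode, and assuming the sequence has n - w + 1 such windows (the paper's exact window convention is not recoverable from this text). The sliding-window counter and the helper names are illustrative, not the algorithm of Section 2.2.

```python
from collections import Counter


def count_windows_with_parallel_episode(sequence, episode, w):
    """Count length-w windows of `sequence` that contain every symbol of
    `episode` (with multiplicity, in any order), in one left-to-right pass."""
    need = Counter(episode)   # required multiplicity of each episode symbol
    have = Counter()          # multiplicities inside the current window
    satisfied = 0             # episode symbols whose requirement is met
    hits = 0
    for i, sym in enumerate(sequence):
        if sym in need:
            have[sym] += 1
            if have[sym] == need[sym]:
                satisfied += 1
        if i >= w:
            old = sequence[i - w]          # symbol leaving the window
            if old in need:
                if have[old] == need[old]:
                    satisfied -= 1
                have[old] -= 1
        if i >= w - 1 and satisfied == len(need):
            hits += 1
    return hits


def expected_window_count(n, w, p_exists):
    """Expected number of hit windows by linearity of expectation, assuming a
    stationary source and n - w + 1 windows."""
    return max(n - w + 1, 0) * p_exists
```

For example, `count_windows_with_parallel_episode("abcab", "ab", 2)` inspects the windows `ab`, `bc`, `ca`, `ab` and counts the two that contain both symbols.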

 @A             @A  !    @  @A    @A  A 

     

   C   

) "           

 "          "  7            

 @ A

 @ A

                        

5. REFERENCES  ) ; - ( 

         -    

 @A

 ! 

!   



  

         

@A      @A 

 ;   ) 6  7!  

        

(   #  ///

 - -   ) -   8  C     

4. CONCLUSIONS AND EXTENSIONS

(    !        !  

 @A 

0 * B 



7      !

      Æ  

2 D + ; + + ;  : 7   P KQ

""Q

 ,    

#   ..5

              (             (    

@ @AA         ! 

 

    
