On the Expected Value and Distribution Function of the First Exit Time for the Pólya-Aeppli Process
Selen Cakmakyapan and Gamze Ozel
Department of Statistics, Hacettepe University, 06800, Beytepe, Ankara, Turkey
Citation: AIP Conference Proceedings 1558, 1446 (2013); doi: 10.1063/1.4825790
View online: http://dx.doi.org/10.1063/1.4825790
View Table of Contents: http://aip.scitation.org/toc/apc/1558/1
Published by the American Institute of Physics
Abstract. The Pólya-Aeppli process is a particular case of the classical compound Poisson process in which the contribution of each term follows the geometric distribution. It is used to describe clustered data, since the Poisson process is insufficient for clustering of events. In this study, the distribution function and expected value of the first exit time are derived for the Pólya-Aeppli process. Then, an application based on traffic accidents in Groningen is given, and expected times are obtained for some time-independent boundaries using the R project.
Keywords: First exit time, Pólya-Aeppli process
PACS: 02.60.Lj, 45.10.-b, 05.20.-y
INTRODUCTION
The Pólya-Aeppli process, or geometric Poisson process, is a special case of the compound Poisson process that was derived by Getis [1] to model clustered point processes. Results have since been reported in areas such as accident statistics, ecology, radiobiology, quality control, telecommunications, and other disciplines ([2], [3], [4]). In all of these fields the common feature is a collection of data consisting of counts of occurrences falling within discrete classes or intervals. Applications of compound Poisson processes often run into the obstacle of numerical evaluation of the corresponding probability functions [5]. Özel and Inal derived the explicit probability function of the compound Poisson process [4].

Let $\{N_t, t \ge 0\}$ be a homogeneous Poisson process with parameter $\lambda$ and let $Y_i$, $i = 1, 2, 3, \ldots$, be geometrically distributed random variables with parameter $\theta$, independent of $\{N_t\}$. Then $\{X_t, t \ge 0\}$ is called a Pólya-Aeppli process and is defined by $X_t = \sum_{i=1}^{N_t} Y_i$. The probability function of $X_t$ is given by

$$p_{X_t}(k) = \sum_{n=1}^{k} e^{-\lambda t} \frac{(\lambda t)^n}{n!} \binom{k-1}{n-1} \theta^n (1-\theta)^{k-n}, \quad k = 1, 2, 3, \ldots \tag{1}$$
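For small $k$, Eq. (1) can be evaluated term by term. The Python sketch below does this; the function name `polya_aeppli_pmf` and the parameter values are illustrative assumptions, not part of the original study. The $k = 0$ case is not covered by Eq. (1) and is taken here as $e^{-\lambda t}$, the probability of no Poisson event in $(0, t]$.

```python
from math import comb, exp, factorial

def polya_aeppli_pmf(k, lam, theta, t=1.0):
    """P(X_t = k) from Eq. (1); k = 0 is handled separately as e^{-lam*t}."""
    if k < 0:
        return 0.0
    if k == 0:
        return exp(-lam * t)
    total = 0.0
    for n in range(1, k + 1):
        # Probability of exactly n Poisson events in (0, t].
        poisson_n = exp(-lam * t) * (lam * t) ** n / factorial(n)
        # Probability that n geometric jumps (each >= 1) sum to exactly k.
        jumps_sum_to_k = comb(k - 1, n - 1) * theta ** n * (1 - theta) ** (k - n)
        total += poisson_n * jumps_sum_to_k
    return total

# Illustrative (assumed) parameters; the sum is truncated at k = 59.
probs = [polya_aeppli_pmf(k, lam=2.0, theta=0.6, t=1.0) for k in range(60)]
total_mass = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
print(total_mass, mean)
```

With these parameters the truncated probabilities should sum to essentially one, and the mean should be close to $\lambda t / \theta$, the product of the expected number of events and the expected jump size.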
It is not easy to obtain the explicit probabilities from Eq. (1). Therefore, for counting distributions satisfying the relation $p_{N_t}(n) = \frac{\lambda t}{n}\, p_{N_t}(n-1)$, $n = 1, 2, 3, \ldots$, a recursive formula was proved by Panjer as

$$p_{X_t}(0) = e^{-\lambda t (1 - p_Y(0))} \quad \text{and} \quad p_{X_t}(k) = \frac{\lambda t}{k} \sum_{i=1}^{k} i\, p_Y(i)\, p_{X_t}(k-i), \quad k = 1, 2, 3, \ldots,$$

where $p_Y(y)$ is the common probability function of $Y_i$, $i = 1, 2, 3, \ldots$ [6]. Since Panjer's formula is based on a recursive scheme, it causes difficulties in computation time and computer memory for large values of $k$ [7].
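The Panjer recursion above can be sketched in a few lines of Python. The function names and the parameter values are assumptions made for illustration; the geometric jump distribution is taken on $\{1, 2, \ldots\}$ so that $p_Y(0) = 0$.

```python
from math import exp

def panjer_compound_poisson(lam, t, p_y, k_max):
    """Return [p_X(0), ..., p_X(k_max)] using the Panjer recursion.

    p_y(i) is the jump probability function P(Y = i) for i = 0, 1, 2, ...
    """
    p_x = [exp(-lam * t * (1.0 - p_y(0)))]
    for k in range(1, k_max + 1):
        s = sum(i * p_y(i) * p_x[k - i] for i in range(1, k + 1))
        p_x.append(lam * t / k * s)
    return p_x

def geometric_pmf(theta):
    """Geometric jumps on {1, 2, ...}: P(Y = i) = theta * (1 - theta)**(i - 1)."""
    return lambda i: theta * (1.0 - theta) ** (i - 1) if i >= 1 else 0.0

# Illustrative (assumed) parameters.
p_x = panjer_compound_poisson(lam=2.0, t=1.0, p_y=geometric_pmf(0.6), k_max=60)
print(p_x[:4])
```

Each new value $p_{X_t}(k)$ needs all previous values, so the loop costs on the order of $k^2$ operations and keeps $k + 1$ numbers in memory, which is the cost referred to in the text for large $k$.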
11th International Conference of Numerical Analysis and Applied Mathematics 2013 AIP Conf. Proc. 1558, 1446-1449 (2013); doi: 10.1063/1.4825790 © 2013 AIP Publishing LLC 978-0-7354-1184-5/$30.00
Let $Y_i$, $i = 1, 2, 3, \ldots$, be i.i.d. discrete random variables with the probabilities $P(Y_i = j) = p_j$, $j = 0, 1, 2, \ldots$ The common probability generating function (pgf) of $Y_i$, $i = 1, 2, 3, \ldots$, is given by

$$g_Y(s) = \sum_{j=0}^{\infty} p_j s^j = p_0 + p_1 s + p_2 s^2 + \ldots$$

and the pgf of $X_t$ is given by

$$g_{X_t}(s) = \sum_{n=0}^{\infty} e^{-\lambda t} \frac{(\lambda t)^n}{n!} [g_Y(s)]^n = e^{-\lambda t} \left[ 1 + \frac{\lambda t\, g_Y(s)}{1!} + \frac{(\lambda t\, g_Y(s))^2}{2!} + \ldots \right] = e^{\lambda t [g_Y(s) - 1]}. \tag{2}$$
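As a worked instance of Eq. (2), suppose the jumps are geometric on $\{1, 2, \ldots\}$ with $P(Y_i = j) = \theta(1-\theta)^{j-1}$, so that $p_0 = 0$. This choice of support is an assumption made here for illustration. The pgf then has a closed form, and its value at $s = 0$ and its derivative at $s = 1$ give the probability of zero and the mean of $X_t$:

```latex
\begin{align*}
g_Y(s) &= \sum_{j=1}^{\infty} \theta (1-\theta)^{j-1} s^j
        = \frac{\theta s}{1 - (1-\theta)s}, \qquad |s| < \frac{1}{1-\theta}, \\
g_{X_t}(s) &= \exp\left\{ \lambda t \left[ \frac{\theta s}{1 - (1-\theta)s} - 1 \right] \right\}, \\
g_{X_t}(0) &= e^{-\lambda t}, \qquad
E[X_t] = g'_{X_t}(1) = \lambda t \, g'_Y(1) = \frac{\lambda t}{\theta}.
\end{align*}
```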
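Using Eq. (2), the explicit probability function of the compound Poisson process was derived by Ozel and Inal [8] for the Pólya-Aeppli process with the help of the definition $\lambda_j = \lambda p_j$, $j = 0, 1, 2, \ldots$ Differentiating the pgf in Eq. (2) and evaluating at $s = 0$ as

$$P(X_t = 0) = g_{X_t}(0), \qquad P(X_t = k) = \frac{1}{k!} \left. \frac{\partial^k g_{X_t}(s)}{\partial s^k} \right|_{s=0}, \quad k = 1, 2, 3, \ldots,$$

after some algebraic manipulations, a general formula is obtained for the Pólya-Aeppli process as follows:

$$\begin{aligned}
P(X_t = 0) &= e^{-\lambda t} \\
P(X_t = 1) &= e^{-\lambda t}\, \frac{(\lambda_1 t)}{1!} \\
P(X_t = 2) &= e^{-\lambda t} \left[ \frac{(\lambda_1 t)^2}{2!} + \frac{(\lambda_2 t)}{1!} \right] \\
P(X_t = 3) &= e^{-\lambda t} \left[ \frac{(\lambda_1 t)^3}{3!} + \frac{(\lambda_1 t)(\lambda_2 t)}{1!\,1!} + \frac{(\lambda_3 t)}{1!} \right] \\
P(X_t = 4) &= e^{-\lambda t} \left[ \frac{(\lambda_1 t)^4}{4!} + \frac{(\lambda_1 t)^2 (\lambda_2 t)}{2!\,1!} + \frac{(\lambda_1 t)(\lambda_3 t)}{1!\,1!} + \frac{(\lambda_2 t)^2}{2!} + \frac{(\lambda_4 t)}{1!} \right]
\end{aligned} \tag{3}$$

Let us point out that, to obtain the probabilities of the Pólya-Aeppli process, $\lambda_j$ in Eq. (3) is defined as $\lambda_j = \lambda\theta(1-\theta)^{j-1}$, $j = 1, 2, 3, \ldots$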
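The explicit expressions in Eq. (3) can be checked numerically against the sum in Eq. (1). The Python sketch below writes out the first five probabilities term by term and compares them; the function names are hypothetical, and the parameter values are borrowed from the application section purely for illustration.

```python
from math import comb, exp, factorial

def pmf_eq1(k, lam, theta, t):
    """P(X_t = k) from the sum in Eq. (1), with P(X_t = 0) = e^{-lam*t}."""
    if k == 0:
        return exp(-lam * t)
    return sum(
        exp(-lam * t) * (lam * t) ** n / factorial(n)
        * comb(k - 1, n - 1) * theta ** n * (1 - theta) ** (k - n)
        for n in range(1, k + 1)
    )

def pmf_eq3(lam, theta, t):
    """P(X_t = k) for k = 0..4, written out term by term as in Eq. (3)."""
    # a[j] = lambda_j * t with lambda_j = lam * theta * (1 - theta)**(j - 1); a[0] is unused.
    a = [0.0] + [lam * theta * (1 - theta) ** (j - 1) * t for j in range(1, 5)]
    e = exp(-lam * t)
    return [
        e,
        e * a[1],
        e * (a[1] ** 2 / 2 + a[2]),
        e * (a[1] ** 3 / 6 + a[1] * a[2] + a[3]),
        e * (a[1] ** 4 / 24 + a[1] ** 2 * a[2] / 2 + a[1] * a[3] + a[2] ** 2 / 2 + a[4]),
    ]

lam, theta, t = 9.84, 0.62, 1.0  # illustrative values
explicit = pmf_eq3(lam, theta, t)
for k, p in enumerate(explicit):
    print(k, p, pmf_eq1(k, lam, theta, t))
```

The two columns printed for each $k$ should agree up to floating-point rounding.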
THE DISTRIBUTION FUNCTION OF THE FIRST EXIT TIME FOR THE PÓLYA-AEPPLI PROCESS
Now consider a time-independent boundary $\beta$. Then the first exit time $T$ is defined as

$$T = \inf\{t : X_t \ge \beta\} \tag{4}$$

where $0 \le t < \infty$ and $\beta > 0$. $T$ can be described as the first instant at which a sample path crosses (jumps over) the boundary $\beta$. The Laplace-Stieltjes transforms of the distribution function of $T$ with positive jumps were given by Bar-Lev et al. [9] for the compound Poisson process $\{X_t, t \ge 0\}$ where $Y_i$, $i = 1, 2, \ldots$, were continuous random variables.
However, the explicit formula for the distribution function of the first exit time has not been obtained for the case where $Y_i$, $i = 1, 2, \ldots$, are discrete random variables. Let $\{N_t, t \ge 0\}$ be a homogeneous Poisson process with parameter $\lambda > 0$ and let $Y_i$, $i = 1, 2, 3, \ldots$, be i.i.d. discrete random variables with values $j = 0, 1, 2, \ldots$ Then the distribution function of $T$ is

$$F_T(t) = P(T \le t) = P(X_t \ge \beta) = 1 - P(X_t \le \beta - 1) = 1 - \sum_{k=0}^{\beta - 1} p_{X_t}(k) = 1 - F_{X_t}(\beta - 1) \tag{5}$$
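where $F_{X_t}(k)$ is the distribution function of $X_t$. Then the cumulative probabilities of $X_t$ are obtained as

$$\begin{aligned}
F_{X_t}(0) &= e^{-\lambda t (1 - p_0)} \\
F_{X_t}(1) &= F_{X_t}(0) + F_{X_t}(0)\, \frac{(\lambda_1 t)}{1!} \\
F_{X_t}(2) &= F_{X_t}(1) + F_{X_t}(0) \left[ \frac{(\lambda_1 t)^2}{2!} + \frac{(\lambda_2 t)}{1!} \right] \\
F_{X_t}(3) &= F_{X_t}(2) + F_{X_t}(0) \left[ \frac{(\lambda_1 t)^3}{3!} + \frac{(\lambda_1 t)(\lambda_2 t)}{1!\,1!} + \frac{(\lambda_3 t)}{1!} \right] \\
F_{X_t}(4) &= F_{X_t}(3) + F_{X_t}(0) \left[ \frac{(\lambda_1 t)^4}{4!} + \frac{(\lambda_1 t)^2 (\lambda_2 t)}{2!\,1!} + \frac{(\lambda_1 t)(\lambda_3 t)}{1!\,1!} + \frac{(\lambda_2 t)^2}{2!} + \frac{(\lambda_4 t)}{1!} \right]
\end{aligned} \tag{6}$$

where $\lambda_j = \lambda\theta(1-\theta)^{j-1}$, $j = 1, 2, 3, \ldots$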
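Equations (5) and (6) translate directly into a short computation. The Python sketch below builds the cumulative probabilities step by step and then evaluates $F_T(t)$; it assumes geometric jumps on $\{1, 2, \ldots\}$ (so $p_0 = 0$), covers only $\beta = 1, \ldots, 5$, and uses hypothetical function names and illustrative parameter values.

```python
from math import exp

def cumulative_probs(lam, theta, t):
    """F_{X_t}(k) for k = 0..4, accumulated as in Eq. (6) with p_0 = 0."""
    # a[j] = lambda_j * t with lambda_j = lam * theta * (1 - theta)**(j - 1); a[0] is unused.
    a = [0.0] + [lam * theta * (1 - theta) ** (j - 1) * t for j in range(1, 5)]
    f0 = exp(-lam * t)
    brackets = [
        a[1],
        a[1] ** 2 / 2 + a[2],
        a[1] ** 3 / 6 + a[1] * a[2] + a[3],
        a[1] ** 4 / 24 + a[1] ** 2 * a[2] / 2 + a[1] * a[3] + a[2] ** 2 / 2 + a[4],
    ]
    cdf = [f0]
    for b in brackets:
        cdf.append(cdf[-1] + f0 * b)
    return cdf

def first_exit_cdf(t, beta, lam, theta):
    """F_T(t) = 1 - F_{X_t}(beta - 1) from Eq. (5), for beta = 1..5."""
    if not 1 <= beta <= 5:
        raise ValueError("this sketch only covers beta = 1..5")
    if t <= 0:
        return 0.0
    return 1.0 - cumulative_probs(lam, theta, t)[beta - 1]

lam, theta = 9.84, 0.62  # illustrative values
for t in (0.25, 0.5, 1.0):
    print(t, first_exit_cdf(t, beta=5, lam=lam, theta=theta))
```

For $\beta = 1$ the result reduces to $1 - e^{-\lambda t}$, the distribution function of the first jump time, which gives a quick sanity check.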
THE EXPECTED VALUE OF THE FIRST EXIT TIME FOR THE COMPOUND POISSON PROCESS
In this section we derive the expected value of the first exit time for the compound Poisson process. Since the random variable $T$ takes positive values, its expected value is given by $E[T] = \int_0^{\infty} (1 - F_T(t))\, dt$, where $F_T(t)$ is the distribution function of the first exit time. Using Eqs. (5) and (6), the expected values for some values of the time-independent boundary $\beta$ are obtained as

$$\begin{aligned}
\beta = 1 &\rightarrow E_1(T) = \frac{1}{\lambda(1 - p_0)} \\
\beta = 2 &\rightarrow E_2(T) = E_1(T) + [E_1(T)]^2 \lambda_1 \\
\beta = 3 &\rightarrow E_3(T) = E_2(T) + [E_1(T)]^2 \lambda_2 + [E_1(T)]^3 \lambda_1^2 \\
\beta = 4 &\rightarrow E_4(T) = E_3(T) + [E_1(T)]^2 \lambda_3 + [E_1(T)]^3\, 2\lambda_1\lambda_2 + [E_1(T)]^4 \lambda_1^3 \\
\beta = 5 &\rightarrow E_5(T) = E_4(T) + [E_1(T)]^2 \lambda_4 + [E_1(T)]^3 (2\lambda_1\lambda_3 + \lambda_2^2) + [E_1(T)]^4 (3\lambda_1^2\lambda_2) + [E_1(T)]^5 \lambda_1^4 \\
\beta = 6 &\rightarrow E_6(T) = E_5(T) + [E_1(T)]^2 \lambda_5 + [E_1(T)]^3 (2\lambda_2\lambda_3 + 2\lambda_1\lambda_4) + [E_1(T)]^4 (3\lambda_1^2\lambda_3 + 3\lambda_1\lambda_2^2) \\
&\qquad + [E_1(T)]^5 (4\lambda_1^3\lambda_2) + [E_1(T)]^6 \lambda_1^5 \\
\beta = 7 &\rightarrow E_7(T) = E_6(T) + [E_1(T)]^2 \lambda_6 + [E_1(T)]^3 (2\lambda_1\lambda_5 + 2\lambda_2\lambda_4 + \lambda_3^2) + [E_1(T)]^4 (3\lambda_1^2\lambda_4 + 6\lambda_1\lambda_2\lambda_3 + \lambda_2^3) \\
&\qquad + [E_1(T)]^5 (4\lambda_1^3\lambda_3 + 6\lambda_1^2\lambda_2^2) + [E_1(T)]^6 (5\lambda_1^4\lambda_2) + [E_1(T)]^7 \lambda_1^6
\end{aligned} \tag{7}$$

Note that $\lambda_j = \lambda\theta(1-\theta)^{j-1}$, $j = 1, 2, 3, \ldots$, for the Pólya-Aeppli process in Eq. (7), and an R code was prepared for Eq. (7).
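The terms in Eq. (7) follow a regular pattern: the increment from $E_{\beta-1}(T)$ to $E_\beta(T)$ is the sum over $n$ of $[E_1(T)]^{n+1}$ times the coefficient of $s^{\beta-1}$ in $(\lambda_1 s + \lambda_2 s^2 + \ldots)^n$. The Python sketch below uses that reading of Eq. (7); it is not the authors' R code, the function names are hypothetical, the parameters are illustrative, and geometric jumps on $\{1, 2, \ldots\}$ (so $p_0 = 0$) are assumed.

```python
def poly_mul(p, q, deg):
    """Product of two coefficient lists, truncated at degree `deg`."""
    out = [0.0] * (deg + 1)
    for i, pi in enumerate(p):
        if pi == 0.0:
            continue
        for j, qj in enumerate(q):
            if i + j > deg:
                break
            out[i + j] += pi * qj
    return out

def expected_exit_times(lam, theta, beta_max):
    """Return {beta: E_beta(T)} for beta = 1..beta_max, following the pattern of Eq. (7)."""
    # lam_j[j] = lambda_j = lam * theta * (1 - theta)**(j - 1); index 0 holds lambda_0 = 0.
    lam_j = [0.0] + [lam * theta * (1 - theta) ** (j - 1) for j in range(1, beta_max)]
    e1 = 1.0 / lam  # E_1(T) = 1 / (lam * (1 - p_0)) with p_0 = 0
    result = {1: e1}
    for beta in range(2, beta_max + 1):
        k = beta - 1
        power = [1.0] + [0.0] * k  # (sum_j lambda_j s^j)^0
        increment = 0.0
        for n in range(1, k + 1):
            power = poly_mul(power, lam_j, k)
            # Coefficient of s^k in (sum_j lambda_j s^j)^n, weighted by E_1(T)^(n+1).
            increment += e1 ** (n + 1) * power[k]
        result[beta] = result[beta - 1] + increment
    return result

# Illustrative (assumed) parameters.
times = expected_exit_times(lam=2.0, theta=0.6, beta_max=7)
for beta, value in times.items():
    print(beta, round(value, 5))
```

Under these assumptions every increment $E_\beta(T) - E_{\beta-1}(T)$ reduces to $\theta/\lambda$, so the expected exit time grows linearly in $\beta$.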
APPLICATION
A model was constructed with the Pólya-Aeppli process for the costs of traffic accidents for Dutch motorists by Van der Laan and Louter [10]. Meintanis [11] used the accident data and fatalities in Groningen for bivariate distributions. The data were obtained from the database of BRON of the Ministry of Transport, Groningen. In particular, total accidents and fatalities were recorded on Sundays of each month over the period 1997-2004 in the region of Groningen. In this study the same data are used to show the applicability of the Pólya-Aeppli process and to explain the total number of fatalities with the following random variables: $N_t$ is the number of Sunday accidents occurring in Groningen between the years 1997-2004; $Y_i$, $i = 1, 2, 3, \ldots$, is the number of fatalities of the $i$th accident; and $X_t = \sum_{i=1}^{N_t} Y_i$ is the total number of fatalities in the time interval $(0, t]$.

The homogeneous Poisson process provided an adequate fit to the number of Sunday accidents (p-value = 0.01, $\chi^2 = 2.94$) for $\lambda = 9.84$ (month). The independence of $Y_i$, $i = 1, 2, 3, \ldots$, and $\{N_t, t \ge 0\}$ is shown with Spearman's $\rho$ test (Spearman's $\rho = 0.084$; $p = 0.432$). Then the goodness-of-fit test is performed, and the geometric distribution with parameter $\theta = 0.62$ fits the data significantly. Then the probabilities $P(X_t = k)$, $k = 0, 1, 2, \ldots$, of the total number of fatalities within $t = 1, 2, 3$ months are computed from Eq. (3). The distribution functions of $T$ for $\beta = 5$ and $\beta = 10$ with several values of $t$ are given where $Y_i^{(4)}$, $i = 1, 2, 3, \ldots$, are geometrically distributed with parameter $\theta = 0.13$. We obtain that the total number of fatalities reaches five within two months and ten within five months. Finally, the expected values of several first exit times are obtained using Eq. (7), and the results are presented in Table 1.

TABLE 1. Expected times for some number of fatalities
$\beta$   1         2         3         4         5         6         7
$E[T]$    0.26744   0.43325   0.59906   0.76487   0.93068   1.09649   1.26230
REFERENCES
1. A. Getis, Proceedings of the 1972 Meeting of the IGU Commission on Quantitative Geography, McGill-Queen's University Press, Montreal, 1974.
2. S. Robin, Applied Statistics, 51, 437-451 (2002).
3. R. J. Rosychuk, C. Huston, and N. G. N. Prasad, Biometrics, 62, 465-470 (2006).
4. G. Ozel and C. Inal, Journal of Statistical Computation and Simulation, 80, 479-487 (2010).
5. A. G. Belov and V. Y. Galkin, Computational Mathematics and Modeling, 18, 253-262 (2007).
6. H. Panjer, ASTIN Bulletin, 12, 22-26 (1981).
7. T. Rolski, H. Schmidli, V. Schmidt, and J. Teugels, Stochastic Processes for Insurance and Finance, John Wiley & Sons, Hoboken, 1998, pp. 205-374.
8. G. Ozel and C. Inal, Bulletin of Statistics and Economics, 2, 70-79 (2008).
9. S. Bar-Lev, D. Bshouty, D. Perry, and S. Zacks, Stochastic Models, 15, 89-101 (1999).
10. B. S. Van Der Laan and A. S. Louter, The Statistician, 35, 163-174 (1986).
11. S. G. Meintanis, Statistical Methodology, 4, 22-34 (2007).