EM Algorithm for Truncated and Censored Poisson ... - Science Direct

Available online at www.sciencedirect.com

ScienceDirect Procedia Computer Science 86 (2016) 240 – 243

2016 International Electrical Engineering Congress, iEECON2016, 2-4 March 2016, Chiang Mai, Thailand

EM Algorithm for Truncated and Censored Poisson Likelihoods

Chukiat Viwatwongkasem *

Department of Biostatistics, Faculty of Public Health, Mahidol University, Bangkok 10400, Thailand.

Abstract

The aim of this study is to find the maximum likelihood estimate (MLE) for frequency count data by using the expectation-maximization (EM) algorithm, which is useful for imputing missing or hidden values. Two forms of missing count data, zero truncation and right censoring, are illustrated by estimating the population size of drug users. The results show that a truncated and censored Poisson likelihood fitted with the EM algorithm gives good estimates, with numerically stable convergence and a monotonically increasing likelihood. Because the algorithm converges to local maxima, whether the global maximum of the likelihood is reached depends on the initial value.

© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Organizing Committee of iEECON2016.

Keywords: EM algorithm; Truncated and Censored Count; Imputation; Population Size Estimation

1. Introduction

The expectation-maximization (EM) algorithm is an efficient iterative procedure for computing the maximum likelihood estimate (MLE); it can be applied not only in the presence of actual missing or hidden data, but also in a wide variety of complete-data situations. Dempster, Laird, and Rubin (1977) [1] gave the initial publication in the Journal of the Royal Statistical Society. The motivation for this study is that the EM algorithm is a useful method for solving incomplete-data problems, whereas other algorithms such as the Newton-Raphson and Fisher scoring methods cannot impute the unobserved (missing) data. Two applications to missing count data, zero truncation and right censoring, are illustrated in the estimation of the population size of drug users.

2. Methods

* Corresponding author. Tel.: +66-2-354-8530; fax: +66-2-354-8534. E-mail address: [email protected]

1877-0509 © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Organizing Committee of iEECON2016 doi:10.1016/j.procs.2016.05.109


Let $y = (y_1, \dots, y_n)'$ be an incompletely observed data vector of size $n$ from the population density $f(y; \theta)$, where $\theta = (\theta_1, \dots, \theta_p)'$ is a vector of $p$ unknown parameters. When some parts of the data are missing, represented by an unknown vector $z$, it is necessary to "fill in" the missing data $z$, leading to the complete data vector $x = (y', z')'$. The MLE via the EM algorithm is then determined from the complete-data log-likelihood $l(\theta; x)$ after imputation.

In the E-step (expectation step), we calculate the expected value of the complete-data log-likelihood $l(\theta; x)$ with respect to the conditional distribution of $z$ given the observed data vector $y$ and the current estimate $\theta^{(k-1)}$ of the parameter vector at the $(k-1)$th iteration:

$$E\left[\, l(\theta; x) \mid y, \theta^{(k-1)} \right] \equiv Q(\theta \mid \theta^{(k-1)}).$$

In the M-step (maximization step), the EM algorithm maximizes $Q(\theta \mid \theta^{(k-1)})$ with respect to $\theta$ to give an updated value $\theta^{(k)}$, and the two steps are repeated until convergence within an acceptable error. Accordingly, the MLE $\hat{\theta}$ is obtained by choosing $\theta^{(k)}$ to be any value of $\theta \in \Theta$ that maximizes $Q(\theta \mid \theta^{(k-1)})$, that is,

$$\theta^{(k)} = \arg\max_{\theta} Q(\theta \mid \theta^{(k-1)}).$$
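The alternation between the two steps can be sketched as a small driver loop. This is only an illustrative sketch for a scalar parameter: the function `run_em`, its arguments, and the toy counts are hypothetical and not taken from the paper. The toy E-step and M-step are those of the zero-truncated Poisson case treated in Section 4.

```python
import math

def run_em(theta0, e_step, m_step, tol=1e-10, max_iter=10_000):
    """Alternate E- and M-steps until successive estimates differ by less than tol."""
    theta = theta0
    for k in range(1, max_iter + 1):
        imputed = e_step(theta)       # E-step: expected missing data given theta^(k-1)
        theta_new = m_step(imputed)   # M-step: maximizer of Q given the imputed data
        if abs(theta_new - theta) < tol:
            return theta_new, k
        theta = theta_new
    raise RuntimeError("EM did not converge within max_iter iterations")

# Hypothetical toy counts (NOT from the paper): counts[i] = cases with i events, i >= 1.
counts = {1: 60, 2: 30, 3: 10}
n = sum(counts.values())                         # observed sample size
total = sum(i * c for i, c in counts.items())    # sum of i * n_i

def e_step(lam):
    """Expected number of unobserved zero counts under a Poisson(lam) model."""
    p0 = math.exp(-lam)
    return n * p0 / (1.0 - p0)

def m_step(e0):
    """Poisson MLE from the completed data: total events over total cases."""
    return total / (e0 + n)

lam_hat, iterations = run_em(1.0, e_step, m_step)
print(lam_hat, iterations)
```

At convergence the estimate satisfies the zero-truncated Poisson likelihood equation, i.e. `lam_hat / (1 - exp(-lam_hat))` equals the observed mean `total / n`.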

3. Motivational applications

Viwatwongkasem et al. (2013) [2] projected the number of drug users in Thailand during 2005-2007 from surveillance data on drug addicts undergoing treatment at over 1,140 health treatment centers in the country. Here $i$ is the count of treatment episodes for $i = 1, 2, \dots, m$, $n_i$ is the number of persons with $i$ treatment episodes, and the sample size is $n = n_1 + n_2 + \dots + n_m$. Example 1 is a typical form of zero truncation, while Example 2 also includes right censoring.

Example 1. Observed counts of treatment episodes on marijuana users in Thailand 2006

| Treatment episodes in a case ($i$) | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Number of cases (frequencies $n_i$) | not observed | 5,445 | 1,025 | 158 | 21 | 1 |

Example 2. Observed counts of treatment episodes on heroin users in Thailand 2005

| Treatment episodes in a case ($i$) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7+ |
|---|---|---|---|---|---|---|---|---|
| Number of cases (frequencies $n_i$) | not observed | 3,057 | 791 | 351 | 107 | 80 | 59 | 22 |
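As a quick consistency check, the two frequency tables can be encoded and summed. The dictionary layout and helper names below are illustrative choices, not part of the paper. The totals agree with the sample sizes reported later ($n = 6650$ and $n = 4467$).

```python
# Frequencies n_i keyed by the number of treatment episodes i (the zero class is unobserved).
marijuana_2006 = {1: 5445, 2: 1025, 3: 158, 4: 21, 5: 1}
# For heroin, key 7 stands for the right-censored class "7+".
heroin_2005 = {1: 3057, 2: 791, 3: 351, 4: 107, 5: 80, 6: 59, 7: 22}

def sample_size(freq):
    """Observed sample size n = n_1 + n_2 + ... + n_m."""
    return sum(freq.values())

def total_episodes(freq):
    """Sum of i * n_i over the observed classes."""
    return sum(i * c for i, c in freq.items())

print(sample_size(marijuana_2006))     # 6650
print(sample_size(heroin_2005))        # 4467
print(total_episodes(marijuana_2006))  # 8058
```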

4. Results for example 1 (zero truncation situation)

The incompletely observed likelihood relative to the zero-truncated count frequencies $(n_1, n_2, \dots, n_m)' = y$ is

$$L(\lambda; y) = \prod_{i=1}^{m} (p_i')^{n_i}, \quad \text{where } p_i' = p_i'(\lambda) = \frac{\mathrm{Po}(i \mid \lambda)}{1 - \mathrm{Po}(0 \mid \lambda)} = \frac{\exp(-\lambda)\, \lambda^i / i!}{1 - \exp(-\lambda)},$$

where the density $p_i'$ is assumed to be a zero-truncated Poisson. Suppose that the unknown missing frequency vector $z = (n_0)'$ is replaced by its conditional expectation $e = (e_0)'$ given the observed frequencies $n_1, n_2, \dots, n_m$ and the current value of $\lambda$: $e_0 = E(n_0 \mid \lambda, n_1, \dots, n_m)$. To proceed in the EM context, the complete data vector $x = (e_0, n_1, \dots, n_m)'$ is needed. After adding $e_0$, the zero-truncated Poisson likelihood with density $p_i'$ changes to the simple complete Poisson likelihood with density $p_i$, leading to its log-likelihood as

$$L_{cd}(\lambda; x) = \prod_{i=0}^{m} p_i^{n_i} = p_0^{e_0}\, p_1^{n_1} \cdots p_m^{n_m}, \quad \text{where } p_i = \mathrm{Po}(i \mid \lambda) = \frac{\exp(-\lambda)\, \lambda^i}{i!},$$

$$l_{cd}(\lambda; x) = e_0 \log p_0 + \sum_{i=1}^{m} n_i \log p_i = e_0 \log \mathrm{Po}(0 \mid \lambda) + \sum_{i=1}^{m} n_i \log \mathrm{Po}(i \mid \lambda).$$

In the E-step, the expected value $e_0 = E(n_0 \mid \lambda, n)$ under the Poisson density $p_0$ can be obtained as

$$e_0 = E(n_0 \mid n_1, n_2, \dots, n_m, \lambda) = p_0 N = \mathrm{Po}(0 \mid \lambda)\, N, \quad \text{where } N = e_0 + n,$$

$$e_0 = \mathrm{Po}(0 \mid \lambda)\,[e_0] + \mathrm{Po}(0 \mid \lambda)\,[n_1 + n_2 + \dots + n_m],$$

$$\therefore\; e_0 = \frac{\mathrm{Po}(0 \mid \lambda)\,[n_1 + n_2 + \dots + n_m]}{1 - \mathrm{Po}(0 \mid \lambda)} = \left( \frac{\exp(-\lambda)}{1 - \exp(-\lambda)} \right) n.$$

The complete log-likelihood $l_{cd}(\lambda; x)$ with the expectation and the observed data can be rewritten as

$$Q(\lambda) = l_{cd}(\lambda; x) = -e_0 \lambda + \sum_{i=1}^{m} n_i \left[ -\lambda + i \log \lambda - \log(i!) \right].$$

In the M-step, taking the derivative of $Q(\lambda) = l_{cd}(\lambda; x)$ and setting the result to 0 yields

$$\hat{\lambda} = \frac{1}{e_0 + n} \sum_{i=1}^{m} i\, n_i.$$

The Horvitz-Thompson estimator for estimating a population size is

$$\hat{N} = n + e_0, \quad \text{where } e_0 = \left( \frac{\exp(-\lambda)}{1 - \exp(-\lambda)} \right) n.$$
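For completeness, the M-step stated above follows in one line from the form of $Q(\lambda)$. The worked step below uses only the symbols already defined in this section; the second-derivative remark is a standard check added here, not part of the original text.

```latex
\begin{align*}
Q(\lambda) &= -e_0 \lambda + \sum_{i=1}^{m} n_i \left[ -\lambda + i \log \lambda - \log(i!) \right], \\
\frac{dQ}{d\lambda} &= -e_0 - \sum_{i=1}^{m} n_i + \frac{1}{\lambda} \sum_{i=1}^{m} i\, n_i
  = -(e_0 + n) + \frac{1}{\lambda} \sum_{i=1}^{m} i\, n_i = 0
  \;\Longrightarrow\;
  \hat{\lambda} = \frac{1}{e_0 + n} \sum_{i=1}^{m} i\, n_i, \\
\frac{d^2 Q}{d\lambda^2} &= -\frac{1}{\lambda^2} \sum_{i=1}^{m} i\, n_i < 0
  \quad \text{(so the stationary point is a maximum).}
\end{align*}
```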

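EM algorithm

Step 0. Choose an initial value $\hat{\lambda}^{(0)}$, and set $k = 0$.

Step 1. Compute

$$e_0^{(k+1)} = n \left( \frac{\mathrm{Po}(0 \mid \hat{\lambda}^{(k)})}{1 - \mathrm{Po}(0 \mid \hat{\lambda}^{(k)})} \right) = n \left( \frac{\exp(-\hat{\lambda}^{(k)})}{1 - \exp(-\hat{\lambda}^{(k)})} \right) \quad \text{and} \quad \hat{N}^{(k+1)} = n + e_0^{(k+1)}.$$

Step 2. Use the complete data $e_0^{(k+1)}, n_1, \dots, n_m$ to compute the new MLE

$$\hat{\lambda}^{(k+1)} = \frac{1}{e_0^{(k+1)} + n} \sum_{i=1}^{m} i\, n_i.$$

Step 3. Set $k = k + 1$ and go back to Step 1. Steps 1 and 2 are repeated until convergence.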
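The steps above can be sketched in a few lines of Python for the Example 1 data. This is a minimal sketch under stated assumptions, not the author's implementation: the starting value `lam0=1.0`, the tolerance, and all function names are assumptions. The sketch also records the observed (zero-truncated) log-likelihood at each iterate so that its monotone behaviour can be inspected.

```python
import math

# Observed frequencies n_i from Example 1 (marijuana users, Thailand 2006).
freq = {1: 5445, 2: 1025, 3: 158, 4: 21, 5: 1}
n = sum(freq.values())                        # 6650 observed cases
total = sum(i * c for i, c in freq.items())   # sum of i * n_i

def observed_loglik(lam):
    """Zero-truncated Poisson log-likelihood of the observed frequencies."""
    log_norm = math.log(1.0 - math.exp(-lam))
    return sum(c * (-lam + i * math.log(lam) - math.lgamma(i + 1) - log_norm)
               for i, c in freq.items())

def em_zero_truncated(lam0=1.0, tol=1e-8, max_iter=10_000):
    """EM for zero-truncated counts; lam0 and tol are assumed, not from the paper."""
    lam = lam0
    history = [observed_loglik(lam)]
    for k in range(1, max_iter + 1):
        p0 = math.exp(-lam)
        e0 = n * p0 / (1.0 - p0)       # Step 1: impute the zero class
        lam_new = total / (e0 + n)     # Step 2: complete-data Poisson MLE
        history.append(observed_loglik(lam_new))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    p0 = math.exp(-lam)
    e0 = n * p0 / (1.0 - p0)
    return lam, e0, n + e0, k, history

lam_hat, e0_hat, N_hat, iters, loglik = em_zero_truncated()
print(f"lambda = {lam_hat:.3f}, e0 = {e0_hat:.0f}, N = {N_hat:.0f}, iterations = {iters}")
```

The number of iterations depends on the starting value and the stopping rule, so it need not match the count reported in Table 1.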

Table 1 Estimates of population size on marijuana users in Thailand 2006 ($n = 6650$)

| Methods | Number of iterations | $\hat{\lambda}$ | $\hat{n}_0 = e_0$ | $\hat{N}$ |
|---|---|---|---|---|
| Taylor's series approximation | Closed form | 0.349 | 15897 | 22547 |
| Newton-Raphson algorithm | 9 | 0.397 | 13635 | 20285 |
| Fisher scoring algorithm | 7 | 0.397 | 13635 | 20285 |
| EM algorithm | 97 | 0.397 | 13635 | 20285 |
| Chao's method | Closed form | 0.376 | 14462 | 21112 |
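The paper does not spell out the formula behind the "Chao's method" row. Assuming it refers to Chao's lower-bound estimator, $\hat{n}_0 = n_1^2 / (2 n_2)$ with rate $\hat{\lambda} = 2 n_2 / n_1$, the following sketch reproduces that row of Table 1 from the Example 1 frequencies. The function name is hypothetical.

```python
# Observed frequencies n_i from Example 1 (marijuana users, Thailand 2006).
freq = {1: 5445, 2: 1025, 3: 158, 4: 21, 5: 1}

def chao_lower_bound(freq):
    """Chao's lower-bound estimator (assumed to be the 'Chao's method' of Table 1)."""
    n = sum(freq.values())
    n1, n2 = freq[1], freq[2]
    lam = 2 * n2 / n1          # Poisson rate implied by the ratio of the first two classes
    e0 = n1 ** 2 / (2 * n2)    # estimated number of unobserved (zero-count) cases
    return lam, e0, n + e0

lam_chao, e0_chao, N_chao = chao_lower_bound(freq)
print(f"lambda = {lam_chao:.3f}, e0 = {e0_chao:.0f}, N = {N_chao:.0f}")
```

Because this estimator uses only $n_1$ and $n_2$, it is available in closed form, which is consistent with the "Closed form" entry in the table.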

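5. Results for example 2 (including right censoring situation)

Let the original observed count frequencies $n_1, n_2, \dots, n_m$, without $n_0$, have the multinomial density

$$\binom{n}{n_1\; n_2\; \cdots\; n_m} (p_1')^{n_1} (p_2')^{n_2} \cdots (p_m')^{n_m}, \quad \text{where } p_j' = \frac{\exp(-\lambda)\, \lambda^j / j!}{\sum_{x=1}^{m} \exp(-\lambda)\, \lambda^x / x!}.$$

Consider an arbitrary right-censoring count $J$, with observed counts $x = 1, 2, \dots, J$, where $2 \le J \le m$. The incompletely observed likelihood relative to the frequencies $(n_1, n_2, \dots, n_J)' = y$ is obtained as

$$L(\lambda; y) = \prod_{j=1}^{J} (p_j')^{n_j}, \quad \text{where } p_j' = p_j'(\lambda) = \frac{\exp(-\lambda)\, \lambda^j / j!}{\sum_{x=1}^{J} \exp(-\lambda)\, \lambda^x / x!}.$$

Suppose the unknown missing frequency vector $z = (n_0, n_{J+1}, \dots, n_m)'$ is replaced by its conditional expectation $e = (e_0, e_{J+1}, \dots, e_m)'$ given the observed frequencies $n_1, n_2, \dots, n_J$ and the current value of $\lambda$, where $e_x = E(n_x \mid \lambda, n_1, n_2, \dots, n_J)$. The complete data set is $x = (e_0, n_1, \dots, n_J, e_{J+1}, \dots, e_m)'$. The multinomial likelihood with probability of success $p_j'$ changes to the complete Poisson likelihood with density $p_j$ as

$$L_{cd}(\lambda; x) = \prod_{i=0}^{m} p_i^{n_i} = p_0^{e_0}\, p_1^{n_1} \cdots p_J^{n_J}\, p_{J+1}^{e_{J+1}} \cdots p_m^{e_m}, \quad \text{where } p_i = \mathrm{Po}(i \mid \lambda) = \frac{\exp(-\lambda)\, \lambda^i}{i!},$$

$$l_{cd}(\lambda; x) = e_0 \log p_0 + \sum_{i=1}^{J} n_i \log p_i + \sum_{i=J+1}^{m} e_i \log p_i.$$

In the E-step, the expected vector $e = (e_0, e_{J+1}, \dots, e_m)'$, where $e_x = E(n_x \mid \lambda, n_1, n_2, \dots, n_J)$ for $x = 0$ or $x > J$ under the Poisson density $p_x$, can be obtained as

$$e_x = E(n_x \mid n_1, n_2, \dots, n_J, \lambda) = p_x N = \mathrm{Po}(x \mid \lambda)\, N, \quad \text{where } N = e_0 + n_1 + \dots + n_J + e_{J+1} + \dots + e_m,$$

$$e_x = \mathrm{Po}(x \mid \lambda) \left[ e_0 + n_1 + \dots + n_J + e_{J+1} + \dots + e_m \right]. \qquad (1)$$

Summing (1) over the missing classes, and treating the Poisson mass above $m$ as negligible,

$$\therefore\; e_0 + \sum_{x=J+1}^{m} e_x = \left[ \mathrm{Po}(0 \mid \lambda) + \mathrm{Po}(J+1 \mid \lambda) + \dots + \mathrm{Po}(m \mid \lambda) \right] \left[ e_0 + n_1 + \dots + n_J + e_{J+1} + \dots + e_m \right]$$

$$= \left[ 1 - \sum_{x=1}^{J} \mathrm{Po}(x \mid \lambda) \right] \left[ e_0 + n_1 + \dots + n_J + e_{J+1} + \dots + e_m \right],$$

$$\therefore\; e_0 + \sum_{x=J+1}^{m} e_x = \frac{1 - \sum_{x=1}^{J} \mathrm{Po}(x \mid \lambda)}{\sum_{x=1}^{J} \mathrm{Po}(x \mid \lambda)} \left[ n_1 + \dots + n_J \right]. \qquad (2)$$

Finally, replace (2) into (1),

$$\therefore\; e_x = \left( \frac{\mathrm{Po}(x \mid \lambda)}{\sum_{x'=1}^{J} \mathrm{Po}(x' \mid \lambda)} \right) \left[ n_1 + \dots + n_J \right] \quad \text{for } x = 0 \text{ or } x > J.$$

The complete log-likelihood $l_{cd}(\lambda; x)$ with the expectation and the observed data can be rewritten as

$$Q(\lambda) = l_{cd}(\lambda; x) = -e_0 \lambda + \sum_{j=1}^{J} n_j \left[ -\lambda + j \log \lambda - \log(j!) \right] + \sum_{x=J+1}^{m} e_x \left[ -\lambda + x \log \lambda - \log(x!) \right].$$

In the M-step, taking the derivative of $Q(\lambda) = l_{cd}(\lambda; x)$ and setting the result to 0 yields

$$\hat{\lambda} = \frac{1}{e_0 + \sum_{j=1}^{J} n_j + \sum_{x=J+1}^{m} e_x} \left( \sum_{j=1}^{J} j\, n_j + \sum_{x=J+1}^{m} x\, e_x \right).$$

The population size estimator is

$$\hat{N} = e_0 + \sum_{i=1}^{m} n_i, \quad \text{where } e_0 = \left( \frac{\mathrm{Po}(0 \mid \lambda)}{\sum_{x'=1}^{J} \mathrm{Po}(x' \mid \lambda)} \right) \left[ n_1 + \dots + n_J \right].$$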

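EM algorithm

Step 0. Choose an initial value $\hat{\lambda}^{(0)}$, and set $k = 0$.

Step 1. Compute

$$e_x^{(k+1)} = \left( \frac{\mathrm{Po}(x \mid \hat{\lambda}^{(k)})}{\sum_{x'=1}^{J} \mathrm{Po}(x' \mid \hat{\lambda}^{(k)})} \right) \left[ n_1 + \dots + n_J \right] \text{ for } x = 0,\; x > J, \quad \text{and} \quad \hat{N}^{(k+1)} = e_0^{(k+1)} + \sum_{i=1}^{m} n_i.$$

Step 2. Use the complete data $e_0^{(k+1)}, n_1, \dots, n_J, e_{J+1}^{(k+1)}, \dots, e_m^{(k+1)}$ to compute the new MLE

$$\hat{\lambda}^{(k+1)} = \frac{1}{e_0^{(k+1)} + \sum_{j=1}^{J} n_j + \sum_{x=J+1}^{m} e_x^{(k+1)}} \left( \sum_{j=1}^{J} j\, n_j + \sum_{x=J+1}^{m} x\, e_x^{(k+1)} \right).$$

Step 3. Set $k = k + 1$ and go back to Step 1. Steps 1 and 2 are repeated until convergence.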
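The censored version of the algorithm can be sketched in the same way for the Example 2 data. Several choices here are assumptions rather than statements from the paper: the censoring point is read as $J = 6$ (the "7+" class is treated as censored), the 22 censored cases are still counted in $n$ when forming $\hat{N}$, the upper limit $m$ for the imputed classes is set to 50, and the starting value and tolerance are arbitrary. With these choices the sketch gives values close to the EM row reported for this example.

```python
import math

# Observed frequencies n_1..n_J from Example 2 (heroin users, Thailand 2005).
freq = {1: 3057, 2: 791, 3: 351, 4: 107, 5: 80, 6: 59}
CENSORED = 22   # cases in the "7+" class (assumed to enter only through n)
J = 6           # assumed censoring point
M = 50          # assumed upper limit m for the imputed classes

def poisson_pmf(x, lam):
    """Po(x | lam), computed on the log scale for numerical stability."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def em_censored(lam0=1.0, tol=1e-8, max_iter=10_000):
    """EM for zero-truncated, right-censored counts; lam0 and tol are assumed."""
    n_obs = sum(freq.values())                     # n_1 + ... + n_J
    s_obs = sum(j * c for j, c in freq.items())    # sum of j * n_j
    missing = [0] + list(range(J + 1, M + 1))      # classes x = 0 and x > J
    lam = lam0
    for k in range(1, max_iter + 1):
        denom = sum(poisson_pmf(j, lam) for j in range(1, J + 1))
        # Step 1 (E-step): expected frequencies for the missing classes.
        e = {x: poisson_pmf(x, lam) / denom * n_obs for x in missing}
        # Step 2 (M-step): Poisson MLE from the completed frequencies.
        lam_new = (s_obs + sum(x * ex for x, ex in e.items())) / (n_obs + sum(e.values()))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    denom = sum(poisson_pmf(j, lam) for j in range(1, J + 1))
    e0 = poisson_pmf(0, lam) / denom * n_obs
    n = n_obs + CENSORED                           # full observed sample size
    return lam, e0, n + e0, k

lam_hat, e0_hat, N_hat, iters = em_censored()
print(f"lambda = {lam_hat:.4f}, e0 = {e0_hat:.0f}, N = {N_hat:.0f}, iterations = {iters}")
```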

Table 2 Estimates of population sizes on heroin users in Thailand 2005 ($n = 4467$)

| Methods | $\hat{\lambda}$ | $\hat{n}_0 = e_0$ | $\hat{N}$ |
|---|---|---|---|
| EM algorithm | 0.9468 | 2818 | 7285 |
| Chao's method | 0.5175 | 5907 | 10374 |

6. Discussion

The EM algorithm provides a monotonically increasing likelihood and contributes to numerically stable convergence. However, it may converge slowly. Like other algorithms such as Newton-Raphson, it finds local maxima, so whether the final estimate is the global maximum of the likelihood may depend upon the initial value.

Acknowledgements

This study was partially supported for publication by the China Medical Board (CMB), Faculty of Public Health, Mahidol University, Bangkok, Thailand.

References

1. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B 1977; 39: 1-38.
2. Viwatwongkasem C, Satitvipawee P, Jareinpituk S, Soontornpipit P. Mixture models for estimating the number of drug users in Thailand 2005-2007. Applied Mathematics 2013; 4: 1242-1250.
