Derivation of EM algorithm

The complete log-likelihood, including the missing data $\{z_{i,j}\}$, for the proposed model is
\[
\sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} I(z_{i,j} = k) \left( \sum_{l=1}^{L} \log f_{k,l,x_{i,j,l}} + \log q_{i,k} \right).
\]
Here, we introduce the variable for the conditional probability of $z_{i,j}$ given the parameters and the mutation features $x_{i,j}$,
\[
\theta_{i,k,m} = \Pr\left( z_{i,j} = k \mid x_{i,j} = m, \{f_{k,l}\}, \{q_i\} \right).
\]
Note that this conditional probability depends only on the value of the mutation feature $m = (m_1, \cdots, m_L)$, not on the index $j$. Then, the expected complete log-likelihood augmented by Lagrange multipliers is calculated as
\[
\sum_{i=1}^{I} \sum_{m} g_{i,m} \sum_{k=1}^{K} \theta_{i,k,m} \left( \sum_{l=1}^{L} \log f_{k,l,m_l} + \log q_{i,k} \right)
+ \sum_{k=1}^{K} \sum_{l=1}^{L} \tau_{k,l} \left( 1 - \sum_{p=1}^{M_l} f_{k,l,p} \right)
+ \sum_{i=1}^{I} \rho_i \left( 1 - \sum_{k=1}^{K} q_{i,k} \right).
\]
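In the E-step, $\theta_{i,k,m}$ follows from Bayes' rule; under the model's assumption that the features are generated independently given the signature, it can be evaluated as
\[
\theta_{i,k,m} = \frac{q_{i,k} \prod_{l=1}^{L} f_{k,l,m_l}}{\sum_{k'=1}^{K} q_{i,k'} \prod_{l=1}^{L} f_{k',l,m_l}}.
\]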
Differentiating it leads to the following stationary equations:
\[
\sum_{i=1}^{I} \sum_{m: m_l = p} g_{i,m} \theta_{i,k,m} - \tau_{k,l} f_{k,l,p} = 0, \quad (p = 1, \cdots, M_l,\ k = 1, \cdots, K,\ l = 1, \cdots, L),
\]
\[
\sum_{m} g_{i,m} \theta_{i,k,m} - \rho_i q_{i,k} = 0, \quad (k = 1, \cdots, K,\ i = 1, \cdots, I).
\]
Then, by eliminating the Lagrange multipliers, the updating rules can be obtained.
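Concretely, summing the first stationary equation over $p = 1, \cdots, M_l$ gives $\tau_{k,l} = \sum_{i=1}^{I} \sum_{m} g_{i,m} \theta_{i,k,m}$, and summing the second over $k$ gives $\rho_i = \sum_{m} \sum_{k=1}^{K} g_{i,m} \theta_{i,k,m}$, so the M-step updates are
\[
f_{k,l,p} = \frac{\sum_{i=1}^{I} \sum_{m: m_l = p} g_{i,m} \theta_{i,k,m}}{\sum_{i=1}^{I} \sum_{m} g_{i,m} \theta_{i,k,m}}, \qquad
q_{i,k} = \frac{\sum_{m} g_{i,m} \theta_{i,k,m}}{\sum_{m} \sum_{k'=1}^{K} g_{i,m} \theta_{i,k',m}}.
\]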
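As an illustration, the E- and M-steps above can be sketched numerically. This is a minimal sketch, assuming equal category counts $M_l = M$ for every feature and randomly generated counts $g_{i,m}$; the dimensions and variable names (`F`, `Q`, `theta`, `g`) are illustrative choices, not taken from any published implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small problem: I samples, K signatures, L features,
# each feature l taking M_l = M possible values (assumed equal for brevity).
I, K, L, M = 3, 2, 4, 6

# All M^L feature vectors m = (m_1, ..., m_L), enumerated as rows.
feature_vectors = np.stack(
    np.meshgrid(*[np.arange(M)] * L, indexing="ij"), axis=-1
).reshape(-1, L)

# g[i, m]: count of mutations in sample i with feature vector m (synthetic data).
g = rng.poisson(1.0, size=(I, len(feature_vectors))).astype(float)

# Random initialization of F[k, l, p] (sums to 1 over p) and Q[i, k] (sums to 1 over k).
F = rng.random((K, L, M))
F /= F.sum(axis=2, keepdims=True)
Q = rng.random((I, K))
Q /= Q.sum(axis=1, keepdims=True)

for _ in range(50):
    # E-step: theta[i, k, m] ∝ Q[i, k] * prod_l F[k, l, m_l]  (Bayes' rule).
    prob = np.ones((K, len(feature_vectors)))
    for l in range(L):
        prob *= F[:, l, feature_vectors[:, l]]
    theta = Q[:, :, None] * prob[None, :, :]
    theta /= theta.sum(axis=1, keepdims=True)

    # M-step: updates obtained by eliminating the Lagrange multipliers.
    w = g[:, None, :] * theta                    # g_{i,m} * theta_{i,k,m}
    for l in range(L):
        for p in range(M):
            mask = feature_vectors[:, l] == p    # the set {m : m_l = p}
            F[:, l, p] = w[:, :, mask].sum(axis=(0, 2))
    F /= F.sum(axis=2, keepdims=True)
    Q = w.sum(axis=2)
    Q /= Q.sum(axis=1, keepdims=True)

# Both F and Q remain valid probability distributions after the updates.
assert np.allclose(F.sum(axis=2), 1.0) and np.allclose(Q.sum(axis=1), 1.0)
```

Because $\theta_{i,k,m}$ depends only on the feature vector $m$, the E-step is computed once per distinct $m$ rather than once per mutation, which is the computational advantage noted in the derivation.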