
Estimating Dynamic Graphical Models from Multivariate Time-Series

Alex Gibberd ([email protected]), Dr. James Nelson ([email protected])

PROBLEM

METHOD: GROUP-FUSED GRAPHICAL LASSO

We consider the problem of recovering the time-varying conditional dependency graph $G(V^t, E^t)$ for a multivariate Gaussian process $\{y^t\}_{t=1}^{T}$, such that:

$$y^t \sim N(0, \Sigma^t) \quad \text{for } t = 1, \dots, T.$$
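To make the data model concrete, the following is a minimal simulation sketch. It assumes the precision matrix is piecewise constant in time; the function name `simulate_piecewise_ggm`, the two example matrices, and the segment lengths are hypothetical choices for illustration and are not taken from the poster.

```python
import numpy as np

def simulate_piecewise_ggm(precisions, segment_lengths, seed=0):
    """Draw y^t ~ N(0, inv(Theta^t)) with Theta^t constant within each segment.

    Returns Y with shape (P, T), i.e. one column per time point.
    """
    rng = np.random.default_rng(seed)
    blocks = []
    for theta, length in zip(precisions, segment_lengths):
        sigma = np.linalg.inv(theta)  # covariance of this segment
        mean = np.zeros(theta.shape[0])
        samples = rng.multivariate_normal(mean, sigma, size=length)  # (length, P)
        blocks.append(samples.T)
    return np.hstack(blocks)

# Hypothetical 3-node example: a chain 0-1-2, after which the edge 1-2 disappears.
theta_a = np.array([[2.0, -1.0, 0.0],
                    [-1.0, 2.0, -1.0],
                    [0.0, -1.0, 2.0]])
theta_b = np.array([[2.0, -1.0, 0.0],
                    [-1.0, 2.0, 0.0],
                    [0.0, 0.0, 2.0]])
Y = simulate_piecewise_ggm([theta_a, theta_b], [30, 30])
print(Y.shape)
```

Data generated this way has a single changepoint at the boundary between the two segments, which is the kind of structure the estimators on this poster aim to recover.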

ALGORITHM

We enforce structure on the estimate via regularised estimation over a set of precision matrices. The problem of jointly estimating graphs and changes in these graphs is formulated as a convex M-estimator, as below. This can then be solved through an efficient Alternating Direction Method of Multipliers (ADMM) algorithm (see the ALGORITHM listing).

Alternatively, we may use independent smoothing, i.e. a penalty of the form

$$\lambda_\nabla \sum_{t=2}^{T} \|X^t - X^{t-1}\|_1 .$$

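Data: $y^1, \dots, y^T$
Input: $\lambda_1, \lambda_2, \gamma, \epsilon_{\text{dual}}, \epsilon_{\text{prime}}, \epsilon_{\text{dyk}}$
Result: $\{\hat{\Theta}^1, \dots, \hat{\Theta}^T\}$
Calculate covariance matrix: $\hat{S}^t = y^t (y^t)^\top / 2$ for $t = 1, \dots, T$
Initialize: $Z^t_{(0)} = X^t_{(0)} = U^t_{(0)} = 0$
while not converged ($r_{\text{prime}} \ge \epsilon_{\text{prime}}$, $r_{\text{dual}} \ge \epsilon_{\text{dual}}$), $n = 1, 2, \dots$ do
    for $t = 1, \dots, T$ do
        Eigen-decomposition: $\{s_h, v_h\}_{h=1}^{P} = \mathrm{eigen}\big(\hat{S}^t - \gamma (Z^t_{(n-1)} - U^t_{(n-1)})\big)$
        $x_h = \big(-s_h + \sqrt{s_h^2 + 4\gamma}\big) / (2\gamma)$
        $V = (v_1, \dots, v_P)$, $Q = \mathrm{diag}(x_1, \dots, x_P)$
        Apply constraints: $X^t_{(n)} = V Q V^\top$
    end
    $Z_{(n)} = \mathrm{prox}_{R_1 + R_2}(X_{(n)} + U_{(n-1)};\ \lambda_1/\gamma,\ \lambda_2/\gamma)$   // GFLSA via Dykstra's method
    $U^t_{(n)} = U^t_{(n-1)} + X^t_{(n)} - Z^t_{(n)}$, for $t = 1, \dots, T$
    $r_{\text{prime}} = \sum_{t=1}^{T} \|X^t_{(n)} - Z^t_{(n)}\|_F^2$, $r_{\text{dual}} = \sum_{t=1}^{T} \|Z^t_{(n)} - Z^t_{(n-1)}\|_F^2$
end
Return: $\{\hat{\Theta}^t = X^t_{(n)}\}_{t=1}^{T}$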
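The eigen-decomposition step of the listing can be sketched in NumPy as follows. This is an illustrative sketch of the X-update only, not the authors' implementation; the function name `x_update` and the toy inputs are assumptions. The eigenvalue rescaling is the positive root of $\gamma x^2 + s_h x - 1 = 0$, so the returned matrix is positive definite and satisfies $\gamma X - X^{-1} = -(\hat{S}^t - \gamma (Z^t - U^t))$.

```python
import numpy as np

def x_update(S_t, Z_t, U_t, gamma):
    """One X^t update: eigen-decompose S^t - gamma (Z^t - U^t), then rescale eigenvalues."""
    A = S_t - gamma * (Z_t - U_t)
    s, V = np.linalg.eigh(A)  # A is symmetric, so eigh applies
    x = (-s + np.sqrt(s ** 2 + 4.0 * gamma)) / (2.0 * gamma)  # every x_h is positive
    return (V * x) @ V.T  # V diag(x) V^T

# Toy inputs (hypothetical): a rank-one covariance as in the listing, Z = I, U = 0.
rng = np.random.default_rng(0)
y = rng.standard_normal(4)
S_t = np.outer(y, y) / 2.0
Z_t = np.eye(4)
U_t = np.zeros((4, 4))
X_t = x_update(S_t, Z_t, U_t, gamma=1.0)
```

The Z-update (the proximal step solved via Dykstra's method) is not sketched here, since the poster does not give enough detail to reproduce it.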

In the Gaussian case, the precision matrix $\Theta^t = (\Sigma^t)^{-1}$ encodes the conditional dependency structure: $\Theta^t_{i,j} = 0 \iff y_i^t \perp y_j^t$ conditional on the remaining variables. We highlight two key problems:

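$$\{\hat{\Theta}^t\}_{t=1}^{T} := \arg\max_{\{X^t \succeq 0\}_{t=1}^{T}} \; \sum_{t=1}^{T} \Big( \log\det(X^t) - \mathrm{tr}(\hat{S}^t X^t) - \lambda_G \|X^t\|_1 \Big) - \lambda_\nabla \sum_{t=2}^{T} \|X^t - X^{t-1}\|_F$$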
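As a rough illustration of the objective, the sketch below evaluates the penalised log-likelihood for a given sequence of precision matrices. The function name `fused_objective`, the array layout (time on the first axis), and the toy inputs are assumptions. Setting `group=True` uses the Frobenius norm on differences (group fusing), while `group=False` uses the elementwise $\ell_1$ norm (independent fusing).

```python
import numpy as np

def fused_objective(X, S, lam_g, lam_f, group=True):
    """Penalised log-likelihood for precision matrices X and covariances S.

    X and S have shape (T, P, P). Returns -inf if any X[t] is not positive definite.
    """
    fit = 0.0
    for X_t, S_t in zip(X, S):
        sign, logdet = np.linalg.slogdet(X_t)
        if sign <= 0:
            return -np.inf
        fit += logdet - np.trace(S_t @ X_t)
    sparsity = np.abs(X).sum()  # sum over t of ||X^t||_1
    diffs = X[1:] - X[:-1]      # X^t - X^{t-1} for t = 2..T
    if group:
        smooth = np.sqrt((diffs ** 2).sum(axis=(1, 2))).sum()  # Frobenius norms
    else:
        smooth = np.abs(diffs).sum()                           # elementwise l1 norms
    return fit - lam_g * sparsity - lam_f * smooth

# Toy sequence (hypothetical): the precision matrix doubles at the last time step.
X = np.stack([np.eye(2), np.eye(2), 2.0 * np.eye(2)])
S = np.stack([np.eye(2)] * 3)
value_gfgl = fused_objective(X, S, lam_g=0.5, lam_f=1.0, group=True)
value_ifgl = fused_objective(X, S, lam_g=0.5, lam_f=1.0, group=False)
```

In this toy case the single jump costs $\sqrt{2}$ under the Frobenius penalty and $2$ under the $\ell_1$ penalty, which shows how the two penalties price the same change differently.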

Dynamic likelihood for the precision matrix, with local covariance function:

$$\hat{S}^t \propto \sum f(t)\, y^t (y^t)^\top$$
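One possible reading of the local covariance is a kernel-weighted average of outer products around each time point. The sketch below makes several assumptions that the poster does not state: the kernel $f$ is Gaussian, the sum runs over all time points $s$ with weight $f(s - t)$, and the weights are normalised to sum to one. The name `local_covariance` and the `bandwidth` parameter are hypothetical.

```python
import numpy as np

def local_covariance(Y, bandwidth):
    """Kernel-weighted covariance estimate for each t, from Y with shape (P, T)."""
    P, T = Y.shape
    times = np.arange(T)
    S = np.empty((T, P, P))
    for t in range(T):
        w = np.exp(-0.5 * ((times - t) / bandwidth) ** 2)  # assumed Gaussian kernel
        w /= w.sum()                                       # normalise the weights
        S[t] = (Y * w) @ Y.T  # sum over s of w_s * y^s (y^s)^T
    return S

rng = np.random.default_rng(1)
Y_demo = rng.standard_normal((3, 20))
S_demo = local_covariance(Y_demo, bandwidth=2.0)
print(S_demo.shape)
```

A small bandwidth tracks changes quickly at the cost of noisier estimates; a large bandwidth approaches a single global covariance.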

The graphical lasso penalty acts to shrink edges in the precision matrix:

$$\|X^t\|_1 = \sum_{i,j} |X^t_{i,j}|$$
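The shrinkage effect of an $\ell_1$ penalty can be illustrated with the soft-thresholding operator, which is the proximal operator of the $\ell_1$ norm. This is a generic illustration and not necessarily how the Z-update on this poster is implemented; the function name and the example matrix are hypothetical.

```python
import numpy as np

def soft_threshold(X, threshold):
    """Shrink every entry towards zero by `threshold`; small entries become exactly zero."""
    return np.sign(X) * np.maximum(np.abs(X) - threshold, 0.0)

X = np.array([[1.5, -0.2],
              [-0.2, 0.8]])
X_shrunk = soft_threshold(X, 0.3)
print(X_shrunk)
```

Here the weak off-diagonal entry of magnitude 0.2 is set to zero, which corresponds to removing that edge from the graph, while the larger entries are only reduced in magnitude.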

MOTIVATION

Sparse graphical models have several benefits over their dense counterparts:

1. Formulated changepoint and graph selection as a convex M-estimation problem

[Figure: Changepoint density (IFGL), Changepoint density (GFGL), and Changepoints and Active Components (GFGL); axes t and P.]

ACKNOWLEDGMENTS

Data for the Drosophila example is derived from Arbeitman et al. 2002. Fruit fly images sourced from: http://www.sdbonline.org/sites/fly/aimain/images.htm and Wikipedia. This research is funded via the Dstl National PhD scheme in collaboration with the UCL SECReT doctoral training centre for Security & Crime Science.

FUTURE WORK

In future, we aim to relax the modelling assumptions used here to allow for non-Gaussian distributions. Additionally, we look to build models which can incorporate changes in mean structure, $\mu = 0 \to \{\mu^t\}_{t=1}^{T}$. We have two main research directions:

1. Utilising copulae to relax the distributional assumptions.
2. Considering multi-resolution analysis of graphical models within multivariate wavelet frameworks.

[Figure: Changepoints and Active Components (IFGL); axis label λ2.]

Example: Gene-Networks in Drosophila lifecycle

[Figure: edges affected by changepoint; axis label λ2.]

Synthetic Experiments

Both independent and group fusing show similar model recovery properties as we scale problem/data size.

[Figure: Graph Recovery vs T (P=10) and Graph Recovery vs P (T=20).]
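The plots on this poster are labelled with an F1-score. A common way to score graph recovery with this metric is to compare the off-diagonal supports of the estimated and true precision matrices, as sketched below. The function name `edge_f1`, the tolerance, and the convention for two empty graphs are assumptions, not details from the poster.

```python
import numpy as np

def edge_f1(theta_est, theta_true, tol=1e-8):
    """F1-score of recovered edges, using the upper triangle of each precision matrix."""
    P = theta_true.shape[0]
    upper = np.triu_indices(P, k=1)  # each undirected edge counted once
    est = np.abs(theta_est[upper]) > tol
    true = np.abs(theta_true[upper]) > tol
    tp = np.sum(est & true)
    fp = np.sum(est & ~true)
    fn = np.sum(~est & true)
    if 2 * tp + fp + fn == 0:
        return 1.0  # both graphs empty: treated as perfect agreement (a convention)
    return 2.0 * tp / (2.0 * tp + fp + fn)

# Hypothetical example: the truth is the chain 0-1-2; the estimate has edges 0-1 and 0-2.
theta_true = np.array([[2.0, -1.0, 0.0],
                       [-1.0, 2.0, -1.0],
                       [0.0, -1.0, 2.0]])
theta_est = np.array([[2.0, -0.9, 0.3],
                      [-0.9, 2.0, 0.0],
                      [0.3, 0.0, 2.0]])
score = edge_f1(theta_est, theta_true)
```

For a time-varying model the score could be averaged over time points, although how the poster aggregates over t is not stated.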

3. Demonstrated computational scaling and recovery properties of method


We have assessed the performance of the proposed fused graphical estimators in both synthetic and applied settings. When group structure is expected in the dependency structure, GFGL often produces a more meaningful segmentation.

2. Developed efficient O(TP ) projective Alternating Direction Method of Multipliers (ADMM) algorithm to estimate dynamic graphs

We automatically balance model complexity with accuracy to perform joint changepoint and graph estimation for piecewise constant Gaussian Graphical Models (GGM).
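Once a piecewise-constant sequence of precision matrices has been estimated, changepoints can be read off as the time points where consecutive estimates differ. The sketch below is one simple way to do this; the function name `find_changepoints`, the tolerance, and the 0-based time indexing are assumptions.

```python
import numpy as np

def find_changepoints(X, tol=1e-6):
    """Return 0-based indices t >= 1 where ||X[t] - X[t-1]||_F exceeds tol.

    X has shape (T, P, P).
    """
    jumps = np.sqrt(((X[1:] - X[:-1]) ** 2).sum(axis=(1, 2)))
    return (np.flatnonzero(jumps > tol) + 1).tolist()

# Hypothetical estimate: constant for two steps, then a different constant matrix.
X_hat = np.stack([np.eye(2), np.eye(2), 2.0 * np.eye(2), 2.0 * np.eye(2), 2.0 * np.eye(2)])
changepoints = find_changepoints(X_hat)
print(changepoints)
```

Because a fused penalty can set differences between consecutive estimates exactly to zero, a small tolerance is usually enough to separate true jumps from numerical noise.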

KEY CONTRIBUTIONS

• Parsimonious representation allows for efficient inference, and insight into system dynamics.

RESULTS

• Generative Models permit anomaly detection and estimation of missing data, i.e. $P(y_i^t \mid y^t_{V \setminus i})$.

• Robust Estimation is often provided by performing variable/feature selection within the graph. Selecting a subset of edges allows one to trade off generalisation bias/variance.
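The conditional distribution mentioned in the generative-model bullet has a closed form for a zero-mean Gaussian with precision matrix $\Theta$: variable $i$ given the rest is Gaussian with variance $1/\Theta_{ii}$ and mean $-\Theta_{ii}^{-1} \sum_{j \ne i} \Theta_{ij} y_j$. The sketch below computes these quantities; the function name and the example values are hypothetical.

```python
import numpy as np

def conditional_gaussian(theta, y, i):
    """Mean and variance of y_i given all other entries of y, for y ~ N(0, inv(theta))."""
    others = np.arange(theta.shape[0]) != i
    variance = 1.0 / theta[i, i]
    mean = -variance * (theta[i, others] @ y[others])
    return mean, variance

theta = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])
y_obs = np.array([1.0, 0.0, -2.0])
mean_1, var_1 = conditional_gaussian(theta, y_obs, i=1)
```

The conditional mean can be used to fill in a missing entry, and the standardised residual $(y_i - \text{mean}) / \sqrt{\text{variance}}$ gives a simple anomaly score. Only the neighbours of node $i$ in the graph contribute to the mean, which is one practical benefit of a sparse estimate.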

The fused graphical lasso penalty smooths the solution over timesteps and tries to shrink differences between graphs.

1. How to robustly estimate $\hat{\Theta} \iff G(\hat{V}^t, \hat{E}^t)$ given data $Y = (y^1, \dots, y^T) \in \mathbb{R}^{P \times T}$ (often $P \gg T$).

2. Efficiently find a representative graphical model (naively, the number of possible graphs scales as O(2P )).

[Figure: active edges over time t across the Drosophila lifecycle stages (Embryo, Larva, Pupa, Adult).]
