A data-driven modeling approach to identify disease-specific ... - PLOS

A data-driven modeling approach to identify disease-specific multi-organ networks driving physiological dysregulation

Warren D. Anderson1Y¤, Danielle DeCicco1Y, James S. Schwaber1, Rajanikanth Vadigepalli1*

1 Daniel Baugh Institute for Functional Genomics and Computational Biology, Department of Pathology, Anatomy, and Cell Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA

Y These authors contributed equally to this work.
¤ Current Address: Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
* Corresponding author: [email protected]

Supporting Information • Supplementary Files • Supplementary Figures • Supplementary Table


Supplementary Files

S1 File. S1 File.csv: This file contains un-normalized raw gene expression data (Ct values).

S2 File. S2 File.csv: This file contains normalized gene expression data.

S3 File. S3 File.pdf: This file contains normalized gene expression data plotted along with model simulation traces.

S4 File. S4 File.xlsx: This file contains names for parameters and dynamic variables. As an example of our parameter label convention, the interaction coefficient denoting the directed influence in which gene g2 from organ r2 regulates gene g1 in organ r1 is labeled k_r1g1_r2g2 (i.e., k_to_from). Initial conditions are included in another tab. SHR denotes the spontaneously hypertensive rat (autonomic dysfunction) and WKY denotes the Wistar Kyoto control phenotype.

S5 File. S5 File.xml: This file contains the dynamic model for the autonomic dysfunction phenotype in the systems biology markup language (SBML) format. The model was converted from MATLAB to SBML using MOCCASIN [1].

S6 File. S6 File.xml: This file contains the dynamic model for the control phenotype in the systems biology markup language (SBML) format. The model was converted from MATLAB to SBML using MOCCASIN [1].

S7 File. S7 File.mat: This file contains the parameter values and initial conditions, along with some other basic information for simulating the autonomic dysfunction and control models in MATLAB.

S8 File. S8 File.m: This file contains MATLAB simulation code.

S9 File. S9 File.RData: This file contains the parameter values and initial conditions for simulating the autonomic dysfunction and control models in R.

S10 File. S10 File.R: This file contains R simulation code.
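The "to before from" ordering of the parameter labels in S4 File can be illustrated with a small helper. This is a minimal sketch only: the underscore separator, the way organ and gene names are joined, and the example edge are assumptions made for illustration, and the actual label strings should be read from S4 File.xlsx.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    """Directed influence: (from_organ, from_gene) regulates (to_organ, to_gene)."""
    to_organ: str
    to_gene: str
    from_organ: str
    from_gene: str


def make_label(edge: Interaction) -> str:
    # Hypothetical format: "k_<to organ><to gene>_<from organ><from gene>".
    # Only the to-before-from ordering is taken from the supplement.
    return f"k_{edge.to_organ}{edge.to_gene}_{edge.from_organ}{edge.from_gene}"


# Illustrative edge only (not a reported interaction):
# a brainstem gene acting on a kidney gene.
edge = Interaction(to_organ="Kidney", to_gene="Ren",
                   from_organ="Brainstem", from_gene="Agt")
label = make_label(edge)
```

Keeping the target first in both the tuple and the label mirrors the k_to_from reading, so the regulated gene can be read off the first half of the label.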

Supplementary Figures

[Figure S1 image. Panel B: histogram of count versus stability rank for median. Panel C: PCA of the entire data set (PC1 versus PC2). Panel D: PCA for each organ (Adrenal, Brainstem, Kidney, Liver, Ventricle), PC1 versus PC2 with percent variance explained and gene labels. Panel E: Il1a and Hmgb1 expression (−ddCt) versus age (wk) for Adrenal and Ventricle samples.]

Fig S1. Sampling, normalization, and outlier evaluation. (A) Table detailing the animal sampling and the organs utilized in our analysis for each animal. (B) Stability ranks of median expression values were considered for each organ/age combination. For the majority of organ/time combinations, the median expression level was ranked among the most stable (≤ 12/22), in comparison with the stability levels for individual genes. (C) PCA was applied to the entire data set (all genes/organs) and plotted along with the variability accounted for by the first two PCs. The smooth circle shows the 99% confidence interval for the mean of a bivariate Gaussian distribution characterized by the displayed data. Note that this interval contains the majority of the data, and the few values outside of this interval are in close proximity. (D) PCA was implemented separately for each organ. Specific colors refer to the same animals in all plots. For instance, the three gray dots in the Adrenal PCA plot refer to three animals that are relatively distant from the other animal samples in this analysis. However, observation of the PC projections of these specific animals in the PCAs applied to the data from other organs shows that these animal samples are not imposing consistent biases. Panel (E) shows sample expression data labeled as in (D) for animal samples marked in the Adrenal and Ventricle PCAs.
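The outlier check in panel (C) can be sketched as a test of which samples fall outside a 99% ellipse in the PC1/PC2 plane. The sketch below assumes the ellipse is the 99% coverage contour of a bivariate Gaussian fitted to the PC scores; the caption describes the region as a confidence interval for the mean, so the authors' exact construction may differ. The function name and input layout are hypothetical.

```python
import math


def outside_ellipse(points, coverage=0.99):
    """Return indices of 2-D points outside the fitted Gaussian coverage ellipse."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance entries (denominator n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Chi-square quantile with 2 degrees of freedom has a closed form.
    threshold = -2.0 * math.log(1.0 - coverage)
    flagged = []
    for i, (x, y) in enumerate(points):
        dx, dy = x - mx, y - my
        # Squared Mahalanobis distance via the inverse of the 2x2 covariance.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        if d2 > threshold:
            flagged.append(i)
    return flagged
```

Each point is a (PC1, PC2) pair; the returned indices identify samples to inspect further, for example by checking whether the same animals are also distant in the per-organ PCAs.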

[Figure S2 image. Two panels (Autonomic dysfunction, Control): Log (Jsim) versus Log (λ) for α = 0, 0.2, 0.4, 0.6, 0.8, and 1.]

Fig S2. Robustness of regularized regression-based system identification. Error between simulated gene expression levels and experimentally measured mean expression values varies minimally with respect to regularization parameters. Log error is plotted with respect to the log λ value for a range of α levels.
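The kind of sweep summarized in Fig S2 can be sketched as follows. This is an illustration under stated assumptions: the supplement names only λ and α, and the code assumes an elastic-net-style penalty in which λ sets the overall strength and α the mix between L1 and L2 terms. The error here is a plain regression mean squared error on toy data, standing in for the simulation error Jsim; the function names are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the L1 proximal step)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2)."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Prediction from all predictors except j.
                partial = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            norm = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam * alpha) / (norm + lam * (1.0 - alpha))
    return b


def mse(X, y, b):
    """Mean squared error of the linear prediction X b."""
    return sum((yi - sum(x * bj for x, bj in zip(row, b))) ** 2
               for row, yi in zip(X, y)) / len(y)


# Toy data generated without noise from coefficients [2, -1].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
y = [2.0, -1.0, 1.0, 5.0]
errors = {(lam, alpha): mse(X, y, elastic_net(X, y, lam, alpha))
          for lam in (1e-6, 1e-3, 1e-1) for alpha in (0.0, 0.5, 1.0)}
```

Plotting the logarithm of each entry of `errors` against log λ, one curve per α, gives the same kind of view as the figure.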

[Figure S3 image. A 2 × 2 contingency table counting interaction coefficients by the signs of Kbest and Kcomp (Kbest < 0 or Kbest > 0; Kcomp > 0 or Kcomp < 0), followed by the equation for the odds ratio computed from the four cell counts.]

Fig S3. Evaluation of sign consistency of interaction coefficients across multiple iterations of system identification. The equation illustrates the computation of the odds ratio based on the contingency table.
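A sign-consistency odds ratio of this kind can be computed from two coefficient vectors as sketched below. The conventional cross-product form (concordant cells over discordant cells) is assumed; the equation in the figure may arrange the four cells differently. Skipping zero coefficients and the names `k_best` and `k_comp` are also assumptions for illustration.

```python
def sign_table(k_best, k_comp):
    """Count coefficient pairs by sign; pairs containing a zero are skipped."""
    counts = {("+", "+"): 0, ("+", "-"): 0, ("-", "+"): 0, ("-", "-"): 0}
    for kb, kc in zip(k_best, k_comp):
        if kb == 0 or kc == 0:
            continue
        key = ("+" if kb > 0 else "-", "+" if kc > 0 else "-")
        counts[key] += 1
    return counts


def odds_ratio(counts):
    """Cross-product odds ratio: same-sign cells over opposite-sign cells."""
    concordant = counts[("+", "+")] * counts[("-", "-")]
    discordant = counts[("+", "-")] * counts[("-", "+")]
    return float("inf") if discordant == 0 else concordant / discordant
```

With this orientation, values above 1 indicate that coefficient signs tend to agree between the best-fit network and a comparison network.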

[Figure S4 image. Three histograms (Added, Removed, Switched): count versus ΔE.]

Fig S4. Differential network analysis of changes in gene-gene interactions in autonomic dysfunction. Black bars correspond to edges considered to be differentially regulated in autonomic dysfunction.

[Figure S5 image. −log q values (scale 1 to 3) for each gene in each organ (Adrenal, Brainstem, Kidney, Liver, Ventricle). Genes: Il1b, Tgfb1, Il1a, Il6, Il10, Ccl5, Hmgb1, Tnf, Cxcr3, Th, Dbh, Adrb2, Adrb1, Adra1a, Adra2a, Cacna1d, Gja1, Ace, Agt, Agtr1, Ren, Agtrap.]

Fig S5. Timeseries analysis of gene expression dynamics. Many genes showed significantly different expression patterns between autonomic dysfunction and control phenotypes (q < 0.1, -log q > 1).
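The threshold in the caption implies a base-10 logarithm, since q < 0.1 is equivalent to −log10 q > 1. The sketch below shows one common way to obtain q values from p values and apply that threshold. The Benjamini-Hochberg adjustment is an assumption here, as this supplement does not state how the q values were computed, and the function names are hypothetical.

```python
import math


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        q[i] = running
    return q


def significant(qvals, q_max=0.1):
    """Indices with q < q_max, i.e. -log10(q) > -log10(q_max)."""
    cutoff = -math.log10(q_max)
    return [i for i, v in enumerate(qvals) if v > 0 and -math.log10(v) > cutoff]
```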

[Figure S6 image. (A) Histograms of count versus Spearman rank correlation for the autonomic dysfunction and control phenotypes. (B) Spearman rank correlation versus log(λ) for α = 0, 0.2, 0.4, 0.6, 0.8, and 1, for each phenotype.]

Fig S6. Correlational analysis of system identification robustness. High correlations (> 0.7) between identified networks were observed over an expansive range of regularization parameter space. (A) Spearman rank correlation coefficient histogram and (B) Correlation values as a function of regularization parameter values for λ and α.
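A Spearman rank correlation between two identified networks can be sketched as the Pearson correlation of the ranks of their coefficient vectors. This is a minimal sketch assuming each network is flattened into a vector of interaction coefficients in a common order; exactly which vectors were compared in the analysis is not specified in this supplement.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```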

[Figure S7 image. For the autonomic dysfunction and control phenotypes: (A) average path length, (B) clustering coefficient, and (C) power law exponent, each plotted against log(λ) for α = 0, 0.2, 0.4, 0.6, 0.8, and 1.]

Fig S7. Graph theoretic analysis of network identification robustness. (A) Path length, (B) clustering coefficients, and (C) power law exponents are shown for a range of regularization parameters.
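Two of the graph measures in Fig S7 can be sketched on a small adjacency structure. The sketch assumes an undirected, unweighted graph stored as a dictionary from node to a set of neighbors; the identified networks are directed, so this is a simplification, and the power law exponent is not covered.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    coeffs = []
    for node, neighbors in adj.items():
        nbrs = list(neighbors)
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        # Count edges among the neighbors of this node.
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)
```

Evaluating both functions on the network identified at each (λ, α) pair yields curves of the kind shown in panels (A) and (B).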

[Figure S8 image. Panels A and B each show networks for the autonomic dysfunction and control phenotypes; legend: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle).]

Fig S8. Graphical representations of network interactions. (A) Phenotype-specific multi-organ networks. (B) Subnetworks including interactions between the brainstem and adrenal gland.

[Figure S9 image. Autonomic dysfunction network; legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle) and functional process (Inflam, ANS, RAS).]

Fig S9. Network illustrating influences of the brainstem on the other organs in the autonomic dysfunction phenotype. Note that the nodes are organized as in Fig S10 for comparison.

[Figure S10 image. Control network; legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle) and functional process (Inflam, ANS, RAS).]

Fig S10. Network illustrating influences of the brainstem on the other organs in the control phenotype. Note that the nodes are organized as in Fig S9 for comparison.

[Figure S11 image. Expression profiles (normalized expression, scale 0 to 1) versus age (4 to 16 wk) for each gene, shown for the autonomic dysfunction and control phenotypes; legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle) and functional process (Inflam, ANS, RAS); Cxcr3 profiles are labeled.]

Fig S11. Organized sequence of gene expression valleys in autonomic dysfunction. Expression profiles were organized according to the sequence of valleys observed for the autonomic dysfunction phenotype (left).

[Figure S12 image. Expression profiles (normalized expression, scale 0 to 1) versus age (4 to 16 wk) for each gene, shown for the autonomic dysfunction and control phenotypes; legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle) and functional process (Inflam, ANS, RAS); Cxcr3 profiles are labeled.]

Fig S12. Organized sequence of gene expression peaks is disrupted in the autonomic dysfunction phenotype. Expression profiles were organized according to the sequence of peaks observed for the control phenotype (left).

[Figure S13 image. Expression profiles (normalized expression, scale 0 to 1) versus age (4 to 16 wk) for each gene, shown for the autonomic dysfunction and control phenotypes; legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle) and functional process (Inflam, ANS, RAS).]

Fig S13. Organized sequence of gene expression valleys is disrupted in the autonomic dysfunction phenotype. Expression profiles were organized according to the sequence of valleys observed for the control phenotype (left).

[Figure S14 image. Panels A to D: scatter plots comparing peak time or valley time (wk, 4 to 16) between the autonomic dysfunction and control phenotypes, with individual genes labeled and organs distinguished (Adrenal gland, Brainstem, Kidney, Liver, Left ventricle). Panels E and F: schematic quadrants labeled "similar dynamics", "p/v", and "v/p".]

Fig S14. Dynamics comparison for autonomic dysfunction and control phenotypes. Genes are shown that exhibit (A) peaks in both phenotypes, (B) peaks in autonomic dysfunction but valleys for the control phenotype, (C) valleys for autonomic dysfunction but peaks for the control phenotype, and (D) valleys for both phenotypes. Straight black lines correspond to the unity line. (E) Conceptual overview of the profiles observed in panel (A, peaks on both axes) and panel (D, valleys on both axes). The top left quadrant of panel (E) shows two sets of profiles: in the first, the control profile shows an early peak while the disease profile shows a late peak; in the second, the control shows an early valley and the disease profile shows a late valley. Respectively, these two profiles in the upper left quadrant of panel (E) correspond to the upper left quadrants of panels (A) and (D). These sets of profiles correspond to preserved waveforms but temporal shifts between the expression in control versus disease phenotypes. Panel (F) can be interpreted in the same way as panel (E). Each quadrant of (F) exhibits pairs of dynamic profiles corresponding to either panel (B, top pair) or (C, bottom pair). The extreme off-diagonal profiles depict instances in which the dynamic patterns are inverted for disease relative to control.

[Figure S15 image. Control brainstem network; legend: functional process (Inflam, ANS, RAS).]

Fig S15. Control brainstem network. This representation is shown for comparison with main text Figure 7A.

[Figure S16 image. Autonomic dysfunction brainstem network; legend: functional process (Inflam, ANS, RAS).]

Fig S16. Autonomic dysfunction brainstem feedforward motifs. All three-node feedforward motifs were identified by motif analysis.
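Three-node feedforward motifs can be enumerated directly from a directed edge list. The sketch below assumes the motif is defined as a → b, b → c, and a → c among three distinct nodes, ignoring edge signs; it is not the motif analysis procedure used by the authors, which is not described in this supplement.

```python
def feedforward_motifs(edges):
    """Return sorted (a, b, c) triples with edges a -> b, b -> c and a -> c."""
    targets = {}
    for src, dst in set(edges):
        targets.setdefault(src, set()).add(dst)
    motifs = []
    for a, a_targets in targets.items():
        for b in a_targets:
            if b == a:
                continue  # ignore self-loops
            for c in targets.get(b, ()):
                # c must be a third node that a also regulates directly.
                if c != a and c != b and c in a_targets:
                    motifs.append((a, b, c))
    return sorted(motifs)
```

Each edge is a (source, target) pair, so an interaction coefficient labeled k_to_from corresponds to the pair (from, to).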

[Figure S17 image. Network motif with upregulation and downregulation edges and nodes marked by functional process (Inflam, ANS, RAS), and expression profiles (scale 0 to 1) versus age (4 to 16 wk) for the autonomic dysfunction and control phenotypes.]

Fig S17. Example three-node network with inconsistent kinetics. (A) Network motif and (B) simulation traces.

[Figure S18 image. Schematics (scale bar: 500 bp) of the regulatory regions of Adrb1, Agt, Il1b, and Tnf, showing each TSS, TFBS, and SNV (at TSS − 0.3, 0.7, 1.0, or 1.4 kb), with sequence logos (bits) for the Ebf1, Tfap2a, and Ybx1 TFBS motifs. Aligned sequence pairs shown in the figure (Control / Autonomic dysfunction):

TGGTGCATACAG TCCCCTGGG CAC / CGGTGCATACAG TCCCCTGGG CAC
TTG TCCCCTGTG AGAAGTCTGGGAGATGAAGGCCTCACGGTCAT / TTG TCCCCTGTG AGAAGTCTGGGAGATGAAGGCCTCACGGTCAG
CAT TCCCAGGGG GCAGATCAGAGGACCACAGCCTGAGG / CAT TCCCAGGGG GCAGATCAGAGGACCACAGCCTGAGA
TCTCAGGAGATGGGTAGTGACCAGA GAGACTGGCCAT GCG / GCTCAGGAGATGGGTAGTGACCAGA GAGACTGGCCAT GCG
TTC CTGATTGGCCCC GGA / TTC CTGATTGGCCCC AGA]

Fig S18. Autonomic dysfunction-specific SNVs in regulatory regions. Motif signatures for transcription factors and spatial proximities between TFBSs, TSSs, and SNVs.
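A single nucleotide difference between a control sequence and an autonomic dysfunction sequence can be located by comparing the two position by position. This is a minimal sketch: it assumes the two sequences are already aligned and of equal length, and it treats the spaces in the figure's sequences as layout separators, which is an assumption. The function name is hypothetical, and positions are 0-based offsets within the displayed fragment, not genomic coordinates.

```python
def find_snvs(control, variant):
    """Return (position, control_base, variant_base) for each mismatch."""
    # Spaces are treated as layout separators and removed (an assumption).
    ref = control.replace(" ", "")
    alt = variant.replace(" ", "")
    if len(ref) != len(alt):
        raise ValueError("sequences must have equal length after removing spaces")
    return [(i, r, a) for i, (r, a) in enumerate(zip(ref, alt)) if r != a]


# One control / autonomic dysfunction pair shown in Fig S18.
snvs = find_snvs("TTC CTGATTGGCCCC GGA", "TTC CTGATTGGCCCC AGA")
```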

Supplementary Table

Table 1. Primer sequences

Gene      ID             Sequence (5' - 3')
Ace       NM_012544      f: GACAACTATCCAGAGGGAATTGA
                         r: CACAACACCTTGGCTGTCC
Actb      NM_031144      f: CTGGCTCCTAGCACCATGA
                         r: TAGAGCCACCAATCCACACA
Adra1a    NM_017191      f: CACTTCTCAGTGAGGCTGCT
                         r: AGGCTTGAAATCCGGGAAGA
Adra1b    NM_016991      f: TTCTTCATCGCTCTCCCGCT
                         r: GCTGTTGAAGTAGCCCAGCC
Adrb1     NM_012701      f: AGACGTGCTATGTGTGACGG
                         r: CTCTGGTAGCGAAAGGGCAG
Adrb2     NM_012492      f: GTGGATTGTGTCGGGCCTTA
                         r: GCGATAGCATAGGCCTGGTT
Agt       NM_134432      f: CACCTACGTTCACTTCCAAGG
                         r: AGAACTCATGGAGCCCAGTC
Agtr1a    NM_030985      f: CTCTCTCAGCTCTGCCACATTC
                         r: TTCGAAATCCACTTGACCTGGTG
Agtrap    NM_001007654   f: ATTGGCATGTTTCTTGGTGGC
                         r: CAACGGCAACGCTTGAGTAG
Cacna1d   NM_017298      f: GGCAGAAGACATAGATCCTGAGA
                         r: ACTGGTGGGCATGCTAGTGT
Ccl5      NM_031116      f: GTGCCCACGTGAAGGAGTAT
                         r: TCGAGTGACAAAGACGACTGC
Cxcr3     NM_053415      f: TAGATGCCTCGGACATTGCC
                         r: AGGAGGCTGTAGAGGACTGG
Dbh       NM_013158      f: ACTACTGTCGCCACGTGCT
                         r: ACCGGCTTCTTCTGGGTAGT
Gja1      NM_012567      f: ACTTCAGCCTCCAAGGAGTTC
                         r: CATGTCTGGGCACCTCTCTTT
Hmgb1     NM_012963      f: GGCGGCTGTTTTGTTGACAT
                         r: ACCCAAAATGGGCAAAAGCA
Il1a      NM_017019      f: AGGATCGTCAAGCAGGAGTT
                         r: TTTAGAGTCGTCTCCTCCCGA
Il1b      NM_031512      f: AGGCTGACAGACCCCAAAAG
                         r: CTCCACGGGCAAGACATAGG
Il6       NM_012589      f: TCTGGTCTTCTGGAGTTCCG
                         r: AGCATTGGAAGTTGGGGTAGG
Il10      NM_012854      f: TTGAACCACCCGGCATCTAC
                         r: CCAAGGAGTTGCTCCCGTTA
Ren       NM_012642      f: GCCAGCTTTGGACGAATCTT
                         r: CCCCATTCAGCACTGATCCT
Tgfb1     NM_021578      f: TGGAAAGGGCTCAACACCTG
                         r: AGAAGTTGGCATGGTAGCCC
Th        NM_012740      f: GCCTGTGTACTTTGTGTCCGAGAG
                         r: TACGAGAGGCATAGTTCCTGAGC
Tnf       NM_012675      f: GTCGTAGCAAACCACCAAGC
                         r: TGTGGGTGAGGAGCACATAG

References

1. Gómez HF, Hucka M, Keating SM, Nudelman G, Iber D, Sealfon SC. MOCCASIN: converting MATLAB ODE models to SBML. Bioinformatics (Oxford, England). 2016;32(12):1905–1906.
