A data-driven modeling approach to identify disease-specific multi-organ networks driving physiological dysregulation
Warren D. Anderson 1,Y,¤, Danielle DeCicco 1,Y, James S. Schwaber 1, Rajanikanth Vadigepalli 1,*
1 Daniel Baugh Institute for Functional Genomics and Computational Biology, Department of Pathology, Anatomy, and Cell Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA
Y These authors contributed equally to this work.
¤ Current Address: Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
* Corresponding author: [email protected]
Supporting Information • Supplementary Files • Supplementary Figures • Supplementary Table
PLOS
Supplementary Files
S1 File
S1 File.csv: This file contains un-normalized raw gene expression data (Ct values).
S2 File
S2 File.csv: This file contains normalized gene expression data.
S3 File
S3 File.pdf: This file contains normalized gene expression data plotted along with model simulation traces.
S4 File
S4 File.xlsx: This file contains names for parameters and dynamic variables. As an example of our parameter label convention, the interaction coefficient denoting the directed influence in which gene g2 from organ r2 regulates gene g1 in organ r1 is labeled k_r1g1_r2g2 (i.e., k_to_from). Initial conditions are included in another tab. SHR denotes the spontaneously hypertensive rat (autonomic dysfunction) and WKY denotes the Wistar Kyoto control phenotype.
S5 File
S5 File.xml: This file contains the dynamic model for the autonomic dysfunction phenotype in the Systems Biology Markup Language (SBML) format. The model was converted from MATLAB to SBML using MOCCASIN [1].
S6 File
S6 File.xml: This file contains the dynamic model for the control phenotype in the Systems Biology Markup Language (SBML) format. The model was converted from MATLAB to SBML using MOCCASIN [1].
S7 File
S7 File.mat: This file contains the parameter values and initial conditions, along with other basic information needed to simulate the autonomic dysfunction and control models in MATLAB.
S8 File
S8 File.m: This file contains MATLAB simulation code.
S9 File
S9 File.RData: This file contains the parameter values and initial conditions for simulating the autonomic dysfunction and control models in R.
S10 File
S10 File.R: This file contains R simulation code.
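The parameter label convention described for S4 File can be illustrated with a small parser. This is a minimal sketch, not the authors' code: it assumes labels are underscore-delimited in the form k_to_from and that the organ/gene tokens themselves contain no underscores. The function name `parse_interaction_label` and the example label are hypothetical.

```python
def parse_interaction_label(label):
    """Split an interaction-coefficient label into (target, source).

    Assumes the k_to_from convention: the first token after 'k' is the
    regulated organ/gene, the second is the regulating organ/gene.
    """
    parts = label.split("_")
    if len(parts) != 3 or parts[0] != "k":
        raise ValueError("expected a label of the form k_to_from: %r" % label)
    _, target, source = parts
    return target, source


# Hypothetical label: gene g2 in organ r2 regulates gene g1 in organ r1.
target, source = parse_interaction_label("k_r1g1_r2g2")
print("regulated:", target, "| regulator:", source)
```

Reading the label as "to, then from" matters when building a network from the parameter table, because reversing the two tokens reverses the direction of every edge.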
Supplementary Figures
[Figure S1 image. (B) Histogram of Count versus Stability rank for median. (C, D) PCA plots of PC2 versus PC1, with gene labels, for the full data set and for each organ (Adrenal, Brainstem, Kidney, Liver, Ventricle). (E) Il1a expression (−ddCt) and Hmgb1 expression (−ddCt) versus Age (wk).]
Fig S1. Sampling, normalization, and outlier evaluation. (A) Table detailing the animal sampling and the organs utilized in our analysis for each animal. (B) Stability ranks of median expression values were considered for each organ/age combination. For the majority of organ/age combinations, the median expression level ranked among the most stable (≤ 12/22) in comparison with the stability levels of individual genes. (C) PCA was applied to the entire data set (all genes/organs), and the data are plotted along with the variability accounted for by the first two PCs. The smooth circle shows the 99% confidence interval for the mean of a bivariate Gaussian distribution characterized by the displayed data. Note that this interval contains the majority of the data, and the few values outside of this interval are in close proximity to it. (D) PCA was implemented separately for each organ. Specific colors refer to the same animals in all plots. For instance, the three gray dots in the Adrenal PCA plot refer to three animals that are relatively distant from the other animal samples in this analysis. However, the PC projections of these specific animals in the PCAs applied to the data from other organs show that these animal samples do not impose consistent biases. (E) Sample expression data, labeled as in (D), for the animal samples marked in the Adrenal and Ventricle PCAs.
[Figure S2 image. Panels: Autonomic dysfunction, Control. Axes: Log (Jsim) versus Log (λ). Legend: α = 0, 0.2, 0.4, 0.6, 0.8, 1.]
Fig S2. Robustness of regularized regression-based system identification. Error between simulated gene expression levels and experimentally measured mean expression values varies minimally with respect to regularization parameters. Log error is plotted with respect to the log λ value for a range of α levels.
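To make the roles of the two regularization parameters concrete, the following sketch evaluates an elastic-net style penalty. This is an assumption: the caption does not name the penalty, and the form used here (λ scaling the overall strength, α mixing the L1 and L2 terms, as in common elastic-net software) is only one plausible reading. The function name and the coefficient values are hypothetical.

```python
def elastic_net_penalty(coefficients, lam, alpha):
    """Penalty lam * (alpha * ||k||_1 + (1 - alpha) / 2 * ||k||_2^2).

    Assumed form only: alpha = 1 gives a pure L1 (lasso) penalty and
    alpha = 0 gives a pure L2 (ridge) penalty.
    """
    l1 = sum(abs(c) for c in coefficients)
    l2_squared = sum(c * c for c in coefficients)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


# Hypothetical interaction coefficients, scanned over the alpha levels
# listed in the figure legend at a fixed lambda.
k = [1.0, -2.0]
for alpha in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(alpha, elastic_net_penalty(k, lam=2.0, alpha=alpha))
```

Under this reading, scanning log λ at each α level traces one curve per α, which is the layout the figure uses to show how little the simulation error changes.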
[Figure S3 image. A 2 × 2 contingency table with cells (Kbest < 0 and Kcomp > 0), (Kbest > 0 and Kcomp > 0), (Kbest < 0 and Kcomp < 0), and (Kbest > 0 and Kcomp < 0), shown with the equation for the odds ratio computed from these four cells.]
Fig S3. Evaluation of sign consistency of interaction coefficients across multiple iterations of system identification. The equation illustrates the computation of the odds ratio based on the contingency table.
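A sign-consistency odds ratio of this kind can be sketched as follows. The orientation is an assumption: here the ratio is the product of the two sign-concordant cell counts divided by the product of the two sign-discordant cell counts, so values above 1 indicate consistent signs. Skipping zero-valued coefficients is also an assumption, and the function names and example coefficients are hypothetical.

```python
def sign_contingency(k_best, k_comp):
    """Count sign combinations of paired coefficients (zeros are skipped)."""
    counts = {("+", "+"): 0, ("+", "-"): 0, ("-", "+"): 0, ("-", "-"): 0}
    for best, comp in zip(k_best, k_comp):
        if best == 0 or comp == 0:
            continue  # assumption: absent interactions carry no sign
        key = ("+" if best > 0 else "-", "+" if comp > 0 else "-")
        counts[key] += 1
    return counts


def odds_ratio(counts):
    """Concordant product over discordant product (assumed orientation)."""
    concordant = counts[("+", "+")] * counts[("-", "-")]
    discordant = counts[("+", "-")] * counts[("-", "+")]
    if discordant == 0:
        return float("inf")
    return concordant / discordant


# Hypothetical coefficients from a best fit and a comparison fit.
k_best = [1.0, 2.0, -1.0, -3.0, 4.0, -2.0]
k_comp = [0.5, 1.0, -2.0, -1.0, -1.0, 3.0]
print(odds_ratio(sign_contingency(k_best, k_comp)))
```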
[Figure S4 image. Histograms of Count versus ΔE for three edge categories: Added, Removed, and Switched.]
Fig S4. Differential network analysis of changes in gene-gene interactions in autonomic dysfunction. Black bars correspond to edges considered to be differentially regulated in autonomic dysfunction.
[Figure S5 image. −log q values (scale: 1 to 3) for each gene (Il1b, Tgfb1, Il1a, Il6, Il10, Ccl5, Hmgb1, Tnf, Cxcr3, Th, Dbh, Adrb2, Adrb1, Adra1a, Adra2a, Cacna1d, Gja1, Ace, Agt, Agtr1, Ren, Agtrap) in each organ (Adrenal, Brainstem, Kidney, Liver, Ventricle).]
Fig S5. Time series analysis of gene expression dynamics. Many genes showed significantly different expression patterns between the autonomic dysfunction and control phenotypes (q < 0.1, equivalently −log10 q > 1).
[Figure S6 image. (A) Histograms of Count versus Spearman rank correlation for Autonomic dysfunction and Control. (B) Spearman rank correlation versus log(λ) for Autonomic dysfunction and Control. Legend: α = 0, 0.2, 0.4, 0.6, 0.8, 1.]
Fig S6. Correlational analysis of system identification robustness. High correlations (> 0.7) between identified networks were observed over a wide range of regularization parameter values. (A) Histograms of Spearman rank correlation coefficients. (B) Correlation values as a function of the regularization parameters λ and α.
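As an illustration of the comparison in this figure, the sketch below computes a Spearman rank correlation between two vectors of interaction coefficients. It is a generic standard-library implementation (average ranks for ties, then Pearson correlation of the ranks), not the authors' code, and the coefficient vectors are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical coefficients identified under two regularization settings.
network_a = [0.8, -0.1, 0.3, -0.6, 0.05]
network_b = [0.7, -0.2, 0.4, -0.5, 0.10]
print(spearman(network_a, network_b))
```

A rank-based measure is a natural fit here because regularization shrinks coefficient magnitudes, while the ordering of strong and weak interactions can remain stable.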
[Figure S7 image. (A) Average path length, (B) Clustering coefficient, and (C) Power law exponent, each plotted versus log(λ) for Autonomic dysfunction and Control. Legend: α = 0, 0.2, 0.4, 0.6, 0.8, 1.]
Fig S7. Graph theoretic analysis of network identification robustness. (A) Path length, (B) clustering coefficients, and (C) power law exponents are shown for a range of regularization parameters.
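Two of the graph metrics named in this caption can be sketched on a toy network. This is a simplified illustration that treats the network as undirected and unweighted, which is an assumption and may differ from how the identified (directed) networks were analyzed. The node names are hypothetical.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(d for target, d in dist.items() if target != source)
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Mean local clustering coefficient (0 for nodes with < 2 neighbors)."""
    local = []
    for node in adj:
        neighbors = list(adj[node])
        k = len(neighbors)
        if k < 2:
            local.append(0.0)
            continue
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if neighbors[j] in adj[neighbors[i]]
        )
        local.append(2.0 * links / (k * (k - 1)))
    return sum(local) / len(local)


# Toy undirected graph: a triangle (A, B, C) with one extra node D on C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(average_path_length(adj), clustering_coefficient(adj))
```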
[Figure S8 image. (A) Autonomic dysfunction and Control networks. (B) Autonomic dysfunction and Control subnetworks. Organ legend: Adrenal, Brainstem, Kidney, Liver, Ventricle.]
Fig S8. Graphical representations of network interactions. (A) Phenotype-specific multi-organ networks. (B) Subnetworks including interactions between the brainstem and adrenal gland.
[Figure S9 image. Autonomic dysfunction network. Legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle); functional process (Inflam, ANS, RAS).]
Fig S9. Network illustrating influences of the brainstem on the other organs in the autonomic dysfunction phenotype. Note that the nodes are organized as in Fig S10 for comparison.
[Figure S10 image. Control network. Legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle); functional process (Inflam, ANS, RAS).]
Fig S10. Network illustrating influences of the brainstem on the other organs in the control phenotype. Note that the nodes are organized as in Fig S9 for comparison.
[Figure S11 image. Expression profiles by gene versus Age (wk; 4, 8, 12, 16) for Autonomic dysfunction and Control. Color scale: Normalized expression, 0 to 1. Legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle); functional process (Inflam, ANS, RAS). Cxcr3 is labeled.]
Fig S11. Organized sequence of gene expression valleys in autonomic dysfunction. Expression profiles were organized according to the sequence of valleys observed for the autonomic dysfunction phenotype (left).
[Figure S12 image. Expression profiles by gene versus Age (wk; 4, 8, 12, 16) for Autonomic dysfunction and Control. Color scale: Normalized expression, 0 to 1. Legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle); functional process (Inflam, ANS, RAS). Cxcr3 is labeled.]
Fig S12. Organized sequence of gene expression peaks is disrupted in the autonomic dysfunction phenotype. Expression profiles were organized according to the sequence of peaks observed for the control phenotype (left).
[Figure S13 image. Expression profiles by gene versus Age (wk; 4, 8, 12, 16) for Control and Autonomic dysfunction. Color scale: Normalized expression, 0 to 1. Legends: organ (Adrenal, Brainstem, Kidney, Liver, Ventricle); functional process (Inflam, ANS, RAS).]
Fig S13. Organized sequence of gene expression valleys is disrupted in the autonomic dysfunction phenotype. Expression profiles were organized according to the sequence of valleys observed for the control phenotype (left).
[Figure S14 image. (A) Peak time (wk) versus Peak time (wk). (B, C) Peak time (wk) plotted against Valley time (wk). (D) Valley time (wk) versus Valley time (wk). Axes span 4 to 16 wk and compare Autonomic dysfunction with Control; points are labeled by gene. Organ legend: Adrenal gland, Brainstem, Kidney, Liver, Left ventricle. (E, F) Schematic quadrants labeled similar dynamics, p/v, and v/p.]
Fig S14. Dynamics comparison for autonomic dysfunction and control phenotypes. Genes are shown that exhibit (A) peaks in both phenotypes, (B) peaks in autonomic dysfunction but valleys in the control phenotype, (C) valleys in autonomic dysfunction but peaks in the control phenotype, and (D) valleys in both phenotypes. Straight black lines correspond to the unity line. (E) Conceptual overview of the profiles observed in panel (A, peaks on both axes) and panel (D, valleys on both axes). The top left quadrant of panel (E) shows two sets of profiles: in the first, the control profile shows an early peak while the disease profile shows a late peak; in the second, the control profile shows an early valley while the disease profile shows a late valley. These two sets of profiles correspond, respectively, to the upper left quadrants of panels (A) and (D). They represent waveforms that are preserved but shifted in time between the control and disease phenotypes. Panel (F) can be interpreted in the same way as panel (E). Each quadrant of (F) shows pairs of dynamic profiles corresponding to either panel (B, top pair) or panel (C, bottom pair). The extreme off-diagonal profiles depict instances in which the dynamic patterns are inverted for disease relative to control.
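The peak and valley times compared in this figure can be illustrated with a simple extremum search. This is a sketch under stated assumptions, not the authors' procedure: each profile is min-max normalized to the range 0 to 1, and the peak (or valley) time is taken as the age at the global maximum (or minimum). The sampling ages and expression values below are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]; a flat profile maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def extremum_time(ages, values, kind="peak"):
    """Age at the global maximum ('peak') or minimum ('valley')."""
    if kind not in ("peak", "valley"):
        raise ValueError("kind must be 'peak' or 'valley'")
    normalized = min_max_normalize(values)
    target = max(normalized) if kind == "peak" else min(normalized)
    return ages[normalized.index(target)]  # first occurrence on ties


# Hypothetical profiles: same waveform, later peak in the disease profile.
ages = [4, 6, 8, 10, 12, 14, 16]
control = [0.2, 0.6, 1.0, 0.7, 0.4, 0.3, 0.2]
disease = [0.1, 0.2, 0.3, 0.6, 1.0, 0.5, 0.3]
print(extremum_time(ages, control), extremum_time(ages, disease))
```

In this toy case the two peak times differ, so the gene would fall off the unity line in a peak-versus-peak panel, which is the preserved-waveform, shifted-timing case the caption describes.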
[Figure S15 image. Control brainstem network. Functional process legend: Inflam, ANS, RAS.]
Fig S15. Control brainstem network. This representation is shown for comparison with main text Figure 7A.
[Figure S16 image. Autonomic dysfunction brainstem network. Functional process legend: Inflam, ANS, RAS.]
Fig S16. Autonomic dysfunction brainstem feedforward motifs. All three-node feedforward motifs were identified by motif analysis.
[Figure S17 image. Network motif with a functional process legend (Inflam, ANS, RAS) and edge types (upregulation, downregulation), shown with expression profiles (0 to 1) versus Age (wk; 4 to 16) for Autonomic dysfunction and Control.]
Fig S17. Example three-node network with inconsistent kinetics. (A) Network motif and (B) simulation traces.
[Figure S18 image. Promoter schematics (scale bar: 500 bp) for Adrb1, Agt, Il1b, and Tnf showing the TSS, TFBSs, and SNVs labeled TSS - 1.4 kb, TSS - 1.0 kb, TSS - 0.7 kb, and TSS - 0.3 kb, with sequence logos (Bits) for the Ebf1, Tfap2a, and Ybx1 TFBS motifs. Aligned sequences shown in the figure:
Control: TGGTGCATACAG TCCCCTGGG CAC
Autonomic dysfunction: CGGTGCATACAG TCCCCTGGG CAC
Control: TGGTGCATA CAGTCCCCTGGGCA CAC
Autonomic dysfunction: CGGTGCATA CAGTCCCCTGGGCA CAC
Control: TTG TCCCCTGTG AGAAGTCTGGGAGATGAAGGCCTCACGGTCAT
Autonomic dysfunction: TTG TCCCCTGTG AGAAGTCTGGGAGATGAAGGCCTCACGGTCAG
Control: CAT TCCCAGGGG GCAGATCAGAGGACCACAGCCTGAGG
Autonomic dysfunction: CAT TCCCAGGGG GCAGATCAGAGGACCACAGCCTGAGA
Control: TCTCAGGAGATGGGTAGTGACCAGA GAGACTGGCCAT GCG
Autonomic dysfunction: GCTCAGGAGATGGGTAGTGACCAGA GAGACTGGCCAT GCG
Control: TTC CTGATTGGCCCC GGA
Autonomic dysfunction: TTC CTGATTGGCCCC AGA]
Fig S18. Autonomic dysfunction-specific SNVs in regulatory regions. Motif signatures for transcription factors and spatial proximities between TFBSs, TSSs, and SNVs.
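The control and autonomic dysfunction sequences shown in this figure differ at single positions. A minimal sketch of locating such a difference between two aligned, equal-length sequences follows. The function name `find_mismatches` is hypothetical, and the example uses one sequence pair from the figure with the display spaces removed.

```python
def find_mismatches(reference, variant):
    """Return (0-based position, reference base, variant base) per mismatch.

    Assumes the sequences are already aligned and of equal length, so
    insertions and deletions are not handled.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, ref_base, var_base)
        for position, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


# Sequence pair taken from the figure (spaces removed).
control = "TTCCTGATTGGCCCCGGA"
dysfunction = "TTCCTGATTGGCCCCAGA"
print(find_mismatches(control, dysfunction))
```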
Supplementary Table
Table 1. Primer sequences (5' - 3'; f: forward, r: reverse)
Gene      ID             f                          r
Ace       NM_012544      GACAACTATCCAGAGGGAATTGA    CACAACACCTTGGCTGTCC
Actb      NM_031144      CTGGCTCCTAGCACCATGA        TAGAGCCACCAATCCACACA
Adra1a    NM_017191      CACTTCTCAGTGAGGCTGCT       AGGCTTGAAATCCGGGAAGA
Adra1b    NM_016991      TTCTTCATCGCTCTCCCGCT       GCTGTTGAAGTAGCCCAGCC
Adrb1     NM_012701      AGACGTGCTATGTGTGACGG       CTCTGGTAGCGAAAGGGCAG
Adrb2     NM_012492      GTGGATTGTGTCGGGCCTTA       GCGATAGCATAGGCCTGGTT
Agt       NM_134432      CACCTACGTTCACTTCCAAGG      AGAACTCATGGAGCCCAGTC
Agtr1a    NM_030985      CTCTCTCAGCTCTGCCACATTC     TTCGAAATCCACTTGACCTGGTG
Agtrap    NM_001007654   ATTGGCATGTTTCTTGGTGGC      CAACGGCAACGCTTGAGTAG
Cacna1d   NM_017298      GGCAGAAGACATAGATCCTGAGA    ACTGGTGGGCATGCTAGTGT
Ccl5      NM_031116      GTGCCCACGTGAAGGAGTAT       TCGAGTGACAAAGACGACTGC
Cxcr3     NM_053415      TAGATGCCTCGGACATTGCC       AGGAGGCTGTAGAGGACTGG
Dbh       NM_013158      ACTACTGTCGCCACGTGCT        ACCGGCTTCTTCTGGGTAGT
Gja1      NM_012567      ACTTCAGCCTCCAAGGAGTTC      CATGTCTGGGCACCTCTCTTT
Hmgb1     NM_012963      GGCGGCTGTTTTGTTGACAT       ACCCAAAATGGGCAAAAGCA
Il1a      NM_017019      AGGATCGTCAAGCAGGAGTT       TTTAGAGTCGTCTCCTCCCGA
Il1b      NM_031512      AGGCTGACAGACCCCAAAAG       CTCCACGGGCAAGACATAGG
Il6       NM_012589      TCTGGTCTTCTGGAGTTCCG       AGCATTGGAAGTTGGGGTAGG
Il10      NM_012854      TTGAACCACCCGGCATCTAC       CCAAGGAGTTGCTCCCGTTA
Ren       NM_012642      GCCAGCTTTGGACGAATCTT       CCCCATTCAGCACTGATCCT
Tgfb1     NM_021578      TGGAAAGGGCTCAACACCTG       AGAAGTTGGCATGGTAGCCC
Th        NM_012740      GCCTGTGTACTTTGTGTCCGAGAG   TACGAGAGGCATAGTTCCTGAGC
Tnf       NM_012675      GTCGTAGCAAACCACCAAGC       TGTGGGTGAGGAGCACATAG
References
1. Gómez HF, Hucka M, Keating SM, Nudelman G, Iber D, Sealfon SC. MOCCASIN: converting MATLAB ODE models to SBML. Bioinformatics (Oxford, England). 2016;32(12):1905–1906.