Fast-SL: An efficient algorithm to identify synthetic lethals in metabolic networks
Karthik Raman
Department of Biotechnology, Indian Institute of Technology Madras
https://home.iitm.ac.in/kraman/lab/
2015 NNMCB National Meeting, December 27, 2015
Introduction
Fast-SL
Results
Conclusions
Genome-Scale Metabolic Networks (GSMNs)
▶ GSMNs account for the functions of all the known metabolic genes in an organism
▶ Constructed primarily from the genome sequence with annotations from enzyme and pathway databases
▶ 100+ GSMNs are presently available
What can GSMNs tell us?
McCloskey D et al (2013) Molecular Systems Biology 9:661
[Figure: uses of the E. coli reconstruction across 248 total studies, by category: (A) metabolic engineering, 68 studies, 27.4%; (B) model-driven discovery, 18 studies, 7.3%; (C) prediction of cellular phenotypes, 64 studies, 25.8%; (D) analysis of biological network properties, 72 studies, 29.0%; (E) studies of evolutionary processes, 19 studies, 7.7%; (F) interspecies interaction, 7 studies, 2.8%.]
What can GSMNs tell us?
▶ Predict potential drug targets, by identifying essential and synthetic lethal genes
Example: Hartman HB, Fell DA, Rossell S, Jensen PR, Woodward MJ, Thorndahl L, Jelsbak L, Olsen JE, Raghunathan A, Daefler S and Poolman MG, "Identification of potential drug targets in Salmonella enterica sv. Typhimurium using metabolic modelling and experimental validation" (Editor's Choice)
What are Synthetic Lethals?
Synthetic lethal gene (or reaction) sets are sets of genes where only the simultaneous removal of all genes in the set abolishes growth.
[Figure: a two-gene example with genes abc and pqr, shown for four genotypes: wild-type, Δpqr, Δabc and the double deletion ΔabcΔpqr.]
The concept of synthetic lethality can be extended to higher orders, e.g. triplets.
Why Identify Synthetic Lethals?
▶ Synthetic lethals find applications in
  ▶ Understanding gene function and functional associations¹
  ▶ Combinatorial drug targets against pathogens²
  ▶ Cancer therapy³
¹Ooi SLL et al (2006) Trends Genet 22:56–63
²Hsu KC et al (2013) PLoS Comput Biol 9:e1003127
³Kaelin WG (2005) Nat Rev Cancer 5:689–698
How to Identify Synthetic Lethals?
▶ Yeast synthetic lethals have been identified experimentally using yeast synthetic genetic arrays¹, ²
▶ Previous in silico approaches have built on the framework of Flux Balance Analysis, and are therefore restricted to metabolic genes
¹Tong AHY et al (2001) Science 294:2364–2368
²Tong AHY et al (2004) Science 303:808–813
What is Flux Balance Analysis?
▶ Effective constraint-based method to study genome-scale metabolic networks¹
▶ The mass balance constraints in a system of reactions can be represented by a system of linear equations involving reaction fluxes at steady state
▶ The system is under-determined, so we compute the flux distribution that maximises biomass: mathematically, this is a linear programming problem
\[
\begin{aligned}
\max\ & v_{bio} && \text{(the biomass flux)} \\
\text{s.t.}\ & \sum_j s_{ij} v_j = 0 && \forall i \in M \ \text{(set of metabolites)} \\
& LB_j \le v_j \le UB_j && \forall j \in J \ \text{(set of reactions)}
\end{aligned}
\]
¹Varma A & Palsson BO (1994) Applied and Environmental Microbiology 60:3724–3731
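The linear programme above can be tried out on a very small example. The following is a minimal sketch, not the implementation used in this work: the five-reaction network, its bounds and the helper name `fba` are all hypothetical, and SciPy's `linprog` stands in for whichever LP solver one prefers.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Rows are metabolites A, B, C; columns are reactions:
# R0: -> A (uptake), R1: A -> B, R2: A -> C, R3: C -> B, R4: B -> (biomass)
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0,  1, -1],   # B
    [0,  0,  1, -1,  0],   # C
], dtype=float)
LB = np.zeros(5)                                     # all reactions irreversible
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])  # uptake limited to 10
BIO = 4                                              # index of the biomass reaction


def fba(S, lb, ub, bio):
    """Maximise v[bio] subject to S v = 0 and lb <= v <= ub."""
    c = np.zeros(S.shape[1])
    c[bio] = -1.0  # linprog minimises, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    if res.status != 0:
        return 0.0, None
    return -res.fun, res.x


growth, v = fba(S, LB, UB, BIO)
print("maximum biomass flux:", growth)
```

In this toy case the biomass flux is limited only by the uptake bound. The flux vector `v` is one of possibly many optimal solutions, which is the under-determinacy the slide refers to.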
Geometrical interpretation of FBA
Orth JD et al (2010) Nature Biotechnology 28:245–248
[Figure: three panels in flux space (axes v1, v2, v3). The unconstrained solution space is reduced by the constraints (1) Sv = 0 and (2) a_i < v_i < b_i to the allowable solution space; optimisation (maximise Z) then selects the optimal solution.]
Flux Balance Analysis
▶ FBA has been shown to accurately predict phenotypes following various genetic perturbations¹, ²
▶ To delete reaction k, set v_k = 0 and repeat the simulation; for a set D of deleted reactions:
\[
\begin{aligned}
\max\ & v_{bio} \\
\text{s.t.}\ & \sum_j s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_d = 0 && \forall d \in D \subset J
\end{aligned}
\]
▶ FBA can also reliably predict synthetic lethal genes in metabolic networks of organisms such as yeast³
¹Edwards JS & Palsson BO (2000) BMC Bioinformatics 1:1
²Famili I et al (2003) PNAS 100:13134–13139
³Harrison R et al (2007) PNAS 104:2307–2312
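Deleting a reaction amounts to fixing both of its bounds to zero and re-solving the same LP. A minimal sketch on a hypothetical five-reaction network follows; the network, the helper name `max_growth` and the 1% lethality cutoff are assumptions made for illustration, not values taken from the slides.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; rows are metabolites A, B, C, columns are reactions:
# R0: -> A (uptake), R1: A -> B, R2: A -> C, R3: C -> B, R4: B -> (biomass)
S = np.array([
    [1, -1, -1,  0,  0],
    [0,  1,  0,  1, -1],
    [0,  0,  1, -1,  0],
], dtype=float)
LB = np.zeros(5)
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
BIO = 4
CUTOFF = 0.01  # assumed threshold: "lethal" means growth below 1% of wild type


def max_growth(deleted=()):
    """Maximum biomass flux with every reaction in `deleted` fixed to zero flux."""
    lb, ub = LB.copy(), UB.copy()
    for d in deleted:
        lb[d] = ub[d] = 0.0          # v_d = 0 for all d in D
    c = np.zeros(S.shape[1])
    c[BIO] = -1.0                    # linprog minimises, so negate
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    return -res.fun if res.status == 0 else 0.0


wild_type = max_growth()
# One LP per single deletion
single_lethals = [k for k in range(S.shape[1])
                  if max_growth([k]) < CUTOFF * wild_type]
print("single lethal reactions:", single_lethals)
```

In this toy network, R1 and R2 are individually dispensable because each can be bypassed, but `max_growth([1, 2])` drops to zero, which makes them a synthetic lethal pair.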
Identifying Synthetic Lethals
Brute Force/Exhaustive Enumeration
▶ Single lethals are easier to identify
  ▶ Solve one optimisation problem for each gene deletion (genotype)
▶ Synthetic lethals are more difficult to identify
  ▶ Combinatorial explosion, e.g. \(\binom{1000}{3}\) ≈ 170 million simulations!
  ▶ Quickly becomes infeasible for larger organisms …
  ▶ However, simulations are independent and can be easily parallelised on a computer cluster¹
¹Deutscher D et al (2006) Nature Genetics 38:993–8
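The size of the exhaustive search is a binomial coefficient, so it is easy to check how quickly it grows. A short sketch follows; the network size of 1000 is the slide's own example, and the function name is only illustrative.

```python
from math import comb


def exhaustive_simulations(n_reactions, order):
    """Number of deletion sets of the given order among n_reactions candidates."""
    return comb(n_reactions, order)


for order, name in [(1, "single"), (2, "double"), (3, "triple"), (4, "quadruple")]:
    print(f"{name:>9}: {exhaustive_simulations(1000, order):,}")
```

For 1000 candidates, the triple count is 166,167,000, which is the "≈ 170 million" quoted on the slide. Each extra order multiplies the count by a few hundred.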
Identifying Synthetic Lethals
Bi-Level Mixed Integer Linear Programming Problem
▶ SL-Finder¹ poses the synthetic lethal identification problem elegantly as a bi-level MILP
▶ Synthetic lethal double and triple reaction deletions have been reported for E. coli
▶ However, the MILP problems become increasingly difficult to solve
▶ Time taken, on a workstation, was ≈ 6.75 days, for the E. coli iAF1260 model
▶ MCSEnumerator is another MILP-based method, which runs even faster²
¹Suthers PF et al (2009) Molecular Systems Biology 5:301
²von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378
Is there a way to surmount the complexity of exhaustive enumeration and bi-level MILP?
An Alternate Approach: Fast-SL
Pratapa A et al (2015) Bioinformatics 31:3299–3305
▶ Heavily prunes search space for synthetic lethals, and
▶ Exhaustively iterates through the remaining (far fewer) combinations
▶ We successively compute:
  ▶ Jsl, the set of single lethal reactions,
  ▶ Jdl ⊂ J × J, the set of synthetic lethal reaction pairs, and
  ▶ Jtl ⊂ J × J × J, the set of synthetic lethal reaction triplets
▶ Central idea: We use FBA to compute a flux distribution, corresponding to maximum growth rate, while minimising the sum of absolute values of the fluxes, i.e. the ℓ1-norm of the flux vector. This is the 'minimal norm' solution of the FBA LP problem
Fast-SL: Eliminating Non-Lethal Sets
\[
\begin{aligned}
\max\ & v_{bio} && (1) \\
\text{s.t.}\ & S v = 0 && (2) \\
& LB_j \le v_j \le UB_j \quad \forall j \in J && (3)
\end{aligned}
\]
▶ Identify a flux distribution which obeys the constraints of FBA (2), (3) and also sustains maximum growth (1) (sparse!)
▶ The set of reactions that carry a non-zero flux in this solution is Jnz
▶ How does this help?
Fast-SL Massively Prunes Search Space for Synthetic Lethals
▶ If a reaction j carries zero flux in the minimal norm solution (j ∉ Jnz), which is constrained to support growth, it cannot be lethal: deleting it leaves that growth-supporting solution feasible
  ⇒ There is no single lethal reaction outside Jnz
  ⇒ The set of all single lethals (Jsl) is contained entirely in Jnz
▶ If a pair of reactions i, j both carry zero flux in the minimal norm solution (i, j ∉ Jnz), they cannot be a synthetic lethal pair
  ⇒ There are no synthetic lethal pairs in which both reactions lie outside Jnz
▶ All synthetic lethal pairs lie in the narrow 'red region' of J × J (drawn to scale for E. coli)
[Figure: the J × J grid of reaction pairs, with each axis partitioned into Jsl, Jnz and J − Jnz; the red region marks the pairs with at least one reaction in Jnz.]
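The pruning argument needs only two things from the model: a growth test, and the set of reactions used by one growth-supporting solution. The following is a minimal sketch of the pair search under that abstraction. The toy "routes" model and the function names are hypothetical, and `growth_support` merely stands in for the non-zero fluxes of a minimal norm FBA solution.

```python
from itertools import combinations

# Hypothetical toy model: growth is possible if at least one route is fully intact.
REACTIONS = range(6)
ROUTES = [{0, 1, 5}, {0, 2, 3, 5}]   # reaction 4 is never needed for growth


def growth_support(deleted):
    """Reactions used by the smallest surviving route, or None if growth is abolished.
    Stand-in for Jnz of a minimal norm FBA solution under the given deletions."""
    alive = [route for route in ROUTES if not (route & set(deleted))]
    return min(alive, key=len) if alive else None


def exhaustive_pairs():
    """Test every single deletion, then every pair of non-lethal reactions."""
    singles = {r for r in REACTIONS if growth_support({r}) is None}
    rest = [r for r in REACTIONS if r not in singles]
    pairs = {frozenset(p) for p in combinations(rest, 2)
             if growth_support(p) is None}
    return singles, pairs


def pruned_pairs():
    """Only test deletions that hit the current growth-supporting solution."""
    jnz = growth_support(set())
    singles = {r for r in jnz if growth_support({r}) is None}
    pairs = set()
    for i in jnz - singles:
        jnz_i = growth_support({i})          # solution that survives deleting i
        for j in jnz_i - singles - {i}:      # the partner must break that solution too
            if growth_support({i, j}) is None:
                pairs.add(frozenset((i, j)))
    return singles, pairs


print(pruned_pairs() == exhaustive_pairs())
```

The pruned search never tests reaction 4 or the pair (2, 3), because neither can disturb the solution that is currently supporting growth, yet it returns the same lethal sets as the exhaustive search on this toy model.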
Fast-SL Achieves Massive Speedups
▶ Even in the narrow red region, further gains are made by re-applying the idea
▶ The gains are even more substantial for higher order lethals:

Order       Exhaustive LPs   LPs solved after eliminating   Reduction in
                             non-lethal sets                search space
Single      2.05 × 10^3      393                            ≈ 5 fold
Double      1.57 × 10^6      7,779                          ≈ 200 fold
Triple      9.27 × 10^8      432,487                        ≈ 2100 fold
Quadruple   4.10 × 10^11     4.53 × 10^7                    ≈ 9050 fold

[Figure: the J × J grid of reaction pairs, partitioned into Jsl, Jnz and J − Jnz.]
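The reduction column is simply the ratio of the two LP counts. A small check using the numbers reported on this slide follows; only the variable names are made up.

```python
# LP counts as reported on the slide
exhaustive = {"single": 2.05e3, "double": 1.57e6, "triple": 9.27e8, "quadruple": 4.10e11}
pruned = {"single": 393, "double": 7779, "triple": 432487, "quadruple": 4.53e7}

# Fold reduction = exhaustive LPs / LPs actually solved after pruning
fold = {order: exhaustive[order] / pruned[order] for order in exhaustive}

for order, value in fold.items():
    print(f"{order:>9}: {value:,.0f}-fold")
```

The ratios come out close to the rounded values in the table, and they grow with the order because each additional deletion is again restricted to reactions that carry flux.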
Fast-SL: Minimum Norm Solution
▶ The smaller the set of non-zero reactions, Jnz, the fewer LPs need to be solved to identify lethal sets
▶ The minimised ℓ0-norm solution of the FBA LP problem is the sparsest solution
  ▶ However, it requires solving an MILP problem
▶ We use the ℓ1-norm solution instead
\[
\begin{aligned}
\min\ & \sum_j |v_j| \\
\text{s.t.}\ & \sum_j s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_{bio} = v_{bio,max}
\end{aligned}
\]
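The absolute values make the objective non-linear as written, but the problem can still be solved as an LP by adding one auxiliary variable t_j ≥ |v_j| per reaction and minimising Σ t_j. The following is a minimal sketch of that standard reformulation; the toy network, the tolerance of 1e-6 and the function names are assumptions made for illustration, and the authors' MATLAB code may set this up differently.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; rows are metabolites A, B, C, columns are reactions:
# R0: -> A (uptake), R1: A -> B, R2: A -> C, R3: C -> B, R4: B -> (biomass)
S = np.array([
    [1, -1, -1,  0,  0],
    [0,  1,  0,  1, -1],
    [0,  0,  1, -1,  0],
], dtype=float)
LB = np.zeros(5)
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
BIO = 4


def max_biomass():
    """Step 1: ordinary FBA to obtain v_bio,max."""
    c = np.zeros(S.shape[1])
    c[BIO] = -1.0
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(LB, UB)), method="highs")
    return -res.fun


def min_l1_flux(v_bio_max):
    """Step 2: minimise sum(t) with t_j >= |v_j|, S v = 0 and v_bio fixed."""
    m, n = S.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])         # variables are [v, t]
    A_eq = np.hstack([S, np.zeros((m, n))])
    eye = np.eye(n)
    A_ub = np.vstack([np.hstack([eye, -eye]),              #  v - t <= 0
                      np.hstack([-eye, -eye])])            # -v - t <= 0
    bounds = list(zip(LB, UB)) + [(0, None)] * n
    bounds[BIO] = (v_bio_max, v_bio_max)                   # v_bio = v_bio,max
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n),
                  A_eq=A_eq, b_eq=np.zeros(m), bounds=bounds, method="highs")
    return res.x[:n]


v_min = min_l1_flux(max_biomass())
J_nz = np.flatnonzero(np.abs(v_min) > 1e-6).tolist()       # assumed zero tolerance
print("Jnz =", J_nz)
```

Here the shorter route through R1 is preferred over the detour through R2 and R3, so only three of the five reactions end up in Jnz.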
Fast-SL Achieves 4x Speedup over MCSEnumerator
▶ Fast-SL can also be parallelised, leading to further speed-ups
▶ Fast-SL achieves ≈ 4x speed-up over the MCSEnumerator method¹ for the E. coli iAF1260 model for higher order reaction deletions
▶ Results obtained using Fast-SL match precisely with exhaustive enumeration of gene deletions
▶ Similar approach can be used to identify lethal gene sets by incorporating gene–reaction rules

Order of SLs   No. of SLs   CPU time taken for          CPU time taken for Fast-SL     Speed-up
                            MCSEnumerator (12 cores)    Algorithm (6 cores)
Single         278          11 s                        2.8 s                          ≈ 8x
Double         96           39.1 s                      17.2 s                         ≈ 4x
Triple         247          16.8 min                    8.5 min                        ≈ 4x
Quadruple      402          18.5 h                      9.3 h                          ≈ 4x

¹von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378
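The Speed-up column is larger than the plain ratio of the two time columns, which may be because the two methods were run on different numbers of cores. The slide does not say how the column was computed, so the following sketch rests on an assumption: it compares the raw ratio with a core-normalised ratio (time × cores).

```python
# (MCSEnumerator time, Fast-SL time) in seconds, converted from the slide's table
TIMES = {
    "single":    (11.0, 2.8),
    "double":    (39.1, 17.2),
    "triple":    (16.8 * 60, 8.5 * 60),
    "quadruple": (18.5 * 3600, 9.3 * 3600),
}
MCS_CORES, FASTSL_CORES = 12, 6   # core counts stated in the column headers

# Plain ratio of reported times
raw = {k: mcs / fast for k, (mcs, fast) in TIMES.items()}
# Assumed normalisation: compare core-seconds rather than seconds
per_core = {k: (mcs * MCS_CORES) / (fast * FASTSL_CORES)
            for k, (mcs, fast) in TIMES.items()}

for k in TIMES:
    print(f"{k:>9}: raw {raw[k]:.1f}x, core-normalised {per_core[k]:.1f}x")
```

Under this assumption the normalised ratios land near the reported ≈ 8x for singles and ≈ 4x for the higher orders, whereas the raw ratios are only about 2x to 4x.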
Synthetic Lethal Gene Deletions
▶ Most previous algorithms only computed synthetic lethal reaction deletions
▶ Not easily modified for computing gene deletions
▶ We extended our algorithm to gene deletions by using the gene–reaction mapping
▶ Fast-SL formulation identified 75 new gene triplets in E. coli that were not identified previously
▶ We have also identified synthetic lethal gene and reaction sets, up to quadruplets, for other pathogenic organisms such as Salmonella Typhimurium, Mycobacterium tuberculosis, Staphylococcus aureus and Neisseria meningitidis
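A gene deletion is translated into reaction deletions through the gene–reaction rules: a reaction is lost only when every alternative enzyme for it has lost at least one of its genes. A minimal sketch of that mapping follows. The rule table, gene names and function name are hypothetical, and the rules are assumed to be already in "OR of ANDs" form.

```python
# Hypothetical gene-reaction rules: each reaction lists its alternative enzymes,
# and each enzyme is the set of genes it needs (isozymes = OR, subunits = AND).
GPR = {
    "R1": [{"g1"}],            # single gene
    "R2": [{"g2"}, {"g3"}],    # isozymes: g2 OR g3
    "R3": [{"g4", "g5"}],      # enzyme complex: g4 AND g5
}


def reactions_disabled(deleted_genes):
    """Reactions for which every alternative enzyme has lost at least one gene."""
    deleted = set(deleted_genes)
    return {rxn for rxn, enzymes in GPR.items()
            if all(enzyme & deleted for enzyme in enzymes)}


print(reactions_disabled({"g2"}))          # isozyme g3 still catalyses R2
print(reactions_disabled({"g2", "g3"}))    # both isozymes gone
print(reactions_disabled({"g4"}))          # one missing subunit breaks the complex
```

The resulting reaction set can then be handed to the same deletion simulation used for reactions. Isozymes are one reason a gene-level lethal set can be larger than the corresponding reaction-level set.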
Missing Biomass Precursors in E. coli
▶ Gene/reaction lethality is a result of the organism's inability to produce one or more of the biomass precursors
▶ Most triple and quadruple gene deletions affect mechanisms involved in ATP production
[Figure: bar chart with a vertical axis from 0% to 50%; the category labels are not recoverable.]
Reiterates the critical role played by co-factors and ATP in cellular metabolism!
Synthetic Lethals Illustrate Complex Metabolic Dependencies
▶ atpB, cydA, gap
  ▶ ATP synthase, cytochrome D ubiquinol oxidase and glyceraldehyde 3-phosphate dehydrogenase
  ▶ Perhaps bring about their effect by disabling both substrate-level and oxidative phosphorylation
▶ eno, pps, sdhA/B/C
  ▶ Enolase, PEP synthase and succinate dehydrogenase subunits
  ▶ Seem to bring about their effect by affecting production of phosphoenolpyruvate and consequently disabling OXPHOS
Combinatorial Drug Targets
▶ Only a few combinatorial deletions abolish growth in silico
▶ Re-emphasises the robust nature of the metabolic networks in both M. tuberculosis and S. Typhimurium
▶ 28 triplets and 20 doublets in M. tuberculosis have no homologues in human
▶ 21 triplets and 39 doublets in S. Typhimurium have no homologues in human
▶ Some of these may be interesting drug targets
Limitations
▶ Metabolic models considered here do not account for regulation or other functions of proteins
▶ The method can identify synthetic lethals only in metabolism
▶ Any inadequacies/gaps in the metabolic model will affect the results, e.g. some isozymes may not have been characterised yet
▶ Lethality results can be useful to refine the metabolic model
Summary
▶ Synthetic lethals are difficult to identify computationally, owing to the combinatorial explosion of possibilities
▶ Previous approaches have used FBA to exhaustively search the entire space, or pose the problem as a bi-level MILP
▶ Our algorithm, Fast-SL, circumvents the complexities of previous approaches, through a massive reduction of search space, exploiting the minimal norm solution of FBA
▶ For E. coli, the reduction in search space is ≈ 4000-fold for synthetic lethal triplets!
▶ Ours is also the first method that systematically evaluates gene deletions
▶ Our results agree exactly with exhaustive enumeration
▶ Fast-SL finds application in identifying functional associations and combinatorial drug targets
Acknowledgments
▶ Aditya Pratapa
▶ Dr. Shankar Balachandran
▶ High Performance Computing Facility, IIT Madras
▶ Funding: Department of Biotechnology, Government of India; IIT Madras; nVidia
Thank you! MATLAB implementation of Fast-SL is available for download from: https://github.com/RamanLab/FastSL