Fast-SL: An efficient algorithm to identify synthetic lethals in metabolic networks

Karthik Raman

Department of Biotechnology
Indian Institute of Technology Madras
https://home.iitm.ac.in/kraman/lab/

2015 NNMCB National Meeting, December 27, 2015

Introduction

Fast-SL

Results

Conclusions

Genome-Scale Metabolic Networks (GSMNs)



GSMNs account for the functions of all the known metabolic genes in an organism



Constructed primarily from the genome sequence with annotations from enzyme and pathway databases



100+ GSMNs are presently available


What can GSMNs tell us?
McCloskey D et al (2013) Molecular Systems Biology 9:661

[Figure: uses of the E. coli reconstruction across 248 total studies: analysis of biological network properties (72 studies, 29.0%), metabolic engineering (68 studies, 27.4%), prediction of cellular phenotypes (64 studies, 25.8%), studies of evolutionary processes (19 studies, 7.7%), model-driven discovery (18 studies, 7.3%) and interspecies interaction (7 studies, 2.8%)]

What can GSMNs tell us?



Predict potential drug targets, by identifying essential and synthetic lethal genes

[Figure: title page of Hartman HB et al, 'Identification of potential drug targets in Salmonella enterica sv. Typhimurium using metabolic modelling and experimental validation' (Editor's Choice)]

What are Synthetic Lethals?

Synthetic lethal gene (or reaction) sets are sets of genes (or reactions) where only the simultaneous removal of all members of the set abolishes growth:

[Figure: genes abc and pqr shown in four genotypes: wild-type, Δabc, Δpqr and ΔabcΔpqr]

The concept of synthetic lethality can be extended to higher orders, e.g. triplets

Why Identify Synthetic Lethals?



Synthetic lethals find applications in

▶ Understanding gene function and functional associations¹

▶ Combinatorial drug targets against pathogens²

▶ Cancer therapy³

¹Ooi SLL et al (2006) Trends Genet 22:56–63
²Hsu KC et al (2013) PLoS Comput Biol 9:e1003127
³Kaelin WG (2005) Nat Rev Cancer 5:689–698

How to Identify Synthetic Lethals?



Yeast synthetic lethals have been identified experimentally using yeast synthetic genetic arrays¹, ²



Previous in silico approaches have built on the framework of Flux Balance Analysis — restricted to metabolic genes

¹Tong AHY et al (2001) Science 294:2364–2368
²Tong AHY et al (2004) Science 303:808–813

What is Flux Balance Analysis?

An effective constraint-based method to study genome-scale metabolic networks¹

The mass balance constraints in a system of reactions can be represented by a system of linear equations involving reaction fluxes at steady state

The system is under-determined, so we compute the flux distribution that maximises biomass: mathematically, this is a linear programming problem

$$
\begin{aligned}
\max \quad & v_{bio} && \text{(the biomass flux)} \\
\text{s.t.} \quad & \sum_j s_{ij} v_j = 0 && \forall i \in M \ \text{(set of metabolites)} \\
& LB_j \le v_j \le UB_j && \forall j \in J \ \text{(set of reactions)}
\end{aligned}
$$

¹Varma A & Palsson BO (1994) Applied and Environmental Microbiology 60:3724–3731
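The linear programme above can be tried out on a very small example. What follows is a minimal sketch in Python using `scipy.optimize.linprog`; the five-reaction network, its bounds and the names `fba`, `S`, `LB` and `UB` are illustrative assumptions and are not taken from the talk or from the Fast-SL code.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (an assumption for illustration only).
# Columns are reactions, rows are the metabolites A, B and C.
REACTIONS = ["EX_A", "R1", "R2a", "R2b", "BIO"]
S = np.array([
    [1, -1, -1,  0,  0],   # A: made by EX_A, consumed by R1 and R2a
    [0,  1,  0,  1, -1],   # B: made by R1 and R2b, consumed by BIO
    [0,  0,  1, -1,  0],   # C: made by R2a, consumed by R2b
], dtype=float)
LB = np.zeros(5)                                      # all reactions irreversible
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])  # uptake capped at 10
BIO = REACTIONS.index("BIO")


def fba(S, lb, ub, bio):
    """Maximise v_bio subject to S v = 0 and lb <= v <= ub."""
    c = np.zeros(S.shape[1])
    c[bio] = -1.0  # linprog minimises, so the objective is negated
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    if res.status != 0:
        return 0.0, None
    return -res.fun, res.x


growth, fluxes = fba(S, LB, UB, BIO)
print(f"maximum biomass flux: {growth:.2f}")
print(dict(zip(REACTIONS, np.round(fluxes, 2))))
```

In this toy network the biomass flux is limited only by the uptake bound, so the optimum equals the upper bound on `EX_A`.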


Geometrical interpretation of FBA
Orth JD et al (2010) Nature Biotechnology 28:245–248

[Figure: three panels in flux space (axes v1, v2, v3). The unconstrained solution space is reduced to the allowable solution space by the constraints 1) Sv = 0 and 2) a_i < v_i < b_i; optimisation (maximise Z) then picks out the optimal solution]

Flux Balance Analysis

FBA has been shown to predict phenotypes accurately following various genetic perturbations¹,²

To delete reaction k, set v_k = 0 and repeat the simulation; for a set of deleted reactions D ⊂ J:

$$
\begin{aligned}
\max \quad & v_{bio} \\
\text{s.t.} \quad & \sum_j s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_d = 0 && \forall d \in D \subset J
\end{aligned}
$$

FBA can also reliably predict synthetic lethal genes in metabolic networks of organisms such as yeast³

¹Edwards JS & Palsson BO (2000) BMC Bioinformatics 1:1
²Famili I et al (2003) PNAS 100:13134–13139
³Harrison R et al (2007) PNAS 104:2307–2312
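A reaction deletion amounts to clamping the corresponding flux bounds to zero and solving the same LP again. The sketch below illustrates this on a hypothetical five-reaction network; the network, the reaction names and the helper `max_growth` are assumptions made for illustration, not the author's implementation.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: two routes from A to B (R1, or R2a followed by R2b).
REACTIONS = ["EX_A", "R1", "R2a", "R2b", "BIO"]
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0,  1, -1],   # B
    [0,  0,  1, -1,  0],   # C
], dtype=float)
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
BIO = REACTIONS.index("BIO")


def max_growth(deleted=()):
    """Maximum biomass flux after forcing v_d = 0 for every d in `deleted`."""
    ub = UB.copy()
    for name in deleted:
        ub[REACTIONS.index(name)] = 0.0   # lower bounds are already 0
    c = np.zeros(len(REACTIONS))
    c[BIO] = -1.0                          # linprog minimises
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=[(0.0, u) for u in ub], method="highs")
    return -res.fun if res.status == 0 else 0.0


for knockout in [(), ("R1",), ("R2a",), ("EX_A",), ("R1", "R2a")]:
    print(knockout, round(max_growth(knockout), 2))
```

In this toy case neither `R1` nor `R2a` is lethal alone because the other route compensates, whereas deleting both abolishes growth, which is exactly the pattern a synthetic lethal pair shows.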


Identifying Synthetic Lethals: Brute Force/Exhaustive Enumeration

Single lethals are easier to identify
▶ Solve one optimisation problem for each gene deletion (genotype)

Synthetic lethals are more difficult to identify
▶ Combinatorial explosion
▶ e.g. $\binom{1000}{3} \approx 170$ million simulations!
▶ Quickly becomes infeasible for larger organisms …
▶ However, simulations are independent and can be easily parallelised on a computer cluster¹

¹Deutscher D et al (2006) Nature Genetics 38:993–8
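The size of the exhaustive search is just a binomial coefficient, so it is easy to check. A short sketch, using the round figure of 1000 reactions from the example above (the variable names are illustrative):

```python
from math import comb

n_reactions = 1000  # round figure used in the example above

# Number of LPs an exhaustive search must solve for each deletion order
counts = {order: comb(n_reactions, order) for order in (1, 2, 3, 4)}
for order, count in counts.items():
    print(f"order {order}: {count:,} LPs")
```

The exact count for triplets is 166,167,000, which the slide rounds to about 170 million.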


Identifying Synthetic Lethals: Bi-Level Mixed Integer Linear Programming Problem

SL-Finder¹ poses the synthetic lethal identification problem elegantly as a bi-level MILP

Synthetic lethal double and triple reaction deletions have been reported for E. coli

However, the MILP problems become progressively more difficult to solve

Time taken, on a workstation, was ≈ 6.75 days for the E. coli iAF1260 model

MCSEnumerator is another MILP-based method, which runs even faster²

¹Suthers PF et al (2009) Molecular Systems Biology 5:301
²von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378

Is there a way to surmount the complexity of exhaustive enumeration and bi-level MILP?


An Alternate Approach: Fast-SL
Pratapa A et al (2015) Bioinformatics 31:3299–3305

Heavily prunes the search space for synthetic lethals, and

Exhaustively iterates through the remaining (far fewer) combinations

We successively compute:
▶ Jsl, the set of single lethal reactions,
▶ Jdl ⊂ J × J, the set of synthetic lethal reaction pairs, and
▶ Jtl ⊂ J × J × J, the set of synthetic lethal reaction triplets

Central idea: We use FBA to compute a flux distribution, corresponding to maximum growth rate, while minimising the sum of absolute values of the fluxes, i.e. the ℓ1-norm of the flux vector. This is the 'minimal norm' solution of the FBA LP problem

Fast-SL: Eliminating Non-Lethal Sets

$$
\begin{aligned}
\max \quad & v_{bio} & & \qquad (1) \\
\text{s.t.} \quad & S \cdot v = 0 & & \qquad (2) \\
& LB_j \le v_j \le UB_j \quad \forall j \in J & & \qquad (3)
\end{aligned}
$$

Identify a flux distribution which obeys the constraints of FBA (2), (3) and also sustains maximum growth (1) (sparse!)

The set of reactions that carry a non-zero flux in this solution is Jnz

How does this help?

Fast-SL Massively Prunes Search Space for Synthetic Lethals

If a reaction j carries zero flux in the minimal norm solution (j ∉ Jnz), which is constrained to support growth, it cannot be lethal: the same flux distribution remains feasible after j is deleted
▶ ⇒ There is no single lethal reaction outside Jnz
▶ ⇒ The set of all single lethals (Jsl) is contained entirely in Jnz

If a pair of reactions i, j both carry zero flux in the minimal norm solution (i, j ∉ Jnz), they cannot be a synthetic lethal pair
▶ ⇒ There are no synthetic lethal pairs in which neither reaction is in Jnz

All synthetic lethal pairs lie in the narrow 'red region' of J × J (drawn to scale for E. coli)

[Figure: the J × J grid of reaction pairs, with each axis partitioned into Jsl, Jnz and J − Jnz; the red region marks the pairs that still need to be tested]
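The pruning argument can be sketched on a toy network as follows. This is an illustration under stated assumptions, not the published implementation: the network is hypothetical, every reaction is taken to be irreversible (so the ℓ1-norm is simply the sum of the fluxes), and the 1% growth cutoff used to call a deletion lethal is an assumed value. The exhaustive enumeration at the end serves only as a reference for comparison.

```python
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; reactions 0..4 are EX_A, R1, R2a, R2b, BIO.
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0,  1, -1],   # B
    [0,  0,  1, -1,  0],   # C
], dtype=float)
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
N, BIO = S.shape[1], 4
CUTOFF = 0.01   # assumed: lethal means growth below 1% of wild type


def solve(c, bounds):
    return linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                   bounds=bounds, method="highs")


def max_growth(deleted=()):
    bounds = [(0.0, 0.0 if j in deleted else UB[j]) for j in range(N)]
    c = np.zeros(N)
    c[BIO] = -1.0
    return -solve(c, bounds).fun


def nonzero_set():
    """Reactions carrying flux in a minimal-norm solution at maximum growth."""
    growth = max_growth()
    bounds = [(0.0, UB[j]) for j in range(N)]
    bounds[BIO] = (growth * (1 - 1e-9), UB[BIO])   # hold growth at its maximum
    v = solve(np.ones(N), bounds).x                # minimise the sum of fluxes
    return {j for j in range(N) if v[j] > 1e-6}


wild_type = max_growth()


def lethal(deleted):
    return max_growth(deleted) < CUTOFF * wild_type


jnz = nonzero_set()
jsl = {j for j in jnz if lethal((j,))}             # singles: test Jnz only
candidates = [(i, j) for i, j in combinations(range(N), 2)
              if not {i, j} & jsl and (i in jnz or j in jnz)]
jdl = {pair for pair in candidates if lethal(pair)}

# Reference: exhaustive enumeration over all reactions and all pairs
all_singles = {j for j in range(N) if lethal((j,))}
all_pair_tests = [p for p in combinations(range(N), 2)
                  if not set(p) & all_singles]
all_pairs = {p for p in all_pair_tests if lethal(p)}

print("Jnz:", sorted(jnz), "Jsl:", sorted(jsl), "Jdl:", sorted(jdl))
print("pair LPs:", len(candidates), "pruned vs", len(all_pair_tests), "exhaustive")
```

Only pairs with at least one member in Jnz are tested, yet the lethal sets found are the same as those from exhaustive enumeration on this example.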


Fast-SL Achieves Massive Speedups

Even in the narrow red region, further gains are made by re-applying the idea

The gains are even more substantial for higher order lethals:

Order       Exhaustive LPs   LPs solved after eliminating non-lethal sets   Reduction in search space
Single      2.05 × 10^3      393                                            ≈ 5-fold
Double      1.57 × 10^6      7,779                                          ≈ 200-fold
Triple      9.27 × 10^8      432,487                                        ≈ 2100-fold
Quadruple   4.10 × 10^11     4.53 × 10^7                                    ≈ 9050-fold

[Figure: the J × J grid partitioned into Jsl, Jnz and J − Jnz]
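The reduction column follows directly from dividing the two LP counts. A quick check in Python, with the figures copied from the table above (the dictionary names are illustrative):

```python
# LP counts copied from the table above
exhaustive = {"Single": 2.05e3, "Double": 1.57e6,
              "Triple": 9.27e8, "Quadruple": 4.10e11}
after_pruning = {"Single": 393, "Double": 7779,
                 "Triple": 432487, "Quadruple": 4.53e7}

# Fold reduction = exhaustive LPs / LPs left after eliminating non-lethal sets
fold = {order: exhaustive[order] / after_pruning[order] for order in exhaustive}
for order, value in fold.items():
    print(f"{order:<10} {value:8.0f}-fold fewer LPs")
```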


Fast-SL: Minimum Norm Solution

The smaller the set of non-zero reactions, Jnz, the fewer LPs need to be solved to identify lethal sets

The minimised ℓ0-norm solution of the FBA LP problem is the sparsest solution
▶ However, it requires solving an MILP problem

We use the ℓ1-norm solution instead

$$
\begin{aligned}
\min \quad & \sum_j |v_j| \\
\text{s.t.} \quad & \sum_j s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_{bio} = v_{bio,max}
\end{aligned}
$$
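The absolute values make the objective non-linear as written, but it can be recast as an LP. One standard way, shown in the sketch below, is to add an auxiliary variable t_j ≥ |v_j| for each reaction and minimise the sum of the t_j. The toy network (with three reversible reactions so that a flux loop is possible), the variable names and the small numerical slack on the growth constraint are all assumptions for illustration; the talk does not specify which reformulation Fast-SL uses.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; R1, R2a and R2b are reversible here, so FBA
# alone could return a solution with flux cycling around R1 -> R2b -> R2a.
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0,  1, -1],   # B
    [0,  0,  1, -1,  0],   # C
], dtype=float)
LB = np.array([0.0, -1000.0, -1000.0, -1000.0, 0.0])
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
M, N = S.shape
BIO = 4

# Step 1: ordinary FBA gives the maximum biomass flux.
c = np.zeros(N)
c[BIO] = -1.0
fba = linprog(c, A_eq=S, b_eq=np.zeros(M),
              bounds=list(zip(LB, UB)), method="highs")
v_bio_max = -fba.fun

# Step 2: minimise sum_j t_j with t_j >= |v_j|, over variables x = [v, t].
cost = np.concatenate([np.zeros(N), np.ones(N)])
A_eq = np.hstack([S, np.zeros((M, N))])       # S v = 0, t unconstrained here
I = np.eye(N)
A_ub = np.vstack([np.hstack([I, -I]),         #  v_j - t_j <= 0
                  np.hstack([-I, -I])])       # -v_j - t_j <= 0
bounds = list(zip(LB, UB)) + [(0.0, None)] * N
# Hold v_bio at its maximum, with a tiny slack to avoid numerical infeasibility
bounds[BIO] = (v_bio_max * (1 - 1e-9), UB[BIO])
res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(2 * N),
              A_eq=A_eq, b_eq=np.zeros(M), bounds=bounds, method="highs")

v_min_norm = res.x[:N]
jnz = [j for j in range(N) if abs(v_min_norm[j]) > 1e-6]
print("minimal-norm fluxes:", np.round(v_min_norm, 3))
print("Jnz:", jnz)
```

On this example the minimal-norm solution uses only the short route through `R1`, so the two reactions of the longer route fall outside Jnz.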


Fast-SL Achieves 4x Speedup over MCSEnumerator

Fast-SL can also be parallelised, leading to further speed-ups

Fast-SL achieves ≈ 4x speed-up over the MCSEnumerator method¹ for the E. coli iAF1260 model for higher order reaction deletions

Results obtained using Fast-SL match those from exhaustive enumeration of gene deletions exactly

A similar approach can be used to identify lethal gene sets by incorporating gene–reaction rules

Order of SLs   No. of SLs   CPU time, MCSEnumerator (12 cores)   CPU time, Fast-SL (6 cores)   Speed-up
Single         278          11 s                                 2.8 s                         ≈ 8x
Double         96           39.1 s                               17.2 s                        ≈ 4x
Triple         247          16.8 min                             8.5 min                       ≈ 4x
Quadruple      402          18.5 h                               9.3 h                         ≈ 4x

¹von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378
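The slide does not say how the Speed-up column was derived, and the raw time ratios are closer to 2x than 4x. One reading that roughly reproduces the column, offered here purely as an assumption, is that each time is weighted by the number of cores used (12 for MCSEnumerator, 6 for Fast-SL). The arithmetic under that assumption:

```python
# CPU times copied from the table above, converted to seconds
mcs_enumerator = {"Single": 11.0, "Double": 39.1,
                  "Triple": 16.8 * 60, "Quadruple": 18.5 * 3600}
fast_sl = {"Single": 2.8, "Double": 17.2,
           "Triple": 8.5 * 60, "Quadruple": 9.3 * 3600}
MCS_CORES, FAST_SL_CORES = 12, 6

raw, speedup = {}, {}
for order in mcs_enumerator:
    raw[order] = mcs_enumerator[order] / fast_sl[order]
    # Assumption: compare core-seconds rather than wall-clock seconds
    speedup[order] = raw[order] * MCS_CORES / FAST_SL_CORES
    print(f"{order:<10} raw {raw[order]:.1f}x, core-adjusted {speedup[order]:.1f}x")
```

Under this reading the single, triple and quadruple rows come out close to the stated values, while the double row comes out somewhat above 4x, so the interpretation should be treated with caution.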


Synthetic Lethal Gene Deletions

Most previous algorithms only computed synthetic lethal reaction deletions

They are not easily modified for computing gene deletions

We extended our algorithm to gene deletions by using the gene–reaction mapping

The Fast-SL formulation identified 75 new gene triplets in E. coli that were not identified previously

We have also identified synthetic lethal gene and reaction sets, up to quadruplets, for other pathogenic organisms such as Salmonella Typhimurium, Mycobacterium tuberculosis, Staphylococcus aureus and Neisseria meningitidis
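Moving from reaction deletions to gene deletions requires translating a set of deleted genes into the set of reactions that can no longer operate. A minimal sketch of that translation is given below; the rule encoding, the gene names and the helper functions are hypothetical and chosen only for illustration. Genes joined by "or" act as isozymes and genes joined by "and" as subunits of a complex.

```python
# Hypothetical gene-reaction rules: a string is a gene, a tuple is
# ("and" | "or", operand, ...), and None marks a reaction with no gene.
GPR = {
    "EX_A": None,
    "R1": "gA",
    "R2a": ("or", "gB", "gC"),     # isozymes: either gene suffices
    "R2b": ("and", "gD", "gE"),    # enzyme complex: both genes needed
    "BIO": None,
}


def is_active(rule, deleted_genes):
    """Evaluate a gene-reaction rule with the given genes deleted."""
    if rule is None:
        return True                        # no gene association
    if isinstance(rule, str):
        return rule not in deleted_genes   # single gene
    op, *operands = rule
    results = [is_active(r, deleted_genes) for r in operands]
    return all(results) if op == "and" else any(results)


def reactions_lost(deleted_genes):
    """Reactions whose rule evaluates to False once the genes are deleted."""
    deleted = set(deleted_genes)
    return {rxn for rxn, rule in GPR.items() if not is_active(rule, deleted)}


print(reactions_lost({"gB"}))          # isozyme gC still covers R2a
print(reactions_lost({"gB", "gC"}))    # both isozymes gone
print(reactions_lost({"gD"}))          # complex broken by one subunit
```

The resulting reaction set can then be passed to the same flux-bound clamping used for reaction deletions.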


Missing Biomass Precursors in E. coli

Gene/reaction lethality is a result of the organism's inability to produce one or more of the biomass precursors

Most triple and quadruple gene deletions affect mechanisms involved in ATP production

[Figure: bar chart with a vertical axis from 0% to 50%]

Reiterates the critical role played by co-factors and ATP in cellular metabolism!

Synthetic Lethals Illustrate Complex Metabolic Dependencies

atpB, cydA, gap
▶ ATP synthase, cytochrome D ubiquinol oxidase and glyceraldehyde 3-phosphate dehydrogenase
▶ Perhaps bring about their effect by disabling both substrate-level and oxidative phosphorylation

eno, pps, sdhA/B/C
▶ Enolase, PEP synthase and succinate dehydrogenase subunits
▶ Seem to bring about their effect by affecting production of phosphoenolpyruvate and consequently disabling OXPHOS

Combinatorial Drug Targets



Only a few combinatorial deletions abolish growth in silico

Re-emphasises the robust nature of the metabolic networks in both M. tuberculosis and S. Typhimurium

28 triplets and 20 doublets in M. tuberculosis have no homologues in humans

21 triplets and 39 doublets in S. Typhimurium have no homologues in humans



Some of these may be interesting drug targets


Limitations



Metabolic models considered here do not account for regulation or other functions of proteins



The method can identify synthetic lethals only in metabolism

Any inadequacies/gaps in the metabolic model will affect the results, e.g. some isozymes may not have been characterised yet

Lethality results can be useful to refine the metabolic model

Summary

Synthetic lethals are difficult to identify computationally: combinatorial explosion of possibilities

Previous approaches have used FBA to exhaustively search the entire space, or posed the problem as a bi-level MILP

Our algorithm, Fast-SL, circumvents the complexities of previous approaches, through a massive reduction of search space, exploiting the minimal norm solution of FBA





For E. coli, the reduction in search space is ≈ 4000-fold for synthetic lethal triplets!



Ours is also the first method that systematically evaluates gene deletions



Our results agree exactly with exhaustive enumeration



Fast-SL finds application in identifying functional associations and combinatorial drug targets


Acknowledgments



Aditya Pratapa



Dr. Shankar Balachandran



High Performance Computing Facility IIT Madras



Funding: Department of Biotechnology, Government of India; IIT Madras; nVidia


Thank you!

A MATLAB implementation of Fast-SL is available for download from: https://github.com/RamanLab/FastSL