Great interactions: how binding incorrect partners can teach us about protein recognition and function

Lydie Vamparys¹, Benoist Laurent¹, Alessandra Carbone²,³, and Sophie Sacquin-Mora¹§

¹ Laboratoire de Biochimie Théorique, CNRS UPR 9080, Institut de Biologie Physico-Chimique, 13 rue Pierre et Marie Curie, 75005 Paris, France

² Sorbonne Universités, UPMC Univ-Paris 6, CNRS UMR7238, Laboratoire de Biologie Computationnelle et Quantitative, 15 rue de l’Ecole de Médecine, 75006 Paris, France

³ Institut Universitaire de France, 75005 Paris, France

§ Corresponding author ([email protected])


Supplementary Information

Supplementary Figure S1: Schematic view of the docking algorithm. Starting positions, each defined by the Euler angles θ and φ, are uniformly spaced around the receptor protein (blue point). For each starting position, the orientation of the ligand protein (in red) and its distance from the receptor are optimised during the energy minimisation.
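As an illustration of what "uniformly spaced around the receptor" can mean in practice, the following minimal sketch generates approximately evenly distributed (θ, φ) starting positions with a golden-angle spiral. The spiral construction, the function name `starting_positions` and the `radius` parameter are assumptions made for this example; the placement scheme actually used by the docking program is not specified in the caption.

```python
import math


def starting_positions(n, radius=1.0):
    """Return n (theta, phi, (x, y, z)) tuples spread roughly evenly on a sphere.

    Hypothetical helper: a golden-angle (Fibonacci) spiral, used here only to
    illustrate uniformly spaced starting points around a receptor.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    positions = []
    for k in range(n):
        # cos(theta) takes n evenly spaced values strictly inside (-1, 1)
        cos_theta = 1.0 - (2.0 * k + 1.0) / n
        theta = math.acos(cos_theta)
        phi = (k * golden_angle) % (2.0 * math.pi)
        x = radius * math.sin(theta) * math.cos(phi)
        y = radius * math.sin(theta) * math.sin(phi)
        z = radius * cos_theta
        positions.append((theta, phi, (x, y, z)))
    return positions
```

Each returned position would then serve as the start of one energy minimisation, during which the ligand orientation and its distance from the receptor are free to change.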

[Figure: two histograms. (A) Number of protein pairs (# protein pairs, 0 to 10000) against the number of poses (0 to 220). (B) Number of proteins (# proteins, 0 to 20) against the number of poses (# poses, 1000 to 4000).]

Supplementary Figure S2: (A) Distribution of the number of kept docking poses for each protein pair after filtering on the interaction energy. (B) Final distribution of the number of kept docking poses for each protein (taking into account all its partners) after filtering on the interaction energy.

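[Figure: four panels (A to D), each plotting the normalised error (Norm. Error, 0 to 1) against the PIP cutoff (0 to 1).]

Supplementary Figure S3: Evolution of the normalised error as a function of the PIP cutoff for various definitions of the error function, where Sen., Spe. and Prec. denote sensitivity, specificity and precision.

(A) $\mathrm{Norm.Err.} = \sqrt{(1-\mathrm{Sen.})^2 + (1-\mathrm{Spe.})^2} \,/\, \sqrt{2}$

(B) $\mathrm{Norm.Err.} = \sqrt{(1-\mathrm{Sen.})^2 + (1-\mathrm{Prec.})^2} \,/\, \sqrt{2}$

(C) $\mathrm{Norm.Err.} = \sqrt{(1-\mathrm{Spe.})^2 + (1-\mathrm{Prec.})^2} \,/\, \sqrt{2}$

(D) $\mathrm{Norm.Err.} = \sqrt{(1-\mathrm{Sen.})^2 + (1-\mathrm{Spe.})^2 + (1-\mathrm{Prec.})^2} \,/\, \sqrt{3}$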
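The four error definitions of Supplementary Figure S3 share one pattern: the Euclidean distance to the ideal point (all rates equal to 1), divided by its largest possible value. The sketch below makes that pattern explicit. The function names are hypothetical, and reading the garbled originals as a square root over the sum divided by √2 or √3 is an assumption.

```python
import math


def normalised_error(*rates):
    """Euclidean distance from the ideal point (every rate equal to 1),
    divided by its largest possible value so the result lies in [0, 1]."""
    if not rates:
        raise ValueError("at least one rate is required")
    distance = math.sqrt(sum((1.0 - r) ** 2 for r in rates))
    return distance / math.sqrt(len(rates))


def error_a(sen, spe, prec):
    # Definition (A): sensitivity and specificity
    return normalised_error(sen, spe)


def error_b(sen, spe, prec):
    # Definition (B): sensitivity and precision
    return normalised_error(sen, prec)


def error_c(sen, spe, prec):
    # Definition (C): specificity and precision
    return normalised_error(spe, prec)


def error_d(sen, spe, prec):
    # Definition (D): all three rates
    return normalised_error(sen, spe, prec)
```

With this normalisation a perfect predictor scores 0 and a predictor for which every rate is 0 scores 1, which is why all four panels share the same vertical range.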

[Figure: scatter plot of the maximum number of clusters (max(# clusters), 5 to 20) against the number of residues (# residues, 0 to 2000), one dot per protein.]

Supplementary Figure S4: Maximum number of binding clusters on the protein surface as a function of the protein’s number of residues, for each of the 168 proteins in the Docking Benchmark 2.0.

[Figure: one panel per functional group (Antibody, Bound_Antibody, Antigen, Enzyme, Inhibitor, Others). Each curve gives the number of clusters (# clusters) as a function of the Threshold (0.0 to 1.0) for one protein, labelled by its benchmark identifier (for example 1AHW_r or 1AHW_l).]

Supplementary Figure S5: Individual clustering profiles for all the proteins in the Docking Benchmark 2.0.

[Figure panels: (A) 1F34_r + 1g0v; (B) 1MAH_r + 4qww; (C) 1IJK_l + 1m10; (D) 1ML0_l + 2bdn.]

Supplementary Figure S6: Mapping the PIP values on the surface of proteins with alternate binding interfaces. High PIP residues are shown in blue and low PIP residues are shown in red. The reference experimental partner is shown as a black cartoon and the alternate partner as a green cartoon.

[Figure: scatter plot of the AUC (0 to 1) against the protein number (0 to 160), with dots coloured by functional group: Enzymes, Inhibitors, Antigens, Antibodies, Bound Antibodies, Others.]

Supplementary Figure S7: AUC value for each protein in the benchmark. The dot colours indicate the protein’s functional group. The dots for the CMTI-1 squash inhibitor (1PPE_l) and the T-cell receptor β (1SBB_r), which are discussed in the manuscript, are highlighted with a red circle.

[Figure: (A) Sensitivity against 1-Specificity; (B) Precision against Sensitivity. All axes run from 0 to 1.]

Supplementary Figure S8: PIP predictions when including alternate experimental interfaces (red lines) or using only residues from the reference experimental interfaces (black lines). (A) ROC curves; the diagonal dotted line corresponds to random predictions. (B) Precision/Sensitivity curves; the dashed horizontal lines correspond to random predictions for each case.
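The curves of Supplementary Figure S8 and the AUC values of Supplementary Figure S7 can be derived from per-residue scores and interface labels. The sketch below shows one way to do so, assuming that residues with a PIP value at or above the cutoff are predicted as interface residues. The function names and the input layout (two parallel lists) are hypothetical and not taken from the manuscript.

```python
def roc_points(scores, labels):
    """Return the ROC curve as a list of (1 - specificity, sensitivity) points.

    Assumption: a residue is predicted as interface when its score is at or
    above the cutoff; labels are truthy for true interface residues.
    """
    positives = sum(1 for flag in labels if flag)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both interface and non-interface residues are needed")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        # Residues with identical scores cross the cutoff together
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def random_precision(labels):
    """Expected precision of a random predictor: the fraction of true
    interface residues, i.e. the horizontal baseline of a precision plot."""
    return sum(1 for flag in labels if flag) / len(labels)
```

A random predictor gives an area of 0.5 under the ROC curve, which corresponds to the diagonal in panel (A), while `random_precision` gives the height of the horizontal baseline in panel (B).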

Supplementary Table S1: Interface residues prediction depending on conformational change upon binding

Protein dataset                     | AUC  | Errmin | PIPmin | Cov. | Sen. | Spec. | Prec.
Complete benchmark (48161 residues) | 0.77 | 0.41   | 0.09   | 32 % | 71 % | 71 %  | 17 %
Rigid (31061 residues)              | 0.77 | 0.41   | 0.11   | 32 % | 70 % | 72 %  | 19 %
Medium (10107 residues)             | 0.78 | 0.40   | 0.07   | 31 % | 71 % | 72 %  | 17 %
Difficult (6993 residues)           | 0.77 | 0.41   | 0.07   | 30 % | 70 % | 73 %  | 15 %

Results of the interface residues prediction using the PIP index for the complete benchmark, or depending on the protein’s conformational change upon binding. All values in the Cov., Sen., Spec. and Prec. columns are obtained with the optimal PIPmin value (column 4), which corresponds to the minimum error in column 3.
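The selection of the optimal cutoff described in the table legend can be sketched as a simple sweep over candidate PIP values. Several points below are assumptions rather than statements from the source: residues scoring at or above the cutoff are predicted as interface, the error is the Euclidean distance to the ideal point in (sensitivity, specificity) space (for Sen. = Spec. = 0.71 this gives about 0.41), and coverage is the fraction of all residues predicted as interface. The function names are hypothetical.

```python
import math


def metrics_at(scores, labels, cutoff):
    """Compute prediction statistics for one cutoff (assumed rule: score >= cutoff)."""
    tp = fp = fn = tn = 0
    for score, is_interface in zip(scores, labels):
        predicted = score >= cutoff
        if predicted and is_interface:
            tp += 1
        elif predicted:
            fp += 1
        elif is_interface:
            fn += 1
        else:
            tn += 1
    sen = tp / (tp + fn) if tp + fn else 0.0
    spe = tn / (tn + fp) if tn + fp else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    return {
        "cutoff": cutoff,
        "err": math.hypot(1.0 - sen, 1.0 - spe),  # assumed error definition
        "cov": (tp + fp) / len(scores),           # assumed meaning of Cov.
        "sen": sen,
        "spe": spe,
        "prec": prec,
    }


def best_cutoff(scores, labels, cutoffs):
    """Return the statistics of the cutoff that gives the smallest error."""
    return min((metrics_at(scores, labels, c) for c in cutoffs),
               key=lambda m: m["err"])
```

Called on the residues of one dataset with a grid of cutoffs, `best_cutoff` returns one dictionary whose entries play the role of the Errmin, PIPmin, Cov., Sen., Spec. and Prec. columns.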
