Predicting class II MHC/peptide multi-level binding with an iterative ...

Vol. 17 no. 10 2001 Pages 942–948

BIOINFORMATICS

Predicting class II MHC/peptide multi-level binding with an iterative stepwise discriminant analysis meta-algorithm

R. R. Mallios

Office of Sponsored Projects and Research, University of California, San Francisco, 2615 East Clinton Avenue, Fresno, CA 93703, USA

Received on April 20, 2001; revised and accepted on July 12, 2001

ABSTRACT
Motivation: Predicting peptides that bind to both Major Histocompatibility Complex (MHC) molecules and T cell receptors provides crucial information for vaccine development. An agretope is that portion of a peptide that interacts with an MHC molecule. The identification and prediction of agretopes is the first step towards vaccine design.

Results: An iterative stepwise discriminant analysis meta-algorithm is utilized to derive a quantitative motif for classifying potential agretopes as high-, moderate- or non-binders for HLA-DR1, a class II MHC molecule. A large molecular online database provides the input for this data-driven algorithm. The model correctly classifies over 85% of the peptides in the database.

Availability: Stepwise discriminant analysis software is available commercially in SPSS and BMDP statistical software packages. Peptides known to bind MHC molecules can be downloaded from http://wehih.wehi.edu.au/mhcpep/. Peptides known not to bind HLA-DR1 are available from the author upon request.

Contact: [email protected]

INTRODUCTION
The initial immune response to an extracellular pathogen begins with the capture of the pathogen by a macrophage, dendritic cell, or B lymphocyte. In the cell’s interior, the protein portion of the pathogen is degraded into peptide fragments. Class II Major Histocompatibility Complex (MHC) molecules bind to areas of the peptide fragments that are designated agretopes. The agretope/MHC complex travels to the cell surface where the class II MHC molecule displays the fragment to nearby CD4 T lymphocytes. When a CD4 T lymphocyte binds to the exposed area of the peptide fragment, designated an epitope, an immune response is initiated.

Each binding peptide fragment consists of a linear arrangement of amino acid residues. Knowledge of the amino acid sequence of an agretope is useful in vaccine development and immunotherapy. A motif or quantitative model that recognizes agretopes can be used to screen large numbers of potential binding peptides, reducing laboratory time and costs.

The class II MHC binding site has been shown to bind ligands of 9–25 residues. X-ray crystallography reveals that the binding site is open at both ends (Brown et al., 1993; Stern et al., 1994). This makes agretope prediction difficult because it is not known which segment of a binding peptide is actually involved in the binding. An algorithm for motif derivation must therefore establish proper alignment as well as extract the motif.

Early studies of agretope prediction focused on extracting motifs from sequenced peptides known to bind various class II MHC molecules. Chicz et al. (1992) isolated and sequenced peptides bound to DR1. They aligned the resulting sequences based on positions 1, 6, and 10 to extract a putative motif. Pool sequencing of natural DR ligands provided Falk et al. (1994) with amino acid concentrations at positions 1 through 16. Their motif for DR1 selected positions 1, 4, and 9 as major anchors. Hammer et al. (1992, 1993) used M13 phages to display a library of 9-residue peptides. Sequence analysis of the peptide encoding regions of the 60 phages that bound to DR1 provided 60 aligned 9-residue binding peptides. Utilizing the sequences of the bound peptides, histograms were created to display the amino acid composition of each of the 9 positions.

Assessing the accuracy of an agretope prediction algorithm requires evaluation of non-binding peptides as well as binders. O’Sullivan et al. (1991b) evaluated the following DR binding motif on a database of 110 DR1 binders and 109 DR1 non-binders: W, F, Y, V, I, L in position 1; A, V, I, L, P, C, S, T in position 6; and A, V, I, L, C, S, T, M, Y in position 9. They reported that the motif was present in 69% of good binders, 55% of intermediate binders, 31% of weak binders and 16.5% of negative binders. Brusic et al. (1998) analyzed a large database of 338 DR4 binders and 312 DR4 non-binders using an evolutionary algorithm and an artificial neural network. In cross-validation, their method correctly classified 83% of high-affinity binders, 73% of moderate-affinity binders, 50% of low-affinity binders and 77% of zero-affinity binders. When the method was used to classify a new set of data, the comparable results were 100, 90, 30 and 70%.

Mallios (1999) introduced, in detail, an iterative Stepwise Discriminant Analysis (SDA) meta-algorithm that classifies peptides into those that bind a given class II MHC molecule and those that do not bind. Classification of 526 DR1 binders and 98 DR1 non-binders yielded sensitivity = 97%, specificity = 76%, and accuracy = 94%.

The multi-level prediction problem of classifying peptides into categories of binding affinity is much more difficult than the dichotomous problem. Towards resolving the multi-level problem, many groups have worked on measuring and recording peptide/MHC binding affinities. O’Sullivan et al. (1991a,b), Marshall et al. (1995), and Fleckenstein et al. (1996) utilized competition assays to facilitate quantification. In this strategy to measure binding affinity, a reference peptide that is known to bind is labeled. Quantities of the peptide to be evaluated are added until 50% of the reference peptide is displaced. IC50 is defined as the concentration of the test peptide that displaces 50% of the reference peptide (Gulukota et al., 1997). Thus, a low IC50 implies strong binding. All IC50 levels are relative to the reference peptide.

Currently, the binding affinities found in public databases are based on a variety of experimental methods for assaying peptide binding. As such, a peptide reported as a high binder by one method might be classified as a moderate binder by another method. As laboratory methods and conditions become standardized, quantitative analyses of class II MHC binding data will improve in accuracy.
This study explores expansion of the dichotomous iterative SDA meta-algorithm to the general multi-level problem. It seeks to ascertain if the algorithm is relevant and if so, how it compares with other approaches.
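The position-specific motif of O’Sullivan et al. (1991b) quoted earlier in this section can be written as a simple membership test. The sketch below is only illustrative: the residue sets are taken from the text, but the function names and the rule that the motif counts as "present" when any 9-residue window satisfies it are assumptions, not the procedure reported by those authors.

```python
# Residue sets for positions 1, 6 and 9, as quoted in the text.
MOTIF = {
    1: set("WFYVIL"),
    6: set("AVILPCST"),
    9: set("AVILCSTMY"),
}


def window_matches(window):
    """True if a 9-residue window satisfies all three motif positions."""
    return all(window[pos - 1] in allowed for pos, allowed in MOTIF.items())


def motif_present(peptide, n=9):
    """Assumption: the motif is 'present' if any length-n window matches."""
    return any(
        window_matches(peptide[i:i + n])
        for i in range(len(peptide) - n + 1)
    )
```

Scanning every window reflects the alignment problem described above: because the binding site is open at both ends, the register of the motif within a longer peptide is not known in advance.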

SYSTEMS AND METHODS
The MHCPEP Database (Brusic et al., 1997) is a source for peptides known to bind MHC molecules. It is located on the internet at http://wehih.wehi.edu.au/mhcpep/. The description reads, ‘MHCPEP is a curated database comprising over 13 000 peptide sequences known to bind MHC molecules. Entries are compiled from published reports as well as from direct submissions of experimental data. Each entry contains the peptide sequence, its MHC specificity and, when available, experimental method, observed activity, binding affinity, source protein, anchor positions, and publication references.’ Binding affinities are expressed as high, moderate or low. The 526 peptide sequences that bind DR1, along with their binding affinities, were downloaded for analysis in this study. A database published by O’Sullivan et al. (1990) provides sets of non-binding peptides for DR1, DR2, DR5, and DR52a. Ninety-seven peptides that do not bind DR1 were obtained from this source.

Given a set of observations that have been classified into mutually exclusive sets, SDA (Dixon et al., 1990) builds Bayesian discriminant functions that classify observations into one of the defined sets. For this application, classification is based upon binding affinity. Each level of binding is assigned a categorical outcome variable of 0 to m. A dataset is composed of subsequences of length n derived from the peptides of interest, along with the appropriate outcome variable, 0 to m, denoting the binding affinity of the peptide of origin.

For each observation, 20 × n possible predictor variables A1, C1, D1, E1, F1, G1, H1, I1, K1, L1, M1, N1, P1, Q1, R1, S1, T1, V1, W1, Y1, . . . , An, Cn, Dn, En, Fn, Gn, Hn, In, Kn, Ln, Mn, Nn, Pn, Qn, Rn, Sn, Tn, Vn, Wn, Yn are calculated. Each predictor variable refers to a specific amino acid residue occupying a specific position in the subsequence. The single-letter abbreviations for amino acid residues are listed alphabetically. Each potential predictor variable is assigned a 1 if the designated residue occupies the position in question, a 0 otherwise.

Analyzing datasets with SDA produces one set of coefficients for each binding level. If j is the binding level and i is the number of steps completed in the SDA, the classification function is

$$u_j = b_{0j} + b_{1j} v_{1j} + b_{2j} v_{2j} + \cdots + b_{ij} v_{ij}$$

where v_{1j} through v_{ij} are the predictor variables selected by SDA and b_{1j} through b_{ij} are the corresponding coefficients. Since the value of all predictor variables is either 0 or 1, the value of u_j reduces to the sum of the coefficients of the variables present in the subsequence plus the constant b_{0j}. Classification functions are converted to the probability of set membership by the following relationship:

$$P_j = \frac{e^{u_j}}{\sum_{i=0}^{m} e^{u_i}}$$

where P_j is the probability that the subsequence belongs to binding level j.

The predicted classification of a subsequence is determined by selecting the binding level that is associated with the greatest P_j (or u_j). To evaluate the effectiveness of a set of classification functions, the actual binding level and the predicted classification level for each observation are compared to determine accuracy.

The jack-knife method of cross-validation (Afifi and Clark, 1990) is also reported. It is a special case of the general cross-validation method in which the classification functions are computed on a subset of cases, and the probability of misclassification is estimated from the remaining cases. In the jack-knife method, the first case is set aside while a classification function is computed on all remaining cases. The first case is evaluated by the classification function and tallied as being correctly or incorrectly classified. The process continues with the second case until each case has been left out in turn and classified.

INITIALIZATION
1. BUILD NON-BINDING DATASET: Enter every subsequence of length n from each non-binding peptide.
2. BUILD INITIAL BINDING DATASET: Enter every subsequence of length n from each binding peptide.
3. BUILD CURRENT MODEL: Using STEPWISE DISCRIMINANT ANALYSIS and the above-mentioned datasets.

BUILD CURRENT BINDING DATASET: Using the CURRENT MODEL, select best subsequence of length n from each binding peptide.
BUILD NEW MODEL: Using STEPWISE DISCRIMINANT ANALYSIS, the CURRENT BINDING DATASET, and the NON-BINDING DATASET.
NEW MODEL EQUALS CURRENT MODEL?
YES: EXIT.
NO: NEW MODEL BECOMES CURRENT MODEL; return to BUILD CURRENT BINDING DATASET.

Fig. 1. The iterative algorithm.
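The indicator-variable encoding and the conversion from classification functions to probabilities described in this section can be sketched as follows. This is a minimal illustration, not the SPSS or BMDP implementation: the function names are hypothetical, and the coefficients passed to `score` stand in for whatever variables SDA happens to select.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 single-letter codes, alphabetical


def encode(subseq):
    """Return the 20 x n indicator variables, keyed like 'Y1' or 'L9'."""
    return {
        f"{aa}{pos}": int(residue == aa)
        for pos, residue in enumerate(subseq, start=1)
        for aa in AMINO_ACIDS
    }


def score(subseq, constant, coeffs):
    """u_j: the constant plus the coefficients of the variables present."""
    x = encode(subseq)
    return constant + sum(b * x.get(var, 0) for var, b in coeffs.items())


def posteriors(scores):
    """P_j = exp(u_j) / sum_i exp(u_i); the max is subtracted for stability."""
    top = max(scores)
    exps = [math.exp(u - top) for u in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Because every indicator is 0 or 1, `score` simply adds up the coefficients of the residue-position pairs that occur in the subsequence, which is the reduction noted in the text.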

ALGORITHM
Figure 1, the same flow chart that describes the dichotomous algorithm, describes the multi-level meta-algorithm. Initially, the permanent non-binding dataset is created by entering every subsequence of length n from each non-binding peptide. Similarly, the initial binding dataset is created by entering every subsequence of length n from each binding peptide. The initial application of SDA produces one classification function for each binding level as described previously.

In each subsequent iteration the non-binding dataset remains the same, while the binding dataset is created anew. For each binding peptide, the appropriate classification function is selected according to the binding level, j, of the peptide. For a given peptide, the value of P_j is calculated for each subsequence of length n. The subsequence with the largest value of P_j is selected to represent the peptide in the binding dataset. SDA is performed again and the process is repeated until the new classification functions are the same as the previous set.

The coefficients of the converged classification functions are reported in tabular form as the classification model. Evaluation of the model is reported in terms of accuracy for the final model and the jack-knife procedure.
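The loop described above can be sketched in a few lines. The sketch makes two assumptions that should be stated plainly. First, `fit_standin` is NOT stepwise discriminant analysis: it is a hypothetical stand-in that scores each level by smoothed log-frequencies of residues at each position, used here only so the example runs without a statistics package. Second, convergence is detected when the selected subsequences stop changing, which for a deterministic fitting step is equivalent to the classification functions no longer changing.

```python
import math
from collections import Counter

AMINO = "ACDEFGHIKLMNPQRSTVWY"


def windows(peptide, n):
    """Every subsequence of length n, in order."""
    return [peptide[i:i + n] for i in range(len(peptide) - n + 1)]


def fit_standin(dataset, levels, n, pseudo=1.0):
    """Stand-in for SDA (an assumption): per-level log-frequency tables."""
    model = {}
    for j in levels:
        seqs = [s for s, lvl in dataset if lvl == j]
        denom = len(seqs) + pseudo * len(AMINO)
        table = []
        for pos in range(n):
            counts = Counter(s[pos] for s in seqs)
            table.append(
                {aa: math.log((counts[aa] + pseudo) / denom) for aa in AMINO}
            )
        prior = math.log(
            (len(seqs) + pseudo) / (len(dataset) + pseudo * len(levels))
        )
        model[j] = (prior, table)
    return model


def posterior(model, subseq, j):
    """P_j for one subsequence under the current model."""
    u = {
        k: prior + sum(table[p][aa] for p, aa in enumerate(subseq))
        for k, (prior, table) in model.items()
    }
    top = max(u.values())
    return math.exp(u[j] - top) / sum(math.exp(v - top) for v in u.values())


def iterate(binders, nonbinders, n=9, max_iter=50):
    """binders: list of (peptide, level >= 1); nonbinders: list of peptides."""
    levels = sorted({0} | {lvl for _, lvl in binders})
    # The non-binding dataset is built once and never changes.
    non_ds = [(w, 0) for p in nonbinders for w in windows(p, n)]
    # Initial binding dataset: every window of every binding peptide.
    init_ds = [(w, lvl) for p, lvl in binders for w in windows(p, n)]
    model = fit_standin(non_ds + init_ds, levels, n)
    chosen = None
    for _ in range(max_iter):
        # Keep, for each peptide, the window with the largest P_j at its level.
        selected = [
            max(windows(p, n), key=lambda w: posterior(model, w, lvl))
            for p, lvl in binders
        ]
        if selected == chosen:  # same dataset -> same model: converged
            break
        chosen = selected
        bind_ds = [(w, lvl) for w, (_, lvl) in zip(chosen, binders)]
        model = fit_standin(non_ds + bind_ds, levels, n)
    return model, chosen
```

The `max_iter` guard is an addition for safety; the text itself describes iterating until the classification functions converge. Any model-fitting routine with the same inputs and outputs could be substituted for `fit_standin`.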

IMPLEMENTATION
The algorithm is demonstrated with DR1 binding peptides that have binding affinities of none, moderate, or high. Subsequences of length 9 are used because (i) many previous studies suggest motifs of length 9, and (ii) the shortest sequences in the MHCPEP database that bind DR1 consist of nine amino acid residues. The moderate and high binders were selected from the file of 526 peptides that bind DR1. The resulting database contains 171 moderate binders and 230 high binders.

In the initialization step, all subsequences of length 9 are entered into the binding dataset and the non-binding dataset. Binding levels are assigned categorical outcome variable values of 0 for none, 1 for moderate, and 2 for high. For example, the peptide hemagglutinin 306–318 is represented by the sequence PKYVKQNTLKLAT and is classified as a high DR1 binder. All five subsequences, PKYVKQNTL, KYVKQNTLK, YVKQNTLKL, VKQNTLKLA, and KQNTLKLAT, are entered into the initial binding dataset. Each subsequence is accompanied by an outcome value of 2 because the peptide of origin is a high binder. Similarly, HEL 91–106 (SVNCAKKIVSDGDGMN) does not bind DR1. The subsequences SVNCAKKIVSDGDGMN, VNCAKKIVS, NCAKKIVSD, CAKKIVSDG, AKKIVSDGD, KKIVSDGDG, KIVSDGDGM, and IVSDGDGMN are all entered into the non-binding dataset with an outcome value of 0. The 97 peptides in the non-binding dataset produce 743 subsequences of length 9. Unlike the binding dataset that evolves with each iteration, the non-binding dataset remains static.

The first application of SDA produces three classification functions. These functions are used to select the subsequences for the new binding dataset. The coefficients utilized in evaluating the sequence PKYVKQNTLKLAT appear in Table 1. P_2 is calculated for each subsequence as follows.

$$u_2(\mathrm{PKYVKQNTL}) = -12.3 + 5.34 + 2.86 + 3.29 + 4.15 + 4.57 = 7.91; \quad u_0 = 6.13; \quad u_1 = 9.70$$

$$u_2(\mathrm{KYVKQNTLK}) = -12.3 + 2.72 + 5.98 + 3.46 + 4.75 + 4.3 = 8.91; \quad u_0 = 6.6; \quad u_1 = 10.81$$

$$u_2(\mathrm{YVKQNTLKL}) = -12.3 + 4.1 + 2.9 + 5.53 + 4.53 + 4.57 = 9.33; \quad u_0 = 6.71; \quad u_1 = 10.89$$

$$u_2(\mathrm{VKQNTLKLA}) = -12.3 + 3.1 + 3.73 + 3.41 + 4.1 + 4.3 + 2.32 = 8.66; \quad u_0 = 6.14; \quad u_1 = 10.05$$

$$u_2(\mathrm{KQNTLKLAT}) = -12.3 + 2.41 + 4.89 + 5.53 + 3.66 + 4.35 = 8.54; \quad u_0 = 6.3; \quad u_1 = 10.04$$

$$P_2(\mathrm{PKYVKQNTL}) = \frac{e^{7.91}}{e^{6.13} + e^{9.7} + e^{7.91}} = 0.140$$

$$P_2(\mathrm{KYVKQNTLK}) = \frac{e^{8.91}}{e^{6.6} + e^{10.81} + e^{8.91}} = 0.129$$

$$P_2(\mathrm{YVKQNTLKL}) = \frac{e^{9.33}}{e^{6.71} + e^{10.89} + e^{9.33}} = 0.172$$


Table 1. First iteration DR1 classification model, selected predictor variables for PKYVKQNTLKLAT

Constant: −9.21 (None), −11.23 (Moderate), −12.30 (High)

Position 1

2

3

4

5

6

7

A

None 2.23 Moderate 2.52 High 3.96

K

None Moderate High

L

None 1.76 Moderate 2.54 High 2.74

N

None Moderate High

P

None Moderate High

3.93 2.40 2.38 3.67 2.94 1.55 1.55 2.99 3.43 2.18 2.18 3.55

Q

None Moderate High

2.85 2.72 4.12 3.69 3.46 2.86

T

None Moderate High

V

None 1.61 Moderate 3.14 High 3.10

Y

None 2.96 1.82 3.78 Moderate 4.04 3.14 5.47 High 4.10 2.72 5.34

8

9

2.02 2.05 2.53 2.49 3.89 2.41 1.00 2.41 2.07 3.40 2.92 4.50 3.66 2.14 3.16 2.82 3.73 3.24 5.07 3.66 2.32 4.65 5.13 4.53 1.90 2.44 3.08 2.27 3.76 2.35 1.91 2.54 2.80 4.31 3.27 5.26 3.91 4.04 3.50 3.63 4.89 4.10 5.53 4.30 4.57 4.02 3.22 2.52

4.63 4.36 3.73

1.63 0.60 0.10

4.11 2.80 2.88 1.33 3.29 1.80

3.28 1.11 4.46 2.97 4.18 2.48

3.20 3.48 2.64 4.14 2.82 3.07 3.01 4.45 3.66 5.00 4.84 5.02 2.41 3.41 2.90 4.75 4.15 4.35 4.65 5.87 5.98

5.01 2.94 5.79 2.80 2.42 6.04 4.16 7.07 4.40 4.15 6.07 4.10 7.24 4.52 4.17 3.94 2.47 3.10

3.98 2.67 2.60

$$P_2(\mathrm{VKQNTLKLA}) = \frac{e^{8.66}}{e^{6.14} + e^{10.05} + e^{8.66}} = 0.198$$

$$P_2(\mathrm{KQNTLKLAT}) = \frac{e^{8.54}}{e^{6.3} + e^{10.04} + e^{8.54}} = 0.179$$

Since P_2(VKQNTLKLA) is the largest, VKQNTLKLA is selected to represent the longer peptide in the new binding dataset. A subsequence is selected from each binding peptide in accordance with the binding level j. The new binding dataset and the static non-binding dataset are analyzed with SDA to produce a new set of classification functions. Iteration continues until the classification functions converge. In this example, convergence occurred on the eighth iteration. Table 2 displays the final classification functions.

Each subsequence in the final dataset is classified on the basis of the P_j with the highest value. YVKQNTLKL was the subsequence representing PKYVKQNTLKLAT in the final binding dataset. Using the classification functions in Table 2, P_0(YVKQNTLKL) = 0.0; P_1(YVKQNTLKL) = 0.0; and P_2(YVKQNTLKL) = 1.0. Thus, YVKQNTLKL is classified as a high binder, which is a correct classification.

Table 3 evaluates the performance of the final model and Table 4 describes the jack-knife cross-validation classification. Whereas Tables 3 and 4 report results based on subsequences of length 9 in the non-binding dataset, Table 5 reports results based on entire non-binding peptides. All model evaluations indicate high performance levels, with Tables 3 and 4 reporting accuracies greater than 90% and Table 5 reporting an accuracy of 87.6%.
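The first-iteration selection for hemagglutinin 306–318 can be re-traced numerically. The sketch below enumerates the length-9 subsequences and recomputes P_2 from the u values reported in the worked example; the variable and function names are illustrative. Because the reported u values are rounded, the recomputed probabilities may differ slightly in the third decimal place from the printed ones.

```python
import math


def windows(peptide, n=9):
    """Every subsequence of length n, in order."""
    return [peptide[i:i + n] for i in range(len(peptide) - n + 1)]


# (u_0, u_1, u_2) as reported in the text for each window.
REPORTED_U = {
    "PKYVKQNTL": (6.13, 9.70, 7.91),
    "KYVKQNTLK": (6.60, 10.81, 8.91),
    "YVKQNTLKL": (6.71, 10.89, 9.33),
    "VKQNTLKLA": (6.14, 10.05, 8.66),
    "KQNTLKLAT": (6.30, 10.04, 8.54),
}


def p_level(u, j):
    """P_j = exp(u_j) / sum_i exp(u_i), with the max subtracted for stability."""
    top = max(u)
    return math.exp(u[j] - top) / sum(math.exp(v - top) for v in u)


# The peptide is a high binder (level 2), so P_2 drives the selection.
p2 = {w: p_level(u, 2) for w, u in REPORTED_U.items()}
best = max(p2, key=p2.get)
```

Here `best` is the window with the largest P_2, which is the subsequence carried into the next binding dataset.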

DISCUSSION
Interpreting the results
Discriminant analysis, unlike regression analysis, assumes the outcome variable is categorical and not ordered. SDA focuses on finding predictor variables that separate one group from another. Thus, the coefficients in Table 2 describe variables that are important in separating high-binders, moderate-binders, and non-binders.

In most binding matrices, such as the ARB matrix developed by Southwood et al. (1998) and the dichotomous iterative SDA meta-algorithm (Mallios, 1999), there is only one entry per residue-position. The single entry can be interpreted as the relative binding affinity of the residue-position. The three coefficients in each residue-position of the matrix displayed in Table 2 allow for different binding patterns among high-binders and moderate-binders. In the examples that follow, the bracketed values are the none, moderate and high coefficients, in that order.

Variables that act in the familiar way of contributing equally to moderate and high binding have forms similar to L4[1.1, 3.7, 4.2], M4[0.2, 2.6, 3.8], E6[1.6, −0.3, −0.4], and E7[1.5, −0.1, −0.7]. F1[1.3, 0.8, 7.9], W1[1.8, 1.8, 7.4], and Y1[1.2, 1.1, 11.1] strongly support high binding, while V1[2.6, 12.2, 3.4], F4[1.3, 6.6, 1.1], and A5[1.6, 8.0, 1.3] strongly support moderate binding. Y2[1.1, 9.6, 6.4] is found most often in moderate binders, but frequently appears in high binders.

To summarize the highlights of Table 2, high binding is significantly promoted by F1, W1, Y1, F2, W2, Y2, M5, M6, I9 and L9; while moderate binding is primarily influenced by V1, A2, Y2, Y3, F4, A5, F9 and I9. These results suggest that moderate-binders follow a different pattern from high-binders. A similar study using regression analysis can corroborate or challenge this conclusion. Regression analysis, however, requires standardized, precise and reliable measurements of binding affinity.

Comparison with other studies
A review of the few studies that have examined the multi-level binding problem illustrates the broad spectrum of


Table 2. Final DR1 classification model

Constant: −3.1 (None), −16.8 (Moderate), −14.6 (High)

Position 1

2

A

None Moderate High

E

None Moderate High

F

None Moderate High

G

None Moderate High

H

None Moderate High

I

None Moderate High

K

None Moderate High

L

None Moderate High

M

None Moderate High

N

None Moderate High

P

None Moderate High

Q

None Moderate High

0.9 1.4 4.7

R

None Moderate High

1.8 4.1 3.6

T

None Moderate High

V

None Moderate High

2.6 12.2 3.4

1.2 2.0 5.0

W

None Moderate High

1.8 1.8 7.4

2.2 5.2 11.9

Y

None Moderate High

1.2 1.1 11.1

1.1 9.6 6.4


3

4

5

6

1.6 8.0 1.3

1.6 1.3 5.0

7

8

9

1.1 3.4 2.0

1.2 4.2 2.8

None Moderate High 1.7 6.8 3.8

0.8 −0.6 4.4

1.6 −0.3 −0.4 1.3 0.8 7.9

1.1 2.8 5.4

0.6 −0.7 5.7

1.5 −0.1 −0.7

1.3 6.6 1.1

2.3 6.2 2.1 2.6 4.9 2.6

1.7 2.6 4.9 1.0 4.3 0.7 1.1 5.2 1.1

1.3 1.4 4.3

1.5 2.5 3.9

1.1 5.9 0.3

2.2 11.5 7.3

2.0 −1.2 −0.1 1.7 3.5 3.4

1.1 3.7 4.2

1.7 2.1 5.2

0.2 2.6 3.8

2.4 5.8 11.1

1.5 0.4 7.6

2.0 2.2 4.5

2.3 5.7 3.2

1.2 3.3 0.1

1.7 1.0 −0.7

1.7 5.8 4.6

1.7 5.3 8.8

1.9 0.4 4.4

1.2 1.2 5.6

1.9 −0.2 0.1

2.1 2.6 0.3 1.1 1.5 4.5

0.9 3.6 1.7

1.3 1.1 3.6 1.9 1.1 −0.2 1.7 4.6 0.5

1.1 5.6 −0.3

1.0 1.5 3.7

2.2 9.4 3.5

2.0 2.5 5.5

2.3 5.6 2.7

2.0 5.6 0.9


Table 3. Performance of final DR1 classification model

Experimental                     Predicted
binding affinity     High   Moderate   Non-binder   Percent correct
High                  212          4           14              92.2
Moderate                4        156           11              91.2
Non-binder             26         17          700              94.2
Total accuracy                                                 93.4
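The percent-correct figures in Table 3 follow directly from the counts. As a brief check, the sketch below recomputes them, reading the table with experimental levels as rows and predicted levels as columns; the names `TABLE_3` and `percent_correct` are illustrative.

```python
# Rows: experimental level; columns: predicted (High, Moderate, Non-binder).
TABLE_3 = {
    "High": (212, 4, 14),
    "Moderate": (4, 156, 11),
    "Non-binder": (26, 17, 700),
}
LEVELS = list(TABLE_3)


def percent_correct(matrix):
    """Per-level and overall percent correct from a confusion matrix."""
    per_level = {}
    for k, level in enumerate(LEVELS):
        row = matrix[level]
        per_level[level] = 100.0 * row[k] / sum(row)  # diagonal / row total
    correct = sum(matrix[level][k] for k, level in enumerate(LEVELS))
    total = sum(sum(row) for row in matrix.values())
    return per_level, 100.0 * correct / total


per_level, overall = percent_correct(TABLE_3)
```

The row totals (230 high binders, 171 moderate binders, 743 non-binding subsequences) match the dataset sizes given in the Implementation section.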

Table 4. Jack-knife cross-validation classifications

Experimental                     Predicted
binding affinity     High   Moderate   Non-binder   Percent correct
High                  208          4           18              90.4
Moderate                5        151           15              88.3
Non-binder             35         21          687              92.5
Total accuracy                                                 91.5
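The jack-knife procedure behind Table 4 leaves each case out in turn, refits on the remainder, and tallies whether the held-out case is classified correctly. A minimal sketch follows. The `fit` and `predict` arguments are placeholders for any classifier; the nearest-mean classifier used to exercise it is a hypothetical stand-in, not SDA.

```python
def jackknife(cases, fit, predict):
    """cases: list of (features, label). Returns the fraction classified correctly."""
    correct = 0
    for i, (x, y) in enumerate(cases):
        training = cases[:i] + cases[i + 1:]  # every case except the i-th
        model = fit(training)
        correct += int(predict(model, x) == y)
    return correct / len(cases)


def fit_means(training):
    """Stand-in classifier (an assumption): the mean feature value per label."""
    sums = {}
    for x, y in training:
        s, c = sums.get(y, (0.0, 0))
        sums[y] = (s + x, c + 1)
    return {y: s / c for y, (s, c) in sums.items()}


def predict_nearest(means, x):
    """Assign the label whose mean is closest to x."""
    return min(means, key=lambda y: abs(means[y] - x))
```

Because the model is refitted once per case, the cost grows with the number of cases times the cost of one fit, which is why statistical packages often provide the jack-knife as a built-in option.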

Table 5. Performance of final DR1 classification model with non-binders classified by complete peptides

Experimental                     Predicted
binding affinity     High   Moderate   Non-binder   Percent correct
High                  212          4           14              92.2
Moderate                4        156           11              91.2
Non-binder                        29           68              70.1
Total accuracy                                                 87.6

approaches that are being pursued. Studying DR4 binding, Marshall et al. (1995) calculated the relative effects of all amino acids at the central 11 positions of 13-residue peptides for a set of structurally diverse peptides. Each relative effect was calculated as a ratio of two IC50 values. The denominator was determined by competing the peptide of interest with a peptide with all but the anchor residue in position 3 replaced by alanine residues. The numerator was determined by competing the peptide of interest with a peptide with all but the anchor residue and the residue of interest replaced by alanine. When the relative effects were used to predict (by calculating the product of all residue relative effects) the binding affinity of all 13-residue peptides in human myelin basic protein, all predicted values were within the experimental error of the measured IC50 values.

Fleckenstein et al. (1996) used a randomized X11 synthetic combinatorial peptide library and 220 sublibraries (11 positions × 20 amino acids) to estimate the influence of each residue in each position on DR1 binding. Relative competition values were determined for each residue by competing its library with the X11 library. Those amino acid residues with the strongest relative competition values were Y2, M2, I2, F2, W2, L5, M5, P7, A7, S8, P8, A8, G8, I10, A10, L10, V10, L11, and I11.

Southwood et al. (1998) used the polynomial method (Gulukota et al., 1997) to investigate DR4 binding. Average Relative Binding (ARB) values for each residue position were estimated from a library of 384 peptides with known binding affinities. For prediction purposes, ARBs of a given sequence were multiplied together and classified as binding or non-binding according to two different thresholds. An independent set of 50 peptides that had not been utilized in the derivation of the algorithm was used to examine predictive capacity. In both instances 80% were correctly classified.

When the previously mentioned evolutionary algorithm and artificial neural network developed by Brusic et al. (1998) was used to predict four levels of binding (high, moderate, low, and non-binding), the accuracy was 57% on the original dataset and 53% on the validating dataset. In enumerating the difficulties of using a large database composed of entries from many sources to predict multiple levels of DR binding, they list ‘the range of experimental methods for assaying of peptide binding’ and ‘the experimental and reporting errors’. These comments are relevant to the current study.

Thus, the iterative SDA meta-algorithm discussed here has been shown to produce class II MHC multi-level binding motifs in agreement with other studies. The level of accuracy, 87–93%, is competitive. Major advantages of this method are (i) it deals with alignment and motif extraction simultaneously, and (ii) it allows for different binding patterns among levels. Implementation opportunities will increase as more large databases become standardized and available.

REFERENCES
Afifi,A. and Clark,V. (1990) Computer-aided Multivariate Analysis. Van Nostrand Reinhold, New York.
Brown,J., Jardetzky,T., Gorga,J., Stern,L., Urban,R., Strominger,J. and Wiley,D. (1993) Three-dimensional structure of the human class II histocompatibility antigen HLA-DR1. Nature, 364, 33–39.
Brusic,V., Rudy,R., Kyne,A. and Harrison,L. (1997) MHCPEP, a database of MHC-binding peptides: update. http://wehih.wehi.edu.au/mhcpep/.
Brusic,V., Rudy,R., Honeyman,M., Hammer,J. and Harrison,L. (1998) Prediction of MHC class II-binding peptides using an evolutionary algorithm and artificial neural network. Bioinformatics, 14, 121–130.
Chicz,R., Urban,R., Lane,W., Gorga,J., Stern,L., Vignali,D. and Strominger,J. (1992) Predominant naturally processed peptides bound to HLA-DR1 are derived from MHC-related molecules and are heterogeneous in size. Nature, 358, 764–768.
Dixon,W.J., Brown,M.B., Engleman,L. and Jennrich,R.I. (1990) BMDP Statistical Software. BMDP Statistical Software, Los Angeles.
Falk,K., Rotzchke,O., Stevanovic,S., Jung,G. and Rammensee,H. (1994) Pool sequencing of natural HLA-DR, DQ, and DP ligands reveals detailed peptide motifs, constraints of processing, and general rules. Immunogenetics, 39, 230–242.
Fleckenstein,B., Kalbacher,H., Muller,C., Stoll,D., Halder,T., Jung,G. and Weismuller,K. (1996) New ligands binding to the human leukocyte antigen class II molecule DRB1*0101 based on the activity pattern of an undecapeptide library. Eur. J. Biochem., 240, 71–77.
Gulukota,K., Sidney,J., Sette,A. and DeLisi,C. (1997) Two complementary methods for predicting peptides binding major histocompatibility complex molecules. J. Mol. Biol., 267, 1258–1267.
Hammer,J., Takacs,B. and Sinigaglia,F. (1992) Identification of a motif for HLA-DR1 binding peptides using M13 display libraries. J. Exp. Med., 176, 1007–1013.
Hammer,J., Valsasnini,P., Tolba,K., Bolin,D., Higelin,J., Takacs,B. and Sinigaglia,F. (1993) Promiscuous and allele-specific anchors in HLA-DR-binding peptides. Cell, 74, 197–203.
Mallios,R.R. (1999) Class II MHC quantitative binding motifs derived from a large molecular database with a versatile iterative stepwise discriminant analysis meta-algorithm. Bioinformatics, 15, 432–439.
Marshall,K.W., Wilson,K.J., Liang,J., Woods,A., Zaller,D. and Rothbard,J.B. (1995) Prediction of peptide affinity to HLA DRB1*0401. J. Immunol., 154, 5927–5933.
O’Sullivan,D., Sidney,J., Apella,E., Walker,L., Phillips,L., Colon,S., Miles,C., Chesnut,R. and Sette,A. (1990) Characterization of the specificity of peptide binding to four DR haplotypes. J. Immunol., 145, 1799–1808.
O’Sullivan,D., Sidney,J., Del Guercio,M., Colon,S. and Sette,A. (1991a) Truncation analysis of several DR binding epitopes. J. Immunol., 146, 1240–1246.
O’Sullivan,D., Arrhenius,T., Sidney,J., Del Guercio,M., Albertson,M., Wall,M., Oseroff,C., Southwood,S., Colon,S., Gaeta,F. and Sette,A. (1991b) On the interaction of promiscuous antigenic peptides with different DR alleles. J. Immunol., 147, 2663–2669.
Southwood,S., Sidney,J., Kondo,A., del Guercio,M., Apella,E., Hoffman,S., Kubo,R., Chestnut,R., Grey,H. and Sette,A. (1998) Several common HLA-DR types share largely overlapping peptide binding repertoires. J. Immunol., 160, 3363–3373.
Stern,L., Brown,J., Jardetzky,T., Gorga,J., Urban,R., Strominger,J. and Wiley,D. (1994) Crystal structure of the human class II MHC protein HLA-DR1 complexed with an influenza virus peptide. Nature, 368, 215–221.
