Research

An Improved Model for Prediction of Retention Times of Tryptic Peptides in Ion Pair Reversed-phase HPLC
ITS APPLICATION TO PROTEIN PEPTIDE MAPPING BY OFF-LINE HPLC-MALDI MS*

O. V. Krokhin‡§¶, R. Craig§, V. Spicer‡, W. Ens‡§, K. G. Standing‡§, R. C. Beavis§, and J. A. Wilkins§

The proposed model is based on the measurement of the retention times of 346 tryptic peptides in the 560- to 4,000-Da mass range, derived from a mixture of 17 protein digests. These peptides were measured in HPLC-MALDI MS runs, with peptide identities confirmed by MS/MS. The model relies on summation of the retention coefficients of the individual amino acids, as in previous approaches, but additional terms are introduced that depend on the retention coefficients for amino acids at the N terminus of the peptide. In the 17-protein mixture, optimization of two sets of coefficients, along with additional compensation for peptide length and hydrophobicity, yielded a linear dependence of retention time on hydrophobicity, with an R² value of about 0.94. The predictive capability of the model was used to distinguish peptides with close m/z values and for detailed peptide mapping of selected proteins. Its applicability was tested on columns of different sizes, from nano- to narrow-bore, and for direct sample injection or injection via a pre-column. It can be used for accurate prediction of retention times for tryptic peptides on reversed-phase (300-Å pore size) columns of different sizes with a linear water-ACN gradient and with TFA as the ion-pairing modifier. Molecular & Cellular Proteomics 3:908–919, 2004.

From the ‡Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada; and §Manitoba Centre for Proteomics, University of Manitoba, Winnipeg, MB, R3E 3P4, Canada
Received, February 26, 2004, and in revised form, July 6, 2004
Published, MCP Papers in Press, July 6, 2004, DOI 10.1074/mcp.M400031-MCP200
© 2004 by The American Society for Biochemistry and Molecular Biology, Inc. This paper is available on line at http://www.mcponline.org
Molecular & Cellular Proteomics 3.9

The application of MS to biomolecular analysis has revolutionized protein research within the past decade (1). This can be mostly attributed to the development of ionization techniques that are compatible with biomolecules, i.e. MALDI (2, 3) and ESI (4), as well as improved instrumentation. However, although modern mass spectrometers provide high mass accuracy and sensitivity, the protein complexity and concentration range usually found in biological samples still present a challenge. The problem has been traditionally attacked by separation of complex protein mixtures by two-dimensional gel electrophoresis, with subsequent protein in-gel digestion, followed by ESI or MALDI MS. This remains one of the most popular sample preparation procedures, especially suitable for protein identification and quantitation. However, the method is best suited for higher abundance proteins with masses greater than 12–14 kDa, and some categories of molecules, such as membrane proteins (1) or species with extremes in isoelectric points, are handled poorly. There are also difficulties in adapting the method to high-throughput applications.

Alternative analytical approaches are based on pre-fractionation of protein mixtures or cell lysates before the final MS steps of analysis (5–9). This often involves proteolytic digestion, followed by one- or multi-dimensional chromatographic separation of the resulting peptides, with subsequent detection by MS/MS. Such a method may yield considerable simplification of the problem, because the fractions from on- or off-line HPLC separations have reduced complexity compared with the original sample. Indeed, the combination of HPLC-ESI (MS or MS/MS) has proved to be a "workhorse" for large-scale high-throughput proteomics (9, 10), because of its ability to deal with complex samples and to be fully automated (11–13). However, the optimum conditions for operation of the HPLC and the mass spectrometer are usually different, so on-line coupling of the HPLC to the mass spectrometer, as commonly used for ESI, may require undesirable compromises for both approaches, as well as imposing strict time constraints. In contrast, the simplest mode of coupling HPLC to a MALDI system is off-line, which has the advantage of completely decoupling the two techniques, enabling separate optimization of each, and removing any time constraints on the mass spectrometric measurements. The added capability to archive (14) samples in conjunction with advanced MALDI MS technologies (15–19) makes the HPLC-MALDI MS (MS/MS) combination attractive for detailed studies of protein sequences, particularly those that include post-translational modifications (20, 21).

High-efficiency capillary electrophoresis separations with peak widths of a few seconds require the use of continuous trace vacuum deposition (22, 23), and a similar deposition system has been used successfully for off-line coupling of nano- and capillary LC to the mass spectrometer (24, 25). However, such a complicated system is usually unnecessary in the latter case, because chromatographic peaks often have an average duration of 20–30 s at half-height. Under such conditions, off-line coupling of liquid-phase separations to MALDI MS can be based on ordinary fraction collection and does not require serious instrumental modifications (26–28). Alternatively, the HPLC column effluent can be sprayed (29), dispensed (30, 31), or simply deposited (32) on a MALDI target in air using a three-dimensional deposition device, thus producing a suitable MALDI target in a single step.

In these applications, HPLC has normally been used simply as a separation device, without considering the additional information that might be derived from the chromatographic retention times. Although the resolving capabilities of HPLC and modern MS are not comparable, such information can in fact be used to differentiate between peptides, such as isomers, that are otherwise indistinguishable even by high-accuracy mass measurement (33). More generally, the inclusion of this type of information can increase the confidence of protein identification by peptide mass fingerprinting.

Several approaches have been proposed for predicting peptide retention times in reversed-phase (RP)¹ HPLC. These models are based on the general assumption that the chromatographic behavior of a peptide is mainly dependent on its amino acid composition (34). In several cases, sets of retention coefficients for individual amino acids were generated from computer-calculated regression analyses of the retention times of peptides of varying composition (35–39).
Another method for determining the contribution of individual amino acid residues was based on measurement of the retention times of model synthetic peptides, for example using Ac-Gly-X-X-(Leu)3-(Lys)2-amides, where position X was substituted by each of the 20 amino acids found in proteins (40, 41). Later, the same group introduced a logarithmic correction factor for accurate prediction of retention time of peptides having more than 20 amino acids (42). Such algorithms have reported high correlations (~0.99) for the peptides used to generate them, but it is not clear how effective they would be for other samples. In fact, the recent application of a combined prediction model to a mixture of tryptic peptides from commercially available proteins revealed a considerably lower predictive potential (43).

An alternative approach for generating large lists of peptides and their retention times employs MS/MS to sequence sets of HPLC-separated peptides derived from protein digests. A dataset (~7,000 identified peptides) was used for optimization of a model of retention using an artificial neural network (ANN) (33). The large amount of experimental data provided by this approach produced a more accurate determination of retention coefficients for individual amino acids. We believe, however, that a predictive model can be improved significantly by the introduction of sequence-specific correction factors, taking account of the distribution of amino acids along the peptide chain, as well as the overall amino acid composition.

This article describes a series of correction factors for more accurate prediction of retention times for peptides during RP HPLC. This model was created based on analysis of a single µ-HPLC-MALDI MS run on a 17-protein tryptic digest mixture containing 346 peptides in the 560- to 4,000-Da mass range.

¹ The abbreviations used are: RP, reversed-phase; ANN, artificial neural network; PNGase F, peptide N-glycosidase F; DHB, 2,5-dihydroxybenzoic acid; ID, inner diameter; ABRF, Association of Biomolecular Resource Facilities; PRG, Proteomics Research Group.

EXPERIMENTAL PROCEDURES

Choice of Experimental Conditions. The intent of the study was to develop a general model that could predict peptide elution patterns from C18 columns. A linear gradient was used for the peptide elution, as there is a strong linear correlation between peptide retention times and peptide hydrophobicity (34). The experimental conditions were selected to be as broadly applicable as possible. Furthermore, the use of off-line separation and subsequent MALDI MS permitted optimal ion pairing with TFA without the risk of subsequent interference in the ionization process. The choice of the pool of peptides used to develop a model is critical for the evaluation of parameters influencing retention times. Ideally, peptides containing all 20 amino acid residues and spanning a range of sizes and hydrophobicities should be well represented. The approach taken in the present studies was to produce an equimolar (2 pmol each per injection) mixture of the tryptic digests of 17 proteins (Table I). The sequences of all of these proteins were known, thus simplifying peptide identification and offering a reasonably diverse sampling of peptides. However, it is noteworthy that all of the proteins (except for the α5 and β1 integrin chains) are soluble proteins, and as such one might anticipate that hydrophobic peptides are under-represented in this mixture.

Preparation of Digests for HPLC-MS Analysis. Unless otherwise noted, the same in-solution digestion protocol was used in all experiments. Proteins were reduced (10 mM DTT, 30 min, 57 °C), alkylated (50 mM iodoacetamide, 30 min in the dark at room temperature), dialyzed against 100 mM NH4HCO3 (6 h, 7-kDa molecular mass cutoff; Pierce, Rockford, IL), and digested overnight with sequencing-grade modified trypsin (1/100 enzyme/substrate weight ratio, 12 h, 37 °C; Promega, Madison, WI).
The human α5β1 integrin samples were processed with and without N-deglycosylation with peptide N-glycosidase F (PNGase F; Roche Molecular Biochemicals, Indianapolis, IN). Deglycosylation of α5β1 integrin was performed between the dialysis and trypsinization steps (0.1 U/µl PNGase F, overnight, 37 °C).

For this study, we performed µ-HPLC-MALDI MS analysis on several different samples. First, a 17-component protein digest mixture was prepared from 15 commercially available proteins, plus the α and β chains of human α5β1 integrin. The mixture was made by separately digesting 1 mg/ml solutions of the 15 proteins (Table I) in 100 mM ammonium bicarbonate; a digest of affinity-purified nondeglycosylated human α5β1 integrin (~300 µg/ml) was also included (44). Digests of each protein were analyzed separately by MALDI MS to confirm protein identity. A mixture of these 17 protein digests (0.4 pmol/µl of each) was prepared by appropriate dilution in 0.2% aqueous TFA. Five microliters of the mixture (2 pmol of each protein) was injected into the µ-HPLC system. Second, a tryptic digest of deglycosylated α5β1 integrin was acidified with 0.5% TFA (final concentration), and 5 µl of ~300 µg/ml digest was analyzed. Third, lyophilized ABRF PRG03 sample (tryptic digest of bovine protein disulfide isomerase) was resuspended in 0.2% formic acid and injected (0.5 pmol in 5 µl) into the nano-LC system (20). Fourth, an SDS-PAGE-separated sample of SARS virus spike glycoprotein (21) was digested in-gel with trypsin according to the procedure described by Shevchenko et al. (45). The peptides were extracted from the gel, lyophilized, and resuspended in 5 µl of 0.2% TFA prior to µ-HPLC-MALDI MS analysis. Fifth, a tryptic digest was prepared from the total urinary proteins of a healthy male donor. Fresh urine was centrifuged at 10,000 rpm for 5 min to remove cell debris, and the supernatant was dialyzed against 100 mM NH4HCO3 for 4 h using 7-kDa molecular mass cutoff dialysis tubes from Pierce. A digest was prepared from 100 µl of urine as described for the protein mixture. The digest was lyophilized and resuspended in 10 µl of 0.2% TFA. A 5-µl aliquot of the resulting peptide mixture was subjected to µ-HPLC-MALDI MS analysis.

Chromatography and Fraction Collection. Deionized (18 MΩ) water and HPLC-grade ACN were used for the preparation of eluents. Column temperature was maintained at 30 °C throughout all experiments. Chromatographic separations were performed using three different HPLC configurations:

1. A micro-Agilent 1100 Series system (Agilent Technologies, Wilmington, DE) operated in micro-flow mode was used for analysis of the majority of samples described here. Samples (5 µl) were injected directly onto a 150-µm × 150-mm column (Vydac 218 TP C18, 5 µm; Grace Vydac, Hesperia, CA) and eluted with a linear gradient of 1–80% ACN (0.1% TFA) in 60 min (1.32% ACN per minute) at 4 µl/min flow rate. UV absorbance was monitored at 214 nm using a 500-nl flow cell. PEEK 65-µm inner diameter (ID) and fused silica 50-µm ID tubings were used for pre- and post-column liquid connections. The column effluent (4 µl/min) was mixed on-line with 2,5-dihydroxybenzoic acid (DHB) MALDI matrix solution (0.5 µl/min) and deposited by a computer-controlled robot (32) onto a movable gold target at 1-min intervals. A Microtee P775 (Upchurch Scientific, Oak Harbor, WA) was used for on-line mixing in the micro-flow version. Forty fractions were collected, as the majority of tryptic peptides were eluted in 40 min under the chromatographic conditions used. Fractions were air-dried and subjected to MALDI MS analysis. In order to estimate applicability of the model to different C18 sorbents, Zorbax 300SB-C18 (Agilent Technologies) packing material (150-µm × 150-mm column) was used for separation of the same sample mixture under separation conditions identical to those described above. In one series of experiments, a precolumn (300 µm × 50 mm, Vydac 218 TP C18, 5 µm) was loaded with a 50-µl volume of the protein mixture containing a total of 2 pmol of each protein at a flow rate of 30 µl/min in 1% ACN, 0.1% TFA-water carrier solution. The sample was analyzed using a 1–80% ACN gradient over 60 min.
2. A micro-Agilent 1100 Series system operated in normal-flow mode was used for separation of the 17-protein digest mixture on a narrow-bore column.
Sample (20 µl, 8 pmol of each protein) was injected onto a 1-mm × 100-mm column (Vydac 218 TP C18, 5 µm) and eluted with a linear gradient of 1–80% ACN (0.1% TFA) in 60 min at 100 µl/min flow rate. UV absorbance was monitored at 214 nm using a 1-µl flow cell. In this case, 1-min fractions were collected and lyophilized. Later they were resuspended in 0.2% TFA, then deposited manually on a gold target and analyzed by MALDI MS.
3. A nano-flow version of the Ultimate system (LC Packings, San Francisco, CA) was used for analysis of the ABRF PRG03 sample (20). Five microliters of the sample containing 0.5 pmol of tryptic digest of bovine protein disulfide isomerase was injected onto a 300-µm × 5-mm C18PM column (LC Packings) and separated on a 75-µm × 250-mm (Vydac 218 TP C18, 3 µm) column with a linear gradient of 1–80% ACN (0.1% TFA) in 75 min. Fused silica 20-µm ID tubings were used for liquid connections. The column effluent (200 nl/min) was mixed on-line (P773 nano Y connector; Upchurch Scientific) with DHB matrix solution (50 nl/min) and deposited onto a movable gold target at 1-min intervals.

We used fraction number as a measure of peptide retention time. If the full intensity of a peak was contained in a single fraction, the peak was assigned a retention time equal to the fraction number. However, if that peak's signal was distributed between two consecutive fractions, the assigned retention time was the intensity-weighted average of the fraction numbers. More than 95% of all peptides identified in the 1.32% per minute gradient run were found in one or two fractions, but there were a few cases of peak tailing. In these cases, only the two fractions that showed the most intense signal for a given peptide were taken into account.

TOF MS. The identity of each of the digested proteins was confirmed by peptide mass fingerprinting. Each protein digest was mixed 1:1 with matrix solution (150 mg/ml DHB in 1:1 water:ACN), deposited on a gold-plated MALDI target, air dried, and subjected to MALDI MS analysis. The spots from each individual digest as well as the chromatographic fractions were analyzed by MS with m/z range 560–5,000, and by MS/MS in the Manitoba/Sciex prototype QqTOF mass spectrometer (Sciex, Thornhill, ON, Canada) (15). In this instrument, ions are produced by irradiation of the sample with photon pulses from a 20-Hz nitrogen laser (VCL 337ND; Spectra-Physics, Mountain View, CA) with 300 mJ energy per pulse. Orthogonal injection of ions from the quadrupole into the TOF section normally produces a mass resolving power of ~10,000 FWHM and accuracy within a few mDa in the TOF spectra in both MS and MS/MS modes.
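The fraction-based retention time assignment described under Chromatography and Fraction Collection can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `assign_retention_time` and the dictionary input format are hypothetical, and the sketch does not check that the two retained fractions are consecutive.

```python
def assign_retention_time(fraction_intensities):
    """Return a retention time in units of fraction number.

    fraction_intensities maps fraction number -> signal intensity for one
    peptide (a hypothetical input format chosen for this sketch).
    """
    if not fraction_intensities:
        raise ValueError("peptide was not observed in any fraction")
    # Keep at most the two most intense fractions, mirroring the rule
    # applied to tailing peaks in the text.
    top = sorted(fraction_intensities.items(),
                 key=lambda item: item[1], reverse=True)[:2]
    total = sum(intensity for _, intensity in top)
    if total <= 0:
        raise ValueError("intensities must be positive")
    # Intensity-weighted average of the fraction numbers.
    return sum(fraction * intensity for fraction, intensity in top) / total
```

A peak found only in fraction 21 is assigned 21; a peak split 3:1 between fractions 21 and 22 is assigned 21.25, and a weak tail in a third fraction does not change that value.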
Software and Programming. M/z, ProFound, and PepMap programs (Manitoba Centre for Proteomics, www.proteome.ca) were used for peak assignment, peptide mass fingerprinting, and protein peptide mapping, respectively. A signal-to-noise ratio of 2.5 was used for automatic peak assignment by M/z. PepMap compares a protein sequence with a submitted set of m/z values, yielding sequence coverage information. The software for calculation of the hydrophobicity of each peptide was written in Perl; the code was developed on an Apple G3 iMac computer using Mac Perl 5.6 (www.macperl.com) for MacOS 9. Additional work was also done using the Perl 5.6 component of MacOS 10.2.5 (www.apple.com), but the code should be completely portable to any platform supporting Perl. Perl has many advantages for bioinformatic applications due to its extreme portability between platforms, loose variable typing, associative arrays, and fast edit-execute cycles.

RESULTS AND DISCUSSION

A chromatogram of the separated 17-protein digest mixture is shown in Fig. 1. A total of 446 peptides were tentatively identified in 40 fractions by their masses as determined by MALDI MS (20 ppm tolerance). There were 112, 267, and 67 of these peptides in the m/z 560–1,000, 1,000–2,000, and 2,000–4,000 ranges, respectively. The identities of 378 of these fragments were confirmed by MS/MS measurements (Table I). Only these were used for the development of our model. Fig. 2 illustrates spectra of the DLLFR peptide from human/bovine apo-transferrin as an example of MS/MS identification.
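To make the 20 ppm mass tolerance concrete, the following minimal sketch shows how a measured m/z value might be compared with a theoretical one. The function names are hypothetical, and this is not the matching code of the M/z or ProFound programs, whose internals are not described here.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the observed value lies within +/- tol_ppm of the theoretical one."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm
```

For orientation, a 20 ppm window at m/z 663 corresponds to roughly ±0.013 m/z units, which is why peptides with close m/z values can remain indistinguishable by mass alone.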

FIG. 1. Chromatogram of µ-HPLC separation of the 17-protein digest mixture with 1.32% ACN per minute gradient.

TABLE I
MALDI MS analysis of the HPLC fractions of the 17-protein mixture
Protein sequence coverage and number of peptides matched are shown.

Protein (species)                                    Mr (kDa)   Sequence coverage (no. of peptides identified)
                                                                Found by MS    Confirmed by MS/MS
Integrin α5 chain (human)                            115.7      38% (31)       34% (26)
Integrin β1 chain (human)                            96.2       87% (56)       53% (45)
Apo-transferrin (bovine)                             79.9       81% (54)       66% (41)
Apo-transferrin (human)                              79.3       77% (56)       73% (53)
Phosphorylase B (rabbit)                             91.7       43% (30)       35% (24)
Serum albumin (bovine)                               71.3       76% (48)       71% (44)
Serum albumin (human)                                68.4       86% (52)       83% (49)
Catalase (bovine)                                    57.8       69% (35)       60% (27)
Pepsinogen (porcine)                                 39.9       5% (3)         5% (3)
Glyceraldehyde-3-phosphate dehydrogenase (rabbit)    35.9       39% (13)       31% (9)
Carbonic anhydrase (bovine)                          29.0       67% (15)       61% (13)
Chymotrypsinogen (bovine)                            26.2       38% (8)        38% (7)
β-Lactoglobulin (bovine)                             18.55      60% (9)        55% (8)
Heart myoglobin (horse)                              16.9       81% (12)       77% (9)
α-Lactalbumin (bovine)                               14.6       40% (8)        35% (6)
Ribonuclease A (bovine)                              14.0       76% (6)        72% (5)
Heart cytochrome C (horse)                           11.7       63% (10)       59% (9)

FIG. 2. Representative (a) MALDI MS of non-separated 17-protein digest (note position of peak 663 m/z); (b) mass spectrum of fraction 21 after µ-HPLC fractionation; (c) MS/MS spectrum of 663 m/z from fraction 21 corresponding to the DLLFR peptide.

Model Development. According to Guo et al. (40), the predicted retention time ($\tau$) for a gradient of 1% ACN per minute equals the sum of the retention coefficients ($\sum R_c$) for the amino acid residues and end groups, plus the time for elution of non-retained compounds ($t_0$) and the time correction for a peptide standard ($t_s$):

$$\tau = \sum R_c + t_0 + t_s$$

The value of $t_s$ was obtained from the observed retention time of a standard peptide:

$$t_s = (t_R)_{\mathrm{std}} - \left(\sum R_c + t_0\right)$$

where $(t_0 + t_s)$ is a gradient delay time that corresponds to the time needed for the first portion of the eluent (1% ACN in our case) to reach the detector. The $t_0$ term corresponds to the dead volume of the HPLC system (i.e. the post-injector dead volume of the column and tubing); $t_s$ corresponds to the dead volume of the system from the gradient mixer to sample injector. Depending on the position of the mixer (before the pumps for low-pressure gradient systems or after the pumps for high-pressure ones), the primary flow, the splitting ratio, and the dead volume of the system from eluent mixer to sample injector, the corresponding gradient delay time for a typical capillary LC separation can be 50 min or greater (46). The $(t_0 + t_s)$ value can be estimated in a gradient system by providing a one-step gradient with a momentary increase of ACN concentration at the same time as sample injection. We found $(t_0 + t_s) = 10.0 \pm 0.5$ min for the µ-HPLC system used in this study. All peptides eluting earlier than $(t_0 + t_s)$ can be divided into two groups: non-retained analytes (eluted at $t_0$), and analytes eluted in isocratic conditions at 1% ACN in the eluent (eluted between $t_0$ and $t_0 + t_s$). However, because the latter category of peptides was eluted under non-gradient conditions, their retention times could not be predicted by the model. Therefore, all peptides with retention times less than $(t_0 + t_s) \approx 10.0$ min for our system (32 out of 378) were excluded from the analysis.

An Initial Model Based on the Summation of Retention Coefficients of Individual Amino Acids; Its Evaluation for Vydac 218 TP C18 Sorbent. The model proposed by Guo et al. (40) was chosen as a starting point for our optimization. This model is based on the measurement of actual retention times of a collection of synthetic peptides, which in our opinion represents a more general approach than models based solely on calculated regression analyses. These authors also claimed applicability of the model to different RP sorbents with 300 Å pore size, the same as the Vydac 218 TP C18 used in our studies. Accordingly, we first calculated the overall hydrophobicity of each peptide as the sum of the retention coefficients for the individual amino acids, $H = \sum R_c$, using the $R_c$ values determined by Guo et al. (40). We then optimized separately these individual retention coefficients $R_c$ for all 20 amino acid residues, choosing as the optimization criterion the R² value for the plot of retention time versus hydrophobicity for our own measurements on the 346 peptides on the Vydac 218 TP C18 sorbent. The values of the retention coefficients optimized in this way (Table II) were found to be very similar to those determined by Guo et al. (40), as expected. However, our R² value (0.87) was somewhat greater than theirs (0.81), probably because all the peptides in our dataset contained N-terminal amino and C-terminal carboxyl groups, whereas their model was created from a limited number of synthesized N-terminal acetylated and C-terminal amide peptides.

Although we believe that retention coefficients of individual amino acids depend on the type of sorbent used, we also made a comparison with other models as they applied to a subset of our peptides. In this case, all peptides containing Cys residues were excluded because we used alkylation with iodoacetamide to protect reduced cysteines, leaving only 225 peptides. Here only models proposed for TFA/water/ACN were compared to minimize the influence of the ion-pairing agent (Table II).

TABLE II
Comparison of retention coefficients for individual amino acid residues from different models and their applicability for the set of 225 peptides used in present study

              Retention coefficient (Rc)
Amino acid    Present study        Guo et al.   Browne et al.   ANN weight
              Rc       (RcNt)      (40)         (37)            (Petritis et al. (33))
Trp           11.0     (–4.0)      8.8          16.3            2.27
Phe           10.5     (–7.0)      8.1          19.2            3.37
Leu           9.6      (–9.0)      8.1          20              6.12
Ile           8.4      (–8.0)      7.4          6.6             2.37
Met           5.8      (–5.5)      5.5          5.6             1.63
Val           5.0      (–5.5)      5.0          3.5             1.63
Tyr           4.0      (–3.0)      4.5          5.6             0.72
Ala           0.8      (–1.5)      2.0          7.3             0.71
Thr           0.4      (5.0)       0.6          0.8             0.18
Pro           0.2      (4.0)       2.0          5.1             0.48
Glu           0.0      (7.0)       1.1          –7.1            0.56
Asp           –0.5     (9.0)       0.2          –2.9            0.18
Cys*          –0.8*    (4.0)       2.6          –9.2            0.32
Ser           –0.8     (5.0)       –0.2         –4.1            –0.35
Gln           –0.9     (1.0)       0.0          –0.3            –0.3
Gly           –0.9     (5.0)       –0.2         –1.2            –0.21
Asn           –1.2     (5.0)       –0.6         –5.7            –0.29
Arg           –1.3     (8.0)       –0.6         –3.6            –0.24
His           –1.3     (4.0)       –2.1         –2.1            –0.59
Lys           –1.9     (4.6)       –2.1         –3.7            –0.55
R²            0.87     (0.91)      0.81         0.68            0.77

* Note that the retention coefficient for Cys in the present work corresponds to carboxamidomethylated cysteine. Therefore, only 225 peptides were used for comparison of the R² values calculated according to different models.

Factors Causing Deviations from the Initial Model. We then inspected the cases that showed deviations of more than 1 min in the measured retention times from the values predicted by the initial model. This led to the following observations:

1. Positive deviations (i.e. cases where the peptide eluted later than predicted) were characteristic of peptides carrying hydrophilic amino acid residues at the N terminus.
2. Reverse (i.e. negative) shifts were mostly found for peptides having hydrophobic residues at the N terminus.
3. The largest positive shifts were observed with peptides featuring moderate or high hydrophobicity and acidic amino acid residues (Asp, Glu) near the N terminus.
4. The retention time predictions were most accurate for proteolytic fragments of 10–20 amino acids, whereas very small and very large ones exhibited negative deviations.
5. Very hydrophobic peptides, independent of their size, eluted from the column earlier than predicted.

FIG. 3. Peptide retention in RP ion pair HPLC. a, "screening" effect of ion-pairing agent on interaction of N-terminal amino acid residues with C18 phase. b and c, the influence of acidic amino acid residues on chromatographic retention.

Favored models for the ion pair separation mechanism involve either formation of ion pairs in the mobile phase followed by retention of ion associates on the RP column, or a dynamic ion-exchange event in which the ion-pairing reagent is first adsorbed on the hydrophobic surface of the sorbent, followed by solute molecule participation in an ion-exchange reaction. We used the first model to explain some of our findings. Anionic counterions of TFA interact with protonated basic residues of peptides, forming ion pairs (Fig. 3a). All proteolytic peptides carry positively charged N-terminal amino groups under the pH 2.0 conditions used in our separations. Therefore, amino acids at the N terminus of the peptide will be partially shielded from the interaction with the hydrophobic surface of the sorbent. This causes a negative shift in retention time for peptides with a hydrophobic N terminus and a positive deviation for those with a hydrophilic one. To compensate for such an influence, a second set of retention coefficients ($R_{cNt}$) was introduced along with weight coefficients reflecting the influence of distance from the N terminus.

Initial values of $R_{cNt}$ for optimization were calculated by subtracting the retention coefficients of individual amino acids from the average value of $R_c$ of all 20 residues:

$$R^{X}_{cNt} = \frac{\sum R_c}{20} - R^{X}_{c}$$

For a first estimate, weight coefficients were arbitrarily taken as 0.5, 0.3, and 0.1 for the first, second, and third amino acid from the N terminus, respectively, and the hydrophobicities of the peptides were calculated:

$$H = \sum R_c + 0.5R^{1}_{cNt} + 0.3R^{2}_{cNt} + 0.1R^{3}_{cNt}$$
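As an illustration of this first estimate, the sketch below derives starting N-terminal coefficients as the mean of the 20 retention coefficients minus each residue's own coefficient, then applies the arbitrary 0.5, 0.3, and 0.1 weights to the first three residues. It assumes that the starting Rc set is the optimized "present study" column of Table II, which the text does not state explicitly; the function and variable names are hypothetical.

```python
# Rc values from the "present study" column of Table II (one-letter codes).
# "C" stands for carboxamidomethylated cysteine.
RC = {
    "W": 11.0, "F": 10.5, "L": 9.6, "I": 8.4, "M": 5.8,
    "V": 5.0, "Y": 4.0, "A": 0.8, "T": 0.4, "P": 0.2,
    "E": 0.0, "D": -0.5, "C": -0.8, "S": -0.8, "Q": -0.9,
    "G": -0.9, "N": -1.2, "R": -1.3, "H": -1.3, "K": -1.9,
}

# Starting N-terminal coefficient: mean Rc of all 20 residues minus own Rc.
MEAN_RC = sum(RC.values()) / len(RC)
INITIAL_RC_NT = {aa: MEAN_RC - rc for aa, rc in RC.items()}

# Arbitrary first-estimate weights for N-terminal positions 1, 2, and 3.
INITIAL_WEIGHTS = (0.5, 0.3, 0.1)


def initial_hydrophobicity(sequence):
    """Sum of Rc plus weighted starting N-terminal terms (before optimization)."""
    seq = sequence.upper()
    total = sum(RC[aa] for aa in seq)
    # zip stops at the shorter input, so peptides under 3 residues still work.
    total += sum(w * INITIAL_RC_NT[aa] for w, aa in zip(INITIAL_WEIGHTS, seq))
    return total
```

With these inputs the mean Rc is 2.305, so a hydrophobic residue such as Trp starts with a negative N-terminal coefficient (2.305 – 11.0) and a hydrophilic one such as Lys with a positive one (2.305 + 1.9), in line with observations 1 and 2.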

The individual N-terminal retention coefficients and then the weight coefficients were optimized separately to provide a maximal R² value. The optimal weight coefficients (0.42, 0.22, and 0.05, respectively) were found to be very close to those initially chosen prior to the optimization. The values of the N-terminal retention coefficients for each amino acid are shown in Table II. As expected, negative values were found for hydrophobic amino acids and positive ones for hydrophilic residues. Unusually high $R_{cNt}$ values were found for Asp and Glu residues of moderate hydrophobicity (Table II). This is a consequence of the acidic nature of these amino acids, which influences the basic properties of the N-terminal amino groups. This effect decreases ion pair formation and changes the chromatographic behavior of the solute in the RP separation system. A smaller ion pair (compare Fig. 3, b and c) is able to come closer to the C18 surface and to expose its hydrophobic residues for interaction. A similar effect causes an increase in retention time for peptides that are N-terminally blocked by acetylation (35, 37, 40). Thus, ion pair formation affects the hydrophobicity of amino acid residues at the N terminus and, conversely, those amino acids may alter retention by changing the ion-pairing ability of a peptide. Introduction of the second set of retention coefficients increased the R² value from 0.87 to 0.91.

We believe that the set of 346 peptides provides sufficient statistics for correct assignment of the retention coefficients $R_c$, as even the least abundant residues (Trp and Met) were each represented 55 times. Even in the first three N-terminal positions, there were 1,038 residues represented in our analysis: 62 Gly, 87 Ala, 50 Ser, 56 Pro, 83 Val, 62 Thr, 58 Cys, 48 Ile, 92 Leu, 46 Asn, 68 Asp, 43 Gln, 33 Lys, 69 Glu, 15 Met, 36 His, 45 Phe, 14 Arg, 55 Tyr, and 16 Trp.
It would be useful to increase the amount of data in order to calculate a more accurate estimate of RcNt for Met, Arg, and Trp. Nevertheless, the 16 tryptophan cases observed were enough to reflect the influence of the amino acid size. As shown in Table II, Leu and Ile had the lowest RcNt values, whereas Trp and Phe exhibited the highest retention coefficients Rc. The latter deviation is the result of the larger size and the consequent smaller “screening effect” of TFA counterions for the Phe and especially the Trp residue.

In addition, the retention time of a peptide is partially dependent on its polypeptide chain length. Mant et al. (42) found that the accuracy of their peptide retention time prediction decreases significantly beyond about 20 residues. Guo et al. (40) demonstrated a linear relationship between predicted (τ) minus observed (tR,obs) retention time and the product of peptide hydrophobicity (ΣRc) and the logarithm of the number of residues. Using this approach, retention times of peptides up to 50 residues in length can be predicted accurately.

FIG. 4. Retention time versus hydrophobicity plots for chromatographic separations in different conditions. a, 17-protein digest mixture with 1.32% ACN per minute gradient; b, 17-protein digest mixture with 0.66% ACN per minute gradient; c, separation of the 17-protein digest mixture on narrow-bore column with 1.32% ACN per minute gradient; d, separation of tryptic peptides from deglycosylated human α5β1 integrin (see Table III and “Experimental Procedures”).

For our set of peptides, we introduce a correction coefficient (KL) reflecting the influence of peptide length (N) and another correction factor based on the hydrophobicity of the peptide:

$$K_L = \begin{cases} 1 - 0.027\,(10 - N), & N < 10 \\ 1 - 0.014\,(N - 20), & N > 20 \\ 1, & \text{otherwise} \end{cases}$$

$$H_{\mathrm{final}} = \begin{cases} H, & H < 38 \\ H - 0.3\,(H - 38), & H \ge 38 \end{cases}$$

Therefore, hydrophobicity was calculated as:

$$H = K_L\left(\Sigma R_c + 0.42\,R^{1}_{cNt} + 0.22\,R^{2}_{cNt} + 0.05\,R^{3}_{cNt}\right) \quad \text{if } H < 38$$

and

$$H = K_L\left(\Sigma R_c + 0.42\,R^{1}_{cNt} + 0.22\,R^{2}_{cNt} + 0.05\,R^{3}_{cNt}\right) - 0.3\left[K_L\left(\Sigma R_c + 0.42\,R^{1}_{cNt} + 0.22\,R^{2}_{cNt} + 0.05\,R^{3}_{cNt}\right) - 38\right] \quad \text{if } H \ge 38$$

Using this formula, an R2 value of 0.939 was obtained for the set of 346 peptides chosen for optimization (Fig. 4a).
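The hydrophobicity calculation above can be sketched in a few lines of Python. This is only an illustration of the published formula, not the authors' SSRCalc program. The function names are hypothetical, and the coefficient dictionaries `TOY_RC` and `TOY_RC_NT` hold made-up placeholder values, because the real Rc and RcNt values are listed in Table II, which is not reproduced in this section.

```python
# Placeholder coefficients for three residues only. These numbers are
# invented for illustration; real values must be taken from Table II.
TOY_RC = {"L": 10.0, "A": 1.0, "K": -2.0}
TOY_RC_NT = {"L": -2.0, "A": -0.5, "K": 0.5}

# Weights for the first, second, and third N-terminal positions (from the text).
NT_WEIGHTS = (0.42, 0.22, 0.05)


def length_correction(n):
    """Return K_L for a peptide of n residues."""
    if n < 10:
        return 1 - 0.027 * (10 - n)
    if n > 20:
        return 1 - 0.014 * (n - 20)
    return 1.0


def hydrophobicity(sequence, rc, rc_nt):
    """Return the final hydrophobicity H of a peptide sequence."""
    total = sum(rc[aa] for aa in sequence)
    # zip() stops after three residues (or earlier for very short peptides).
    total += sum(w * rc_nt[aa] for w, aa in zip(NT_WEIGHTS, sequence))
    h = length_correction(len(sequence)) * total
    # Compress the scale for very hydrophobic peptides.
    if h >= 38:
        h -= 0.3 * (h - 38)
    return h


if __name__ == "__main__":
    print(round(hydrophobicity("LAK", TOY_RC, TOY_RC_NT), 3))
```

The length correction is applied before the threshold test at 38, which follows the order in which the two corrections appear in the formula.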


After model optimization, we were able to confirm the significance of our initial conclusion about the influence of N-terminal chemistry on peptide retention. Peptides having two hydrophobic amino acids (W, F, L, I, Y, M, V) at the N terminus were taken as an example. We found that 49 out of 53 peptides with a hydrophobic N terminus showed negative deviations from the predicted retention times if we excluded the N-terminal correction from the final formula. Such a dominant effect supports our initial finding.

Applicability of the Model to Different Chromatographic Conditions and for Analysis of Real Samples—The model was tested for chromatographic columns of several sizes using different gradients and flow rates. Parameters of the linear equation tR = a + bH for all separations, with R2 values, are given in Table III. The 17-protein digest mixture was separated again on the same column but with a 50% shallower ACN gradient (0.66% per minute), and 80 fractions were collected. All of the peptides identified in both HPLC runs were used to plot the relationship between retention time and hydrophobicity (Fig. 4b, Table III). The slope of the graph (0.773) was almost exactly twice the slope found with the 1.32% ACN per min gradient (0.386). A third separation was performed on a narrow-bore column (1-mm × 100-mm version of Vydac 218 TP C18) using a flow rate of 100 µl/min and a 1.32% gradient. The slopes of these curves were nearly the same for both column diameters (0.386 versus 0.387) when identical gradient conditions were used. However, different intercepts were determined because of differences in the gradient delay times (Fig. 4c, Table III). The use of a pre-column for sample injection introduces additional dead volume to the system. Therefore, a small increase in intercept and the same slope were expected and eventually found for the separation using injection via a pre-column (Table III).

TABLE III
Applicability of the model for different chromatographic conditions and samples*

Chromatographic conditions | Sample | tR = a + bx; a, b, (R2)‡
1.32% per min ACN gradient | 17-protein digest | 10.8, 0.386, 0.939
0.66% per min ACN gradient | 17-protein digest | 11.4, 0.773, 0.940
1-mm × 100-mm column, 1.32% per min ACN gradient, 100 µl/min flow rate | 17-protein digest | 14.1, 0.387, 0.928
Sample injection using 300-µm × 50-mm Vydac 218 TP C18, 5-µm pre-column | 17-protein digest | 11.8, 0.386, 0.932
75-µm × 250-mm (Vydac 218 TP C18, 3 µm) column, sample injection using 300-µm × 5-mm C18PM pre-column (LC Packings), 1.05% per minute ACN gradient, 200 nl/min flow rate | ABRF PRG03, tryptic digest of bovine protein disulfide isomerase | 17.1, 0.491, 0.901
1.57% per min ACN gradient | Tryptic digest of deglycosylated human α5β1 integrin | 10.4, 0.325, 0.949
1.35% per min ACN gradient | Tryptic digest of whole human urine (183 peptides from 30 different proteins) | 12.3, 0.374, 0.913
1.32% per min ACN gradient | In-gel tryptic digest of SARS virus spike glycoprotein | 10.8, 0.370, 0.921
150-µm × 150-mm Zorbax 300SB-C18 5-µm (Agilent) column, 1.32% per min ACN gradient | 17-protein digest | 11.7, 0.379, 0.930
1.32% per min ACN gradient, 0.1% formic acid, 0.005% heptafluorobutyric acid | 17-protein digest | 11.5, 0.409, 0.934

* Unless otherwise noted, samples were injected directly onto a 150-µm × 150-mm column (Vydac 218 TP C18, 5 µm; Vydac) and eluted with a linear gradient of 1.32% water-ACN per minute (0.1% TFA) at 4 µl/min flow rate. Note that post-column tubing connections were different for real samples analyzed previously. Therefore, some changes in a values may be observed.
‡ Reproducibility of determination of a and b parameters was estimated by running 10 real separations of peptide mixtures with addition of digest of standard protein (human transferrin). The plots tR = a + bx were obtained for the same set of peptides from human transferrin detected in 10 different HPLC runs. Relative standard deviations were calculated for a and b as 0.2 and 0.003, respectively.

Next, the algorithm was applied to the results of several previously analyzed samples. The ABRF PRG03 sample containing a tryptic digest of bovine protein disulfide isomerase was separated in October 2002 using a nano-flow HPLC setup (20). Significant peak tailing was observed in that run, indicating that more sophisticated schemes of post-column mixing of effluent and matrix solution are required for the nano-flow variant. Different packing materials in the pre- and analytical columns (see “Experimental Procedures”) also contributed to lowering the R2 value to 0.901 (Table III). A tryptic digest of deglycosylated human α5β1 integrin was analyzed in January 2002 during the study of its N-glycosylation patterns. This sample showed a correlation of R2 = 0.949 for tryptic fragments from both the α5 and β1 chains (Fig. 4d, Table III). In contrast, the HPLC-MS analysis of a tryptic digest of whole human urine (December 2002) did not show as strong a correlation (R2 = 0.913), possibly due to the influence of different concentrations of separated peptides on retention times.
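As a worked illustration of the linear relation tR = a + bH, the short Python sketch below uses a and b values copied from Table III. The function names and dictionary keys are hypothetical. The product of slope and gradient steepness is our own arithmetic on the tabulated values for the 150-µm Vydac column; the text itself only states that halving the gradient doubled the slope.

```python
# (a, b, gradient steepness in % ACN per minute), copied from Table III.
TABLE_III = {
    "17-protein digest, 1.32%/min": (10.8, 0.386, 1.32),
    "17-protein digest, 0.66%/min": (11.4, 0.773, 0.66),
    "integrin digest, 1.57%/min": (10.4, 0.325, 1.57),
    "urine digest, 1.35%/min": (12.3, 0.374, 1.35),
}


def predict_retention_time(h, a, b):
    """Return the retention time in minutes for hydrophobicity h."""
    return a + b * h


def slope_times_gradient(b, gradient):
    """Return b multiplied by the gradient steepness (% ACN per unit of H)."""
    return b * gradient


if __name__ == "__main__":
    a, b, _ = TABLE_III["17-protein digest, 1.32%/min"]
    # H = 32.61 is the hydrophobicity listed for EDLIAYLK in Table IV.
    print(round(predict_retention_time(32.61, a, b), 2))
    for name, (_, slope, gradient) in TABLE_III.items():
        print(name, round(slope_times_gradient(slope, gradient), 3))
```

For H = 32.61 the calculation gives about 23.39 min, close to the 23.4 min listed as the predicted value in Table IV. For these four rows the product of slope and gradient steepness stays between 0.50 and 0.52, which is consistent with the slope scaling inversely with gradient steepness.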

As a test of the model, we examined the ability to predict the retention times of 36 peptides derived from a sample of the 139-kDa SARS virus spike glycoprotein (21). Despite the very low peak intensities, which limited the number of peaks that could be identified by MS/MS, an R2 value of 0.921 was observed for the identified peptides (Table III). These results provided additional evidence for the validity of the model.

It was of interest to apply our optimized model for peptide retention time prediction to different sorbents, as the applicability of some models appears to be highly sorbent specific (37, 40, 43). The 17-protein digest mixture was separated on a 150-µm × 150-mm column (Zorbax SB-300 C18, 5 µm; Agilent) under our standard chromatographic conditions. Both sorbents (Vydac and Zorbax) are synthesized based on ultrahigh purity, 300-Å porous-silica microspheres, with similar C18-bonded phases. A slightly lower slope, a higher intercept, and an R2 value of 0.93 for the tR = a + bH plot were observed for the Zorbax SB-300 C18 packing material (Table III).

A vast majority of HPLC-MS separations have been performed using an on-line combination with an ESI interface. It was interesting to apply the model to off-line HPLC-MALDI MS with the ion-pairing modifiers commonly used for ESI, so separation of the 17-protein digest was carried out under the same chromatographic conditions (150-µm × 150-mm Vydac 218 TP C18 5-µm column; linear gradient of 1.32% water-ACN per minute at 4 µl/min flow rate) except with 0.1% formic acid and 0.005% heptafluorobutyric acid in both solvents. The higher slope of the dependence tR = a + bH reflects the more hydrophobic character of the modifier used in this case. However, the 0.934 R2 value obtained clearly indicates the applicability of the model for separation with different ion pair modifiers.

Use of Predicted Peptide Retention Times for Protein Characterization—MS-based studies of protein structure and function often encounter challenges such as localizing and identifying “missing” peptides in a separated protein digest, or differentiating between peptides with identical or almost identical masses from different proteins. MS/MS provides the solution to many of these problems. However, we believe that information about peptide retention times derived from HPLC-MS measurements could become an important additional tool for MS-based identification. Table IV demonstrates a number of advantages of HPLC-MS and of our predictive model in peptide mapping of human apo-transferrin, one component of the 17-protein mixture:

1. A number of peptides having similar m/z values, including two isobaric peptides from apo-transferrin itself, were found in the tryptic digest of the mixture (Table IV). It would not be easy to separate any of these multiples by MS, even in the highest resolution instruments available at present (47), and of course the isobaric peptides cannot be distinguished by mass measurements alone. By contrast, HPLC provides a straightforward method of separating the components, as well as correct identifications based on the predicted chromatographic retentions, including the four peptides in the 1,576 to 1,578 m/z range in Fig. 5.
2. Initial peptide mass mapping using the PepMap (hs1.proteome.ca/prowl/knexus.html) program identified 56 peptides comprising 77% of the human apo-transferrin sequence. Subsequent MS/MS analysis showed that three peptides were false hits (Table IV), but in each case the predicted and observed retention times differed by more than 4 min, so such mistakes could be identified without MS/MS measurements.
3. Only seven predicted human apo-transferrin tryptic fragments in the m/z 560–5000 range remained unidentified by the automatic search program. When predicted retention times based on the peptide sequence were subsequently calculated, five of these were revealed by manual inspection of the spectra in the expected fractions. Only two fragments of detectable size were missed (Table IV), both of which contain N-glycosylated Asn residues (see below).

Therefore, almost 100% protein sequence coverage was found for this 79-kDa protein in the analysis of the 17-component mixture. Such a capability supports the choice of the off-line HPLC-MALDI MS combination for many proteomic tasks.

TABLE IV
The use of retention time prediction for HPLC-MS peptide mapping of human apo-transferrin

Peptide, protein | [MH+]* calculated (Da) | m/z measured (Da) | Hydrophobicity | Predicted τ (min) | Measured tR (min)
Identification of peptides with close masses
SCHTAVGR (468–475)‡ | 887.416 | 887.415 | 4.05 | 12.39 | 10.6
SCHTGLGR (136–143) | 887.416 | 887.418 | 6.79 | 13.45 | 11.8
APNHAVVTR (601–609) | 964.533 | 964.537 | 8.66 | 14.17 | 13
EDLIAYLK (92–99), horse heart cytochrome c | 964.536 | 964.539 | 32.61 | 23.4 | 24
WCALSHHER (363–371) | 1195.543 | 1195.554 | 14.62 | 16.47 | 17
DSGFQMNQLR (123–132) | 1195.553 | 1195.559 | 24.53 | 20.28 | 19
SASDLTWDNLK (454–464) | 1249.606 | 1249.609 | 27.72 | 21.52 | 21
FKDLGEEHFK (35–44), bovine serum albumin | 1249.622 | 1249.619 | 22.62 | 19.55 | 19
DYELLCLDGTR (577–587) | 1354.631 | 1345.637 | 32.67 | 23.42 | 23
NYELLCGDNTR (245–255), bovine apo-transferrin | 1354.606 | 1354.607 | 19.49 | 18.34 | 19
FDEFFSEGCAPGSK (495–508) | 1577.658 | 1577.679 | 25.29 | 20.27 | 21
TAGWNIPMGLLYNK (476–489) | 1577.815 | 1577.810 | 43.4 | 27.56 | 28
VWPHGDYPLIPVGK (301–314), bovine catalase | 1577.848 | 1577.856 | 35.11 | 24.36 | 24.7
LKPDPNTLCDEFK (139–151), bovine serum albumin | 1576.768 | 1576.776 | 21.13 | 18.98 | 20.9
False MS identification
AVGNLRK (677–683) | 757.468 | 757.476 | 7.82 | 13.84 | 40.2
DSGFQMNQLRGK (123–134) | 1380.669 | 1380.686 | 21.73 | 19.21 | 25
IECVSAETTEDCIAK (385–399) | 1725.768 | 1725.792 | 17.78 | 17.68 | 22
Peptides not identified by MS
KPLEK (163–167) | 614.388 | 614.395 | 7.23 | 13.6 | 9
AVANFFSGSCAPCADGTDFPQLCQLCPGCGCSTLNQYFGYSGAFK (168–212) | 4988.098 | 4991.8 average | 50.29 | 30.22 | 29.2
IECVSAETTEDCIAK (385–399) | 1725.768 | 1725.768 | 17.78 | 17.68 | 18
CGLVPVLAENY[N]K (421–433)§ | 1476.752 | Not found | 30.53 | 20.60 | –
KDSSLCK (509–515) | 837.414 | 837.429 | 6.49 | 13.33 | 15
CLVEK (542–546) | 648.339 | 648.341 | 9.8 | 14.6 | 13
QQQHLFGS[N]VTDCSGNFCLFR (622–642) | 2515.125 | Not found | 41.46 | 26.81 | –

* Monoisotopic masses.
‡ Peptide sequence corresponding to transferrin (Homo sapiens, gi4557871), unless otherwise noted.
§ [N], N-glycosylated Asn residues.

FIG. 5. MALDI MS and HPLC-MALDI MS detection of peptides in the 17-protein digest mixture in the 1,573- to 1,583-Da mass range. The MALDI MS spectra of an unseparated mixture (a) and selected fractions from a 1.32% ACN per minute gradient (b–d). Four peptides from the 17-protein digest mixture were found within a 1,573- to 1,583-Da mass range. The mass spectrum of the mixture (a) is too complex to correctly assign masses of individual species. HPLC fractionation simplifies spectra (b–d) and improves signal-to-noise ratio, making it possible to correctly identify all four peptides by their m/z (for peptide identification see Table IV).

Definition of Post-translational Modifications—The retention time prediction algorithm can also be used effectively for the study of protein post-translational modifications. The small hydrophilic –PO3H2 group has been shown to have only a minor effect on the chromatographic retention of peptides (37), consistent with our own observations. Thus most phosphorylated peptides should be found in the same fractions as their non-modified analogs when using a 1.32% per min ACN gradient and 1-min fractions. This facilitated identification of the Ser268 phosphorylation site on bovine disulfide isomerase in the ABRF PRG03 sample (20). In this case, we found a non-phosphorylated fragment containing the potentially phosphorylated Ser268 and then manually reanalyzed the spectrum of the same fraction to find the +79.96-Da ion.

Retention-specific identification of glycosylated peptides is a much more ambiguous procedure. According to our unpublished data, glycosylated peptides of moderate hydrophobicity usually elute 2 min earlier than their non-glycosylated counterparts under 1.32% ACN per minute gradient conditions. A similar result was obtained by Browne et al. (37). The unmodified human apo-transferrin peptide QQQHLFGS[N]VTDCSGNFCLFR (622–642) that remained unidentified during the peptide mapping procedure (Table IV) was predicted to be found in fractions 26 and 27, with its glycosylated analog in fractions 24 and 25.
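The two uses of predicted retention times described in this section, rejecting false mass matches and locating modified peptides, can be sketched as follows. This is a minimal illustration, not the authors' software. The predicted and measured times are copied from Table IV. The 4-min tolerance and the 2-min glycosylation shift are the figures quoted in the text, the zero shift for phosphorylation is a simplification of the "minor effect" described there, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    peptide: str
    predicted_min: float
    measured_min: float


# Retention shifts in minutes relative to the unmodified peptide
# (1.32% ACN per minute gradient); simplified from the text.
SHIFT_MIN = {"none": 0.0, "phospho": 0.0, "glyco": -2.0}


def is_suspect(candidate, tolerance_min=4.0):
    """Flag a mass match whose retention time is far from the prediction."""
    return abs(candidate.predicted_min - candidate.measured_min) > tolerance_min


def expected_time(predicted_min, modification="none"):
    """Return the expected retention time of a modified analog."""
    return predicted_min + SHIFT_MIN[modification]


# Rows copied from Table IV: three confirmed peptides, then the three false hits.
CANDIDATES = [
    Candidate("SCHTAVGR", 12.39, 10.6),
    Candidate("TAGWNIPMGLLYNK", 27.56, 28.0),
    Candidate("LKPDPNTLCDEFK", 18.98, 20.9),
    Candidate("AVGNLRK", 13.84, 40.2),
    Candidate("DSGFQMNQLRGK", 19.21, 25.0),
    Candidate("IECVSAETTEDCIAK", 17.68, 22.0),
]

if __name__ == "__main__":
    print([c.peptide for c in CANDIDATES if is_suspect(c)])
    print(round(expected_time(26.81, "glyco"), 2))
```

The tolerance is not a sharp boundary. KPLEK, which was confirmed by manual inspection, is listed in Table IV with a predicted time of 13.6 min and a measured time of 9 min, so a fixed 4-min cutoff would also flag it.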
Manual inspection of the spectra from fractions 24 and 25 then indicated the presence of characteristic triplets of peaks separated by ~291.1 Da (Fig. 6), the mass of a sialic acid residue. This is still not sufficient for confident identification of these peptides, because more than 30 glycosylated fragments were expected in the 17-protein tryptic digest mixture.

FIG. 6. Identification of human apo-transferrin glycosylated tryptic peptide QQQHLFGS[N]VTDCSGNFCLFR (622–642). a and b, fractions 24 and 25 from the HPLC-MALDI MS run of the 17-protein digest mixture. Carbohydrate composition was assigned as M3G2N4, M3G2N4SA, and M3G2N4SA2 for 4137.718-, 4428.837-, and 4719.923-Da peaks, respectively. Carbohydrate composition symbols: M, mannose; G, galactose; N, N-acetylglucosamine; SA, sialic acid.

However, in addition to the glycosylation, the peptide QQQHLFGS[N]VTDCSGNFCLFR has a further specific sequence feature that can be distinguished by its chromatographic behavior. N-terminal Gln and carboxamidomethylated Cys residues undergo a cyclization reaction yielding pyro-glutamic acid and 5-oxo-thiomorpholine-3-carboxylic acid, respectively (48, 49). Products of both reactions exhibit a –17.026-Da mass shift and elute later from RP columns. The degree of degradation after a 12-h digestion was ~51 and ~34% for Cys and Gln, respectively (49). Fraction 25 in Fig. 6b contains more of the –17-Da product than fraction 24. Based on the splitting ratios of peak intensities in fractions 24 and 25, the degree of conversion can be estimated at ~25%. Therefore, the chromatographic data suggested that the glycosylated peptides in Fig. 6 most likely have a Gln residue at the N terminus. The residual carbohydrate mass (1622.593 Da) was calculated by subtracting the calculated mass of the (622–642) human apo-transferrin tryptic fragment (2515.125 Da) from the mass of the 4137.718-Da glycosylated peptide shown in Fig. 6a, and the composition of the N-linked oligosaccharide was determined to be (Hex)2(HexNAc)2(Man)3(GlcNAc)2 (1622.582 Da calculated mass) (us.expasy.org/tools/glycomod/). Identity of the peptides in Fig. 6 was confirmed later by MS/MS measurements.
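The mass arithmetic behind this assignment can be checked with a short Python sketch. The peak masses and the peptide mass are taken from the text and Fig. 6. The monoisotopic residue masses for hexose, N-acetylhexosamine, and N-acetylneuraminic acid are standard values supplied here as an assumption, and the variable and function names are hypothetical.

```python
# Standard monoisotopic residue masses in Da (assumed, not from the paper).
HEXOSE = 162.0528   # C6H10O5, covers Hex, Man, and Gal residues
HEXNAC = 203.0794   # C8H13NO5, covers HexNAc and GlcNAc residues
NEUAC = 291.0954    # C11H17NO8, sialic acid residue

PEPTIDE_MH = 2515.125                   # calculated [MH+] of fragment 622-642
PEAKS = [4137.718, 4428.837, 4719.923]  # glycopeptide peaks from Fig. 6


def residual_mass(glycopeptide_mh, peptide_mh):
    """Return the mass left after subtracting the bare peptide."""
    return glycopeptide_mh - peptide_mh


def glycan_mass(n_hexose, n_hexnac, n_neuac=0):
    """Return the summed residue mass of a glycan composition."""
    return n_hexose * HEXOSE + n_hexnac * HEXNAC + n_neuac * NEUAC


def peak_spacings(peaks):
    """Return the differences between consecutive peak masses."""
    return [b - a for a, b in zip(peaks, peaks[1:])]


if __name__ == "__main__":
    print(round(residual_mass(PEAKS[0], PEPTIDE_MH), 3))
    # (Hex)2(HexNAc)2(Man)3(GlcNAc)2 is 5 hexose and 4 HexNAc residues.
    print(round(glycan_mass(5, 4), 3))
    print([round(s, 3) for s in peak_spacings(PEAKS)])
```

With these values the residual is 1622.593 Da, the composition sums to about 1622.582 Da, and the two peak spacings (291.119 and 291.086 Da) both lie within a few hundredths of a dalton of the sialic acid residue mass.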


CONCLUSIONS

We have developed an improved model for prediction of retention times of tryptic peptides in ion pair RP µHPLC. The model was developed from a dataset of 346 peptides identified in a single HPLC-MALDI MS run. R2 values of 0.93–0.94 were obtained for separations performed under several different chromatographic conditions (e.g. column size, different sorbents, and ion pair modifiers), indicating the general applicability of the approach to RP peptide separation. The ability to predict peptide retention times can assist detailed peptide mapping significantly, and thus increase confidence in peptide identification and protein characterization. The program that calculates peptide hydrophobicities using the algorithm described along with a set of 346 peptides used for its development are available at the Manitoba Centre for Proteomics web site (hs2.proteome.ca/SSRCalc/SSRCalc.html). We plan to extend this model to include a larger variety of stationary phases and ion pair modifiers, and to complete our library of retention coefficients to include amino acid modifications that may occur during sample preparation.


Acknowledgments—We thank James McNabb and John Cortens for invaluable technical support.

* This work was supported by grants from the Canadian Institutes of Health Research, from the Natural Sciences and Engineering Research Council of Canada, and from the U.S. National Institutes of Health (GM 59240). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

¶ To whom correspondence should be addressed: Department of Physics and Astronomy, University of Manitoba, 506 Allen Building, Winnipeg, MB, R3T 2N2, Canada. Tel.: 204-474-6184; Fax: 204-474-7622; E-mail: [email protected].


REFERENCES

1. Mann, M., Hendrickson, R. C., and Pandey, A. (2001) Analysis of proteins and proteomes by mass spectrometry. Annu. Rev. Biochem. 70, 437–473
2. Karas, M., and Hillenkamp, F. (1988) Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Anal. Chem. 60, 2299–2301
3. Hillenkamp, F., Karas, M., Beavis, R. C., and Chait, B. T. (1991) Matrix-assisted laser desorption/ionization mass spectrometry of biopolymers. Anal. Chem. 63, 1193A–1203A
4. Fenn, J. B., Mann, M., Meng, C. K., Wong, S. F., and Whitehouse, C. M. (1989) Electrospray ionization for mass spectrometry of large biomolecules. Science 246, 64–71
5. Blonder, J., Goshe, M. B., Moore, R. J., Pasa-Tolic, L., Masselon, C. D., Lipton, M. S., and Smith, R. D. (2002) Enrichment of integral membrane proteins for proteomic analysis using liquid chromatography-tandem mass spectrometry. J. Proteome Res. 1, 351–360
6. Verma, R., Chen, S., Feldman, R., Schieltz, D., Yates, J., Dohmen, J., and Deshaies, R. J. (2000) Proteasomal proteomics: Identification of nucleotide-sensitive proteasome-interacting proteins by mass spectrometric analysis of affinity-purified proteasomes. Mol. Biol. Cell 11, 3425–3439
7. Opiteck, G. J., Ramirez, S. M., Jorgenson, J. W., and Moseley, M. A. I. (1998) Comprehensive two-dimensional high-performance liquid chromatography for the isolation of overexpressed proteins and proteome mapping. Anal. Biochem. 258, 349–361
8. McCormack, A. L., Schieltz, D. M., Goode, B., Yang, S., Barnes, G., Drubin, D., and Yates, J. R., III (1997) Direct analysis and identification of proteins in mixtures by LC/MS/MS and database searching at the low-femtomole level. Anal. Chem. 69, 767–776
9. Link, A. J., Eng, J., Schieltz, D. M., Carmack, E., Mize, G. J., Morris, D. R., Garvik, B. M., and Yates, J. R., III (1999) Direct analysis of protein complexes using mass spectrometry. Nat. Biotechnol. 17, 676–682
10. Aebersold, R., and Goodlett, D. R. (2001) Mass spectrometry in proteomics. Chem. Rev. 101, 269–295
11. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999
12. Yao, X., Freas, A., Ramirez, J., Demirev, P. A., and Fenselau, C. (2001) Proteolytic 18O labeling for comparative proteomics: Model studies with two serotypes of adenovirus. Anal. Chem. 73, 2836–2842
13. Aebersold, R. (2003) A mass spectrometric journey into protein and proteome research. J. Am. Soc. Mass Spectrom. 14, 685–695
14. Johnson, T., Bergquist, J., Ekman, R., Nordhoff, E., Schuerenberg, M., Kloeppel, K.-D., Mueller, M., Lehrach, H., and Gobom, J. (2001) A CE-MALDI interface based on the use of prestructured sample supports. Anal. Chem. 73, 1670–1675
15. Loboda, A. V., Krutchinsky, A. N., Bromirski, M., Ens, W., and Standing, K. G. (2000) A tandem quadrupole/time-of-flight mass spectrometer with a matrix-assisted laser desorption/ionization source: Design and performance. Rapid Commun. Mass Spectrom. 14, 1047–1057
16. Shevchenko, A., Loboda, A., Shevchenko, A., Ens, W., and Standing, K. G. (2000) MALDI quadrupole time-of-flight mass spectrometry: A powerful tool for proteomic research. Anal. Chem. 72, 2132–2141
17. Medzihradszky, K. F., Campbell, J. M., Baldwin, M. A., Falick, A. M., Juhasz, P., Vestal, M. L., and Burlingame, A. L. (2000) The characteristics of peptide collision-induced dissociation using a high-performance MALDI-TOF/TOF tandem mass spectrometer. Anal. Chem. 72, 552–558
18. Krutchinsky, A. N., Kalkum, M., and Chait, B. T. (2001) Automatic identification of proteins with a MALDI-quadrupole ion trap mass spectrometer. Anal. Chem. 73, 5066–5077
19. O’Connor, P. B., and Costello, C. E. (2001) A high pressure matrix-assisted laser desorption/ionization Fourier transform mass spectrometry ion source for thermal stabilization of labile biomolecules. Rapid Commun. Mass Spectrom. 15, 1862–1868
20. Krokhin, O., Cheng, K., Bykova, N., Ens, W., and Standing, K. G. (2003) An (off-line HPLC)-(orthogonal injection MALDI)-(QqTOFMS) instrument particularly useful for the analysis of post-translational modifications. Examples: I. Identification of two phosphorylated sites in tryptic peptides from the ABRF PRG03 sample (bovine protein disulphide isomerase). II. De novo sequencing and analysis of post-translational modifications on SARS viral proteins. 51st ASMS Conference on Mass Spectrometry and Allied Topics, Montreal, June 8–12
21. Krokhin, O., Li, Y., Andonov, A., Feldmann, H., Flick, R., Jones, S., Stroeher, U., Bastien, N., Dasuri, K. V. N., Cheng, K., Simonsen, J. N., Perreault, H., Wilkins, J., Ens, W., Plummer, F., and Standing, K. G. (2003) Mass spectrometric characterization of proteins from the SARS virus: A preliminary report. Mol. Cell. Proteomics 2, 346–356
22. Preisler, J., Hu, P., Rejtar, T., and Karger, B. L. (2000) Capillary electrophoresis–matrix-assisted laser desorption/ionization time-of-flight mass spectrometry using a vacuum deposition interface. Anal. Chem. 72, 4785–4795
23. Rejtar, T., Hu, P., Juhasz, P., Campbell, J. M., Vestal, M. L., Preisler, J., and Karger, B. L. (2002) Off-line coupling of high-resolution capillary electrophoresis to MALDI-TOF and TOF/TOF MS. J. Proteome Res. 1, 171–179
24. Hu, P., Rejtar, T., Preisler, J., and Karger, B. L. (2001) LC-MALDI/TOF MS analysis of complex peptide mixtures using vacuum deposition interface. 49th ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 27–31
25. Rejtar, T., Chen, H., Moskovets, E., Li, L., Andreev, V., and Karger, B. L. (2003) Universal deposition device for off-line coupling of LC to MALDI MS and MS/MS. 51st ASMS Conference on Mass Spectrometry and Allied Topics, Montreal, June 8–12
26. Walker, K. L., Chiu, R. W., Monnig, C. A., and Wilkins, C. L. (1995) Off-line coupling of capillary electrophoresis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Anal. Chem. 67, 4197–4204
27. Hsieh, S., Dreisewerd, K., van der Schors, R. C., Jimenez, C. R., Stahl-Zeng, J., Hillenkamp, F., Jorgenson, J. W., Geraerts, W. P. M., and Li, K. W. (1998) Separation and identification of peptides in single neurons by microcolumn liquid chromatography-matrix-assisted laser desorption/ionization time-of-flight mass spectrometry and postsource decay analysis. Anal. Chem. 70, 1847–1852
28. Griffin, T. J., Gygi, S. P., Rist, B., Aebersold, R., Loboda, A., Jilkine, A., Ens, W., and Standing, K. G. (2001) Quantitative proteomic analysis using a MALDI quadrupole time-of-flight mass spectrometer. Anal. Chem. 73, 978–986
29. Lou, X. W., and van Dongen, J. L. (2000) Direct sample fraction deposition using electrospray in narrow-bore size-exclusion chromatography/matrix-assisted laser desorption/ionization time-of-flight mass spectrometry for polymer characterization. J. Mass Spectrom. 35, 1308–1312
30. Miliotis, T., Kjellstrom, S., Nilsson, J., Laurell, T., Edholm, L. E., and Marko-Varga, G. (2000) Capillary liquid chromatography interfaced to matrix-assisted laser desorption/ionization time-of-flight mass spectrometry using an on-line coupled piezoelectric flow-through microdispenser. J. Mass Spectrom. 35, 369–377
31. Fitchett, J. R., Brock, A., Horn, D. M., Ericson, C., Peters, E. C., Phong, Q., and Shaw, C. M. (2001) A novel low dead volume non-contact deposition device for capillary HPLC/MALDI mass spectrometry. 49th ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 27–31
32. Krokhin, O., Qian, Y., McNabb, J. R., Spicer, V., Ens, W., and Standing, K. G. (2002) An off-line interface for HPLC and orthogonal MALDI TOF. 50th ASMS Conference on Mass Spectrometry and Allied Topics, Orlando, FL, June 2–6
33. Petritis, K., Kangas, L. J., Ferguson, P. L., Anderson, G. A., Pasa-Tolic, L., Lipton, M. S., Auberry, K. J., Strittmatter, E. F., Shen, Y., Zhao, R., and Smith, R. D. (2003) Use of artificial neural networks for the accurate prediction of peptide liquid chromatography elution times in proteome analyses. Anal. Chem. 75, 1039–1048
34. Mant, C. T., and Hodges, R. S. (2002) Analytical HPLC of peptides, in HPLC of Biological Macromolecules (Gooding, K. M., and Regnier, F. E., eds) pp. 433–511, Marcel Dekker, New York
35. Meek, J. L. (1980) Prediction of peptide retention times in high-pressure liquid chromatography on the basis of amino acid composition. Proc. Natl. Acad. Sci. U. S. A. 77, 1632–1636
36. Meek, J. L., and Rossetti, Z. L. (1981) Factors affecting retention and resolution of peptides in high-performance liquid chromatography. J. Chromatogr. 211, 15–28
37. Browne, C. A., Bennett, H. P. J., and Solomon, S. (1982) The isolation of peptides by high-performance liquid chromatography using predicted elution positions. Anal. Biochem. 124, 201–208
38. Sasagawa, T., Okuyama, T., and Teller, D. C. (1982) Prediction of peptide retention times in reversed-phase high-performance liquid chromatography during linear gradient elution. J. Chromatogr. 240, 329–340
39. Sakamoto, Y., Kawakami, N., and Sasagawa, T. (1988) Prediction of peptide retention times. J. Chromatogr. 442, 69–79
40. Guo, D., Mant, C. T., Taneja, A. K., Parker, J. M. R., and Hodges, R. S. (1986) Prediction of peptide retention times in reversed-phase high-performance liquid chromatography I. Determination of retention coefficients of amino acid residues of model synthetic peptides. J. Chromatogr. 359, 499–517
41. Guo, D., Mant, C. T., Taneja, A. K., and Hodges, R. S. (1986) Prediction of peptide retention times in reversed-phase high-performance liquid chromatography II. Correlation of observed and predicted peptide retention times and factors influencing the retention times of peptides. J. Chromatogr. 359, 518–532
42. Mant, C. T., Burke, T. W. L., Black, J. A., and Hodges, R. S. (1988) Effect of peptide chain length on peptide retention behaviour in reversed-phase chromatography. J. Chromatogr. 458, 193–205
43. Palmblad, M., Ramström, M., Markides, K. E., Håkansson, P., and Bergquist, J. (2002) Prediction of chromatographic retention and protein identification in liquid chromatography/mass spectrometry. Anal. Chem. 74, 5826–5830
44. Wilkins, J. A., Li, A., Ni, H., Stupack, D. G., and Shen, C. (1996) Control of beta1 integrin function. Localization of stimulatory epitopes. J. Biol. Chem. 271, 3046–3051
45. Shevchenko, A., Chernushevich, I., Ens, W., Standing, K. G., Thomson, B., Wilm, M., and Mann, M. (1997) Rapid de novo peptide sequencing by a combination of nanoelectrospray, isotopic labeling and quadrupole/time-of-flight mass spectrometer. Rapid Commun. Mass Spectrom. 11, 1015–1024
46. Shen, Y., Tolic, N., Zhao, R., Pasa-Tolic, L., Li, L., Berger, S. J., Harkewicz, R., Anderson, G. A., Belov, M. E., and Smith, R. D. (2001) High-throughput proteomics using high-efficiency multiple-capillary liquid chromatography with on-line high-performance ESI FTICR mass spectrometry. Anal. Chem. 73, 3011–3021
47. He, F., Hendrickson, C. L., and Marshall, A. G. (2001) Baseline mass resolution of peptide isobars: A record for molecular mass resolution. Anal. Chem. 73, 647–650
48. Geoghegan, K. F., Hoth, L. R., Tan, D. H., Borzilleri, K. A., Withka, J. M., and Boyd, J. D. (2002) Cyclization of N-terminal S-carbamoylmethylcysteine causing loss of 17 Da from peptides and extra peaks in peptide maps. J. Proteome Res. 1, 181–187
49. Krokhin, O., Ens, W., and Standing, K. G. (2003) Characterizing degradation products of peptides containing N-terminal Cys residues by (off-line high-performance liquid chromatography)/matrix-assisted laser desorption/ionization quadrupole time-of-flight measurements. Rapid Commun. Mass Spectrom. 17, 2528–2534
