Journal of the Korean Physical Society, Vol. 46, No. 3, March 2005, pp. 625∼630
Identification of the Protein Native Structure by Using a Sequence-Dependent Feature in Contact Maps
Jaewoon Jung∗ and Hie-Tae Moon
Department of Physics, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701
Jooyoung Lee†
School of Computational Sciences, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722
(Received 12 October 2004)
We present a new approach for fold recognition to identify the native and the near-native protein structures among decoy structures by using pair-wise contact potentials between amino acid residues. For a given protein structure, a new scoring function is defined as the difference between the contact energy for its native sequence and the average contact energy for random sequences of the same contact map. We have tested the new scoring function for the various decoy sets available in the literature and have found that the new scoring function is more useful than the original contact energy, especially for decoy sets where the total number of contacts from the native structure is similar to those from the decoy conformations. From this observation, we conclude that the more native-like the structure is, the more likely that it distinguishes the native sequence from random sequences. We demonstrate that, for a given contact potential, a simple, but more efficient, new scoring function can be constructed.
PACS numbers: 87.14.Ee, 87.15.By, 87.15.Cc
Keywords: Protein folding, Native structure, Decoy structure
I. INTRODUCTION
Determination of the three-dimensional structure of a protein from its amino-acid sequence is a major unsolved problem in structural biology [2,3]. One way to achieve this goal is to develop a scoring function that can distinguish the native structure of a protein from a large number of decoy conformations. For this reason, many scoring functions, including empirical contact energies, have been investigated [4–15,18–21]. Traditional empirical contact energies are typically based on a residue-residue contact function obtained by using a quasi-chemical approximation [4–7]. The basic idea of this method is to investigate pairing frequencies between two amino acids, observed from various native structures in the Protein Data Bank, normalized against those expected from random pairing. Other contact energy functions are obtained by optimizing interaction potentials so that the energy of the native structure becomes lower than the energy of competing decoy structures [9] and/or by maximizing the energy gap between the native state and the decoy states normalized by the energy variance of decoy states [14]. Most of these energy functions are knowledge-based potentials, and indications are that they are correlated with atomic potentials [16]. These previous efforts focused on developing pairwise contact potentials that are able to discriminate the native states of a set of proteins from many structural decoys.
In this work, we propose a different kind of approach where we adopt the pair-wise contact potential developed by Miyazawa and Jernigan (MJ) [6]. The MJ contact potential has been quite successful [6, 17]. The new scoring function is defined as the difference between the original energy function and the average energy obtained from random sequences. In the present work, we examine the performance of the new scoring function in fold recognition for various decoy sets. In particular, we investigate whether the new scoring function is more useful than the original MJ contact energy in identifying native structures from many decoys. We find that the correlation between the new scoring function and the RMSD (root mean square deviation) measured from the native structure is more significant than the correlation between the original contact energy and the RMSD.
∗ Current address: Department of Chemistry, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701
† Corresponding author; E-mail: [email protected]
II. METHOD
1. Structure Comparison
To compare two structures, the RMSD (root mean square deviation) is used. The RMSD between two structures a and b is defined as
$$\mathrm{RMSD} = \sqrt{\frac{\sum_{1 \le i \le N} |\mathbf{r}_{ai} - \mathbf{r}_{bi}|^{2}}{N}}, \qquad (1)$$
where structures a and b are superimposed so that the value of the RMSD becomes minimum, $\mathbf{r}$ denotes the coordinates of the alpha-carbon atoms, and N is the number of amino acids.
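As an illustration of Eq. (1), the following minimal sketch computes the RMSD after an optimal rigid-body superposition. The text does not say which superposition method was used; the Kabsch algorithm and the function name `superimposed_rmsd` are assumptions made here for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def superimposed_rmsd(coords_a, coords_b):
    """RMSD of Eq. (1) after optimal superposition (Kabsch algorithm, assumed here).

    coords_a, coords_b: (N, 3) array-likes of alpha-carbon coordinates,
    listed in the same residue order.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("both inputs must have shape (N, 3)")

    # Remove the translation by moving both centroids to the origin.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)

    # The optimal rotation follows from the SVD of the 3x3 covariance matrix.
    u, _, vt = np.linalg.svd(a.T @ b)
    # Flip the last axis if needed so that the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(u @ vt) > 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate structure a onto structure b and apply Eq. (1).
    diff = a @ rotation.T - b
    return float(np.sqrt((diff ** 2).sum() / len(a)))
```

A structure compared with a rotated and translated copy of itself gives an RMSD of zero up to rounding, which is why the native structure sits at RMSD = 0 in plots against a decoy set.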
2. Contact Energy of a Protein
Before considering a new scoring function, we describe the definition of the contact energy of a protein. The contact energy of a protein is defined as
$$E_k = \sum_{i,j} \Delta_{i,j} B(a_i, a_j). \qquad (2)$$
In this equation, $\Delta_{i,j} = 1$ if the amino acids at positions $i$ and $j$ are in contact; $\Delta_{i,j} = 0$ otherwise. A contact between amino acids $i$ and $j$ is defined to exist if their side-chain centroids are within 6.5 Å [5–7]. $B(a_i, a_j)$ is the pair-wise contact energy between amino acids of types $a_i$ and $a_j$.
3. Calculation of New Scoring Functions
We assume that the probability that a protein adopts structure $X$ and sequence $S$ follows the Boltzmann distribution
$$P(X, S) \propto \exp(-\beta E(X, S)). \qquad (3)$$
Then, the probability that a structure $X_k$ and a native sequence $S_N$ are selected together becomes
$$P(X_k, S_N) \propto \exp(-\beta E(X_k, S_N)) = \exp(-\beta (E(X_k, S_N) - \langle E \rangle_k)) \exp(-\beta \langle E \rangle_k), \qquad (4)$$
where $\langle E \rangle_k$ is the average contact energy when the native sequence is replaced by random sequences. It should be noted that the energy contribution $\langle E \rangle_k$ is independent of the sequence $S_N$ and depends only on the structure $X_k$. The probability $P(X_k, S_N)$ is considered to have two parts. One is the sequence- and structure-dependent $E - \langle E \rangle_k$, and the other is the sequence-independent, structure-only-dependent $\langle E \rangle_k$. Since the sequence-independent term $\langle E \rangle_k$ plays the role of estimating only the total number of contacts for the given structure $X_k$, we assume that it is not as important as the sequence- and structure-dependent $E - \langle E \rangle_k$. Finally, we assume that
$$\exp(-\beta (E(X_N, S_N) - \langle E \rangle_N)) > \exp(-\beta (E(X_k, S_N) - \langle E \rangle_k)), \quad k \neq N. \qquad (5)$$
Then, the following inequality is satisfied:
$$E(X_N, S_N) - \langle E \rangle_N < E(X_k, S_N) - \langle E \rangle_k. \qquad (6)$$
From this, the new scoring function is defined as E − ⟨E⟩.
III. RESULTS AND DISCUSSION
Figure 1 shows the relationship between the RMSD measured from the native structure and the original MJ scoring function for the protein 1ctf in the 4-state reduced decoys. Figure 2 corresponds to the results obtained by using the new scoring function. Each data
Fig. 1. Relationship between the original MJ contact energy E and the RMSD. The energy is the sum of pair-wise MJ contact potentials when each structure is mounted on a native sequence. Here, the correlation is 0.34.
Fig. 2. Relationship between E − ⟨E⟩ and the RMSD. ⟨E⟩ is the average of the sum of pair-wise contact potentials calculated from 1000 random sequences. The correlation is 0.58.
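The two quantities plotted in Figs. 1 and 2 can be sketched in a few lines. The code below is only an illustration under stated assumptions: each contacting pair is counted once (i < j, sequence neighbours included), a random sequence is taken to be a shuffle of the native sequence (the text says only "random sequences"), and `TOY_POTENTIAL` is a hypothetical two-letter table, not the MJ potential. All function names are hypothetical.

```python
import math
import random

CONTACT_CUTOFF = 6.5  # Angstrom; side-chain centroid cutoff quoted in the text


def contact_pairs(centroids, cutoff=CONTACT_CUTOFF):
    """Index pairs (i, j), i < j, whose side-chain centroids lie within cutoff."""
    n = len(centroids)
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if math.dist(centroids[i], centroids[j]) < cutoff]


def pair_energy(potential, a, b):
    """Symmetric lookup of B(a, b) in a dict keyed by residue-type pairs."""
    return potential[(a, b)] if (a, b) in potential else potential[(b, a)]


def contact_energy(sequence, pairs, potential):
    """Contact energy E: sum of B(a_i, a_j) over all contacting pairs."""
    return sum(pair_energy(potential, sequence[i], sequence[j]) for i, j in pairs)


def sequence_corrected_score(sequence, pairs, potential, n_random=1000, seed=0):
    """E minus the average E over random sequences on the same contact map."""
    rng = random.Random(seed)
    shuffled = list(sequence)
    total = 0.0
    for _ in range(n_random):
        rng.shuffle(shuffled)  # assumption: random sequence = shuffled native sequence
        total += contact_energy(shuffled, pairs, potential)
    return contact_energy(sequence, pairs, potential) - total / n_random


# Hypothetical toy potential (NOT the MJ values): only H-H contacts are favourable.
TOY_POTENTIAL = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "P"): 0.0}
```

Because the average is taken over sequences placed on the same contact map, it tracks the number of contacts of the structure, so subtracting it leaves the part of the energy that depends on how well the native sequence fits that particular map.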
Table 1. Z-scores calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩.
decoy set | average (a) | comparison (b) | protein: Z-score (a)
4-state reduced | 3.31/3.47 | 3/4 | 1ctf: 3.60/3.40; 1r69: 4.52/4.37; 1sn3: 2.40/3.03; 2cro: 4.05/4.36; 3icb: 2.12/2.48; 4pti: 3.60/3.33; 4rxn: 2.91/3.33
fisa | 3.74/2.34 | 4/0 | 1fc2: 1.59/−0.01; 1hdd-C: 3.07/2.24; 2cro: 4.33/2.73; 4icb: 5.98/4.40
fisa casp3 | 3.36/1.76 | 5/0 | 1bg8-A: 3.27/2.19; 1bl0: 1.77/−0.17; 1eh2: 2.92/2.32; 1jwe: 4.77/2.01; smd3: 4.09/2.43
lattice ssfit | 2.91/4.99 | 1/3 | 1beo: 2.67/5.02; 1ctf: 3.39/5.49; 1fca: 3.08/3.03; 1nkl: 2.48/6.42
lmds | 2.64/0.77 | 10/0 | 1b0n-B: 1.91/−0.05; 1bba: −0.16/−1.63; 1ctf: 3.79/3.31; 1dtk: 3.99/0.67; 1fc2: −3.64/−6.43; 1igd: 3.45/3.34; 1shf-A: 2.47/1.16; 2cro: 7.37/4.25; 2ovo: 3.06/2.12; 4pti: 4.19/0.98
semfold | 2.27/2.77 | 2/3 | 1ctf: 2.89/2.51; 1eh2: 3.62/4.22; 1khm: 1.60/2.65; 1nkl: 1.58/2.90; 1pgb: 1.66/1.55
hg structal | 1.50/1.85 | 5/23 | 1ash: 2.96/3.22; 1bab-B: 1.45/1.71; 1col-A: 4.79/4.84; 1cpc-A: 3.55/3.94; 1ecd: 1.81/1.78; 1emy: 0.99/1.42; 1flp: 2.40/2.42; 1gdm: 2.65/2.53; 1hbg: 2.16/1.87; 1hbh-A: 0.89/1.05; 1hbh-B: 0.96/1.35; 1hda-A: 0.94/1.87; 1hda-B: 1.89/1.85; 1hlb: 0.83/2.17; 1hlm: −2.69/−0.73; 1hsy: 0.44/1.54; 1ith-A: 1.71/1.92; 1mba: 2.36/2.40; 1mbs: −0.75/−0.26; 1myg-A: 1.34/1.81; 1myj-A: 1.38/1.71; 1myt: 1.81/2.15; 2dhb-A: 1.53/1.81; 2dhb-B: 0.36/1.06; 2lhb: 1.50/1.54; 2pgh-A: 1.31/1.56; 2pgh-B: 0.45/1.45; 4sdh-A: 2.93/1.93
ig structal | −0.50/0.61 | 1/58 | 1acy: −0.74/0.69; 1baf: −0.71/0.10; 1bbd: −0.38/0.77; 1bbj: −0.37/1.29; 1dbb: −1.30/0.30; 1dfb: −1.02/0.11; 1dvf: 0.20/0.40; 1eap: −0.74/0.40; 1fai: −0.16/0.44; 1fbi: −0.85/0.18; 1fgv: −0.82/0.20; 1fig: −1.50/0.80; 1flr: 0.20/0.49; 1for: −2.57/0.02; 1fpt: −1.31/0.22; 1frg: 1.13/1.23; 1fvc: −0.12/0.43; 1fvd: 0.04/0.87; 1gaf: −1.85/−0.06; 1ggi: 0.18/1.05; 1gig: 0.96/1.08; 1hil: −0.16/0.37; 1hkl: −1.63/0.17; 1iai: −1.45/0.38; 1ibg: 0.22/0.17; 1igc: −0.44/0.68; 1igf: 0.63/0.77; 1igi: −0.36/0.44; 1igm: −0.68/0.40; 1ikf: −0.58/0.76; 1ind: −0.08/1.58; 1jel: −1.02/−0.33; 1jhl: −0.01/0.79; 1kem: −0.65/1.13; 1mam: −0.40/0.81; 1mcp: −0.26/0.44; 1mlb: −0.67/0.74; 1mrd: −0.56/0.39; 1nbv: 0.29/0.77; 1ncb: −0.09/0.75; 1ngq: −0.42/0.75; 1nmb: −0.65/0.69; 1nsn: −2.31/0.38; 1opg: −0.32/−0.01; 1plg: 0.08/0.89; 1rmf: −1.54/−0.27; 1tet: −0.72/0.31; 1ucb: −0.69/0.55; 1vfa: −0.54/0.58; 1vge: −0.67/0.40; 1yuh: −1.03/0.13; 2cgr: 0.09/1.35; 2fb4: 0.68/1.82; 2fbj: −0.60/0.80; 2gfb: 0.11/0.35; 3hfl: −0.78/0.30; 3hfm: 0.38/1.04; 6fab: −0.62/0.75; 7fab: −0.06/1.74
ig structal hires | −0.21/0.71 | 0/18 | 1dvf: 0.24/0.37; 1fgv: −0.61/0.27; 1flr: 0.26/0.52; 1fvc: −0.03/0.46; 1gaf: −1.45/0.07; 1hil: 0.02/0.46; 1ind: −0.04/1.23; 1kem: −0.59/1.02; 1mlb: −0.44/0.77; 1nbv: 0.23/0.65; 1opg: −0.40/0.01; 1vfa: −0.38/0.56; 1vge: −0.32/0.55; 2cgr: 0.22/1.31; 2fb4: 0.48/1.35; 2fbj: −0.52/0.78; 6fab: −0.46/0.75; 7fab: −0.07/1.70
total | | 36/109 |
a For A/B, A and B correspond to the Z-scores of E and E − ⟨E⟩, respectively.
b For A/B, A and B correspond to the number of proteins for which E is superior and the number of proteins for which E − ⟨E⟩ is superior, respectively.
point represents a structure in the decoy set, and the point with RMSD = 0 corresponds to the native X-ray structure. From these figures, we observe that the new scoring function E − ⟨E⟩ has a higher correlation with the RMSD than the original contact energy E.
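The correlations quoted for Figs. 1 and 2 (0.34 and 0.58) are computed between a score and the RMSD over all structures of a decoy set. The text does not name the correlation coefficient; the sketch below assumes the Pearson coefficient, and `pearson_correlation` is a hypothetical helper name.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)
```

With scores where lower means better, a positive value indicates that structures closer to the native one (small RMSD) tend to receive lower scores.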
To compare the performances of E and E − ⟨E⟩ in more detail, we calculated the Z-scores of the native structures, the correlations between the RMSD and the scoring function, and the ranks of the native structures in various decoy sets. The results are summarized in
Table 2. Correlations with the RMSD calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩.
decoy set | average (a) | comparison (b) | protein: correlation (a)
4-state reduced | 0.28/0.50 | 0/7 | 1ctf: 0.34/0.58; 1r69: 0.21/0.50; 1sn3: 0.19/0.38; 2cro: 0.39/0.57; 3icb: 0.43/0.68; 4pti: 0.16/0.29; 4rxn: 0.27/0.49
fisa | 0.19/0.25 | 1/3 | 1fc2: 0.22/0.35; 1hdd-C: 0.17/0.31; 2cro: 0.18/0.20; 4icb: 0.17/0.12
fisa casp3 | 0.19/0.09 | 4/1 | 1bg8-A: 0.26/0.19; 1bl0: 0.38/0.40; 1eh2: 0.26/0.23; 1jwe: −0.12/−0.22; smd3: 0.19/−0.14
lattice ssfit | 0.02/0.02 | 3/1 | 1beo: 0.04/0.02; 1ctf: 0.06/0.04; 1fca: −0.02/−0.03; 1nkl: 0.01/0.04
lmds | 0.08/0.02 | 8/2 | 1b0n-B: −0.13/−0.29; 1bba: 0.03/0.11; 1ctf: 0.18/0.09; 1dtk: 0.19/0.08; 1fc2: −0.03/−0.14; 1igd: 0.13/0.12; 1shf-A: 0.07/0.06; 2cro: 0.11/0.02; 2ovo: 0.17/0.21; 4pti: 0.06/−0.06
semfold | 0.06/0.06 | 1/3 | 1ctf: 0.09/0.10; 1khm: 0.08/0.04; 1nkl: 0.02/0.04; 1pgb: 0.04/0.06
hg structal | 0.66/0.72 | 5/23 | 1ash: 0.50/0.51; 1bab-B: 0.82/0.85; 1col-A: 0.67/0.53; 1cpc-A: 0.69/0.60; 1ecd: 0.63/0.67; 1emy: 0.62/0.74; 1flp: 0.55/0.71; 1gdm: 0.78/0.85; 1hbg: 0.46/0.60; 1hbh-A: 0.81/0.81; 1hbh-B: 0.78/0.82; 1hda-A: 0.85/0.84; 1hda-B: 0.84/0.89; 1hlb: 0.52/0.57; 1hlm: 0.05/0.12; 1hsy: 0.62/0.70; 1ith-A: 0.60/0.72; 1mba: 0.67/0.78; 1mbs: 0.57/0.58; 1myg-A: 0.71/0.77; 1myj-A: 0.73/0.81; 1myt: 0.65/0.71; 2dhb-A: 0.89/0.86; 2dhb-B: 0.77/0.89; 2lhb: 0.47/0.56; 2pgh-A: 0.92/0.90; 2pgh-B: 0.83/0.86; 4sdh-A: 0.60/0.81
ig structal | 0.38/0.44 | 7/49 | 1acy: 0.49/0.57; 1baf: 0.55/0.51; 1bbd: 0.39/0.49; 1bbj: 0.44/0.50; 1dbb: 0.47/0.54; 1dfb: 0.37/0.40; 1dvf: 0.48/0.50; 1eap: 0.33/0.38; 1fai: 0.44/0.51; 1fbi: 0.36/0.44; 1fgv: 0.44/0.49; 1fig: 0.31/0.42; 1flr: 0.43/0.50; 1for: 0.32/0.49; 1fpt: 0.40/0.49; 1frg: 0.54/0.60; 1fvc: 0.17/0.09; 1fvd: 0.50/0.58; 1gaf: 0.38/0.43; 1ggi: 0.49/0.52; 1gig: 0.36/0.33; 1hil: 0.51/0.58; 1hkl: 0.37/0.44; 1iai: 0.45/0.54; 1ibg: 0.22/0.21; 1igc: 0.53/0.55; 1igf: 0.53/0.58; 1igi: 0.20/0.23; 1igm: 0.43/0.55; 1ikf: 0.33/0.36; 1ind: 0.39/0.43; 1jel: 0.36/0.45; 1jhl: 0.40/0.36; 1kem: 0.45/0.52; 1mam: 0.17/0.27; 1mcp: 0.42/0.57; 1mlb: 0.41/0.46; 1mrd: 0.19/0.26; 1nbv: 0.42/0.49; 1ncb: 0.53/0.54; 1ngq: 0.34/0.42; 1nmb: 0.05/−0.01; 1nsn: 0.32/0.52; 1opg: 0.45/0.45; 1plg: 0.47/0.52; 1rmf: 0.44/0.50; 1tet: 0.43/0.56; 1ucb: 0.58/0.58; 1vfa: 0.18/0.25; 1vge: 0.01/0.13; 1yuh: 0.16/0.13; 2cgr: 0.42/0.57; 2fb4: 0.44/0.54; 2fbj: 0.42/0.42; 2gfb: 0.23/0.19; 3hfl: 0.02/0.12; 3hfm: 0.45/0.50; 6fab: 0.45/0.52; 7fab: 0.47/0.48
ig structal hires | 0.36/0.50 | 1/17 | 1dvf: 0.47/0.55; 1fgv: 0.45/0.60; 1flr: 0.60/0.61; 1fvc: 0.11/0.05; 1gaf: 0.29/0.46; 1hil: 0.53/0.65; 1ind: 0.23/0.47; 1kem: 0.47/0.65; 1mlb: 0.38/0.53; 1nbv: 0.36/0.49; 1opg: 0.35/0.41; 1vfa: 0.09/0.25; 1vge: −0.10/0.08; 2cgr: 0.43/0.72; 2fb4: 0.41/0.66; 2fbj: 0.55/0.57; 6fab: 0.39/0.62; 7fab: 0.49/0.66
total | | 30/106 |
a For A/B, A and B correspond to the correlations with the RMSD of E and E − ⟨E⟩, respectively.
b For A/B, A and B correspond to the number of proteins for which E is superior and the number of proteins for which E − ⟨E⟩ is superior, respectively.
Tables 1-3. First, the Z-scores are shown in Table 1. For each scoring function f (E and E − ⟨E⟩), the Z-score is defined as $Z = -(f_N - \langle f \rangle)/\sigma$, where $f_N$ is the scoring function of the native structure, $\langle f \rangle$ is the average of the scoring function measured from decoy structures, and $\sigma$ is the standard deviation of the scoring function in decoy structures. A large value of the Z-score indicates that the native structure can be distinguished well from the decoy structures. Thus,
Table 3. Ranks of the native structure calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩.
decoy set | comparison (b) | protein: rank (a)
4-state reduced | 0/3 | 1ctf: 1/1; 1r69: 1/1; 1sn3: 3/1; 2cro: 1/1; 3icb: 10/2; 4pti: 1/1; 4rxn: 2/1
fisa | 3/0 | 1fc2: 30/263; 1hdd-C: 2/10; 2cro: 1/3; 4icb: 1/1
fisa casp3 | 5/0 | 1bg8-A: 1/17; 1bl0: 42/552; 1eh2: 2/28; 1jwe: 1/33; smd3: 1/10
lattice ssfit | 0/3 | 1beo: 13/1; 1ctf: 2/1; 1fca: 1/1; 1nkl: 13/1
lmds | 6/0 | 1b0n-B: 17/261; 1bba: 281/482; 1ctf: 1/1; 1dtk: 1/57; 1fc2: 501/501; 1igd: 1/1; 1shf-A: 4/52; 2cro: 1/1; 2ovo: 1/8; 4pti: 1/62
semfold | 2/2 | 1ctf: 27/54; 1eh2: 2/2; 1khm: 1245/68; 1nkl: 654/24; 1pgb: 562/696
hg structal | 0/14 | 1ash: 1/1; 1bab-B: 3/1; 1col-A: 1/1; 1cpc-A: 1/1; 1ecd: 1/1; 1emy: 6/4; 1flp: 1/1; 1gdm: 1/1; 1hbg: 1/1; 1hbh-A: 9/9; 1hbh-B: 8/2; 1hda-A: 6/1; 1hda-B: 1/1; 1hlb: 7/1; 1hlm: 30/22; 1hsy: 9/3; 1ith-A: 1/1; 1mba: 1/1; 1mbs: 25/19; 1myg-A: 5/2; 1myj-A: 5/2; 1myt: 2/1; 2dhb-A: 3/1; 2dhb-B: 15/2; 2lhb: 2/2; 2pgh-A: 3/3; 2pgh-B: 13/1; 4sdh-A: 1/1
ig structal | 0/59 | 1acy: 51/10; 1baf: 52/34; 1bbd: 46/7; 1bbj: 46/1; 1dbb: 58/22; 1dfb: 55/36; 1dvf: 28/18; 1eap: 54/20; 1fai: 29/18; 1fbi: 54/28; 1fgv: 53/32; 1fig: 58/4; 1flr: 30/17; 1for: 59/39; 1fpt: 58/29; 1frg: 7/2; 1fvc: 42/20; 1fvd: 35/5; 1gaf: 59/40; 1ggi: 29/1; 1gig: 8/4; 1hil: 41/24; 1hkl: 58/35; 1iai: 58/18; 1ibg: 27/2; 1igc: 50/12; 1igf: 14/10; 1igi: 41/17; 1igm: 51/22; 1ikf: 46/8; 1ind: 33/1; 1jel: 56/51; 1jhl: 36/4; 1kem: 51/1; 1mam: 45/4; 1mcp: 39/19; 1mlb: 53/4; 1mrd: 54/21; 1nbv: 27/6; 1ncb: 43/3; 1ngq: 47/6; 1nmb: 50/9; 1nsn: 59/20; 1opg: 45/40; 1plg: 32/4; 1rmf: 57/50; 1tet: 54/24; 1ucb: 54/13; 1vfa: 48/12; 1vge: 52/20; 1yuh: 56/28; 2cgr: 33/1; 2fb4: 13/2; 2fbj: 51/8; 2gfb: 31/20; 3hfl: 56/25; 3hfm: 23/2; 6fab: 50/6; 7fab: 36/1
ig structal hires | 0/18 | 1dvf: 10/7; 1fgv: 17/12; 1flr: 9/4; 1fvc: 15/6; 1gaf: 19/13; 1hil: 12/7; 1ind: 11/1; 1kem: 17/1; 1mlb: 16/3; 1nbv: 12/4; 1opg: 17/14; 1vfa: 16/4; 1vge: 16/5; 2cgr: 9/1; 2fb4: 6/2; 2fbj: 15/3; 6fab: 18/1; 7fab: 12/1
total | 16/99 |
a For A/B, A and B correspond to the ranks of the native structure calculated by E and E − ⟨E⟩, respectively.
b For A/B, A and B correspond to the number of proteins for which E is superior and the number of proteins for which E − ⟨E⟩ is superior, respectively.
we want a scoring function that has a large Z-score. In Table 1, E performs better than E − ⟨E⟩ for proteins in the fisa, the fisa casp3, and the lmds decoy sets, and E − ⟨E⟩ performs better than E for most proteins in the lattice ssfit, the hg structal, the ig structal, and the ig structal hires decoy sets (except for a few proteins). In Table 2, the correlations between each scoring function and the RMSD (from the native structure)
are shown. Even for the fisa, the fisa casp3, and the lmds proteins, E does not seem to have a higher correlation than E − ⟨E⟩, whereas E − ⟨E⟩ is superior to E for the 4-state reduced, hg structal, ig structal, and ig structal hires sets. This means that E − ⟨E⟩ is better than E in selecting native-like structures. It should be noted that the contact map of the native structure does not determine the native structure in a unique fashion; i.e., the reconstruction of the native structure from its contact map is not straightforward. However, contact maps are constructed from predetermined decoy structures, and the task of identifying the correct contact map of the native structure from among these decoy structures is important. Therefore, a good scoring function should show a good correlation with the RMSD measured from the native structure. When the correlation is high, a native-like structure is more likely to be identified as a native fold.
Table 3 shows the ranks of the native structures. As in Table 1, E is superior to E − ⟨E⟩ for the fisa, the fisa casp3, and the lmds proteins, but for the other sets, E − ⟨E⟩ is superior. When Tables 1-3 are considered together, E seems to be better only for the fisa, the fisa casp3, and the lmds proteins, and E − ⟨E⟩ is better than E for the other sets. The reason that E is better for fisa, fisa casp3, and lmds is as follows: For most proteins where E performs better than E − ⟨E⟩, the native structures contain more contacts than the decoy structures do. That is, for these proteins, the native structures can be identified by considering only the total number of contacts, and the characteristics of E are not a discriminating factor. The rest of the cases where E − ⟨E⟩ did not perform better than E are for very small chains (fewer than 45 amino acids).
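The Z-score and the rank of the native structure used in Tables 1 and 3 can be sketched as follows. This is an illustration only: the population standard deviation over the decoys, the tie handling in the rank, and the helper names `native_z_score` and `native_rank` are assumptions not specified in the text.

```python
import statistics


def native_z_score(native_score, decoy_scores):
    """Z = -(f_N - <f>) / sigma, with <f> and sigma taken over the decoys."""
    mean = statistics.fmean(decoy_scores)
    sigma = statistics.pstdev(decoy_scores)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("Z-score is undefined when all decoy scores are equal")
    return -(native_score - mean) / sigma


def native_rank(native_score, decoy_scores):
    """Rank of the native structure; 1 means no decoy has a lower (better) score."""
    # Assumption: decoys that tie with the native structure do not push its rank down.
    return 1 + sum(1 for score in decoy_scores if score < native_score)
```

Both functions accept either E or E − ⟨E⟩ as the score, so the same code yields the A and B entries of an A/B pair in the tables.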
In summary, for decoy sets where the total number of contacts from the native structure is more or less similar to the total numbers of contacts from decoy structures, E − ⟨E⟩ performs consistently better than E based on the Z-score, the correlation with the RMSD, and the rank of the native structure.
For the set of 4-state reduced decoys, we investigated the difference between E and E − ⟨E⟩. These decoys are generated by keeping most of the native conformation fixed in its native form [1]; therefore, their conformations have evenly distributed RMSD values. The set of 4-state reduced decoys has many near-native conformations, and the RMSDs are well distributed at low and high values. If these 4-state reduced decoys are considered, the difference in performance between E and E − ⟨E⟩ in Table 1 and Table 3 is not significant, but Table 2 shows that E − ⟨E⟩ has a higher correlation with the RMSD than E does for all proteins. This indicates that E − ⟨E⟩ could be more useful in finding native-like structures.
IV. CONCLUSION
For a given contact energy, we introduce a new scoring function, the difference between the original contact energy and the average contact energy calculated from random sequences. The new scoring function is shown to perform better than the original contact energy for decoy sets where the decoy structures have total numbers of contacts similar to that of the native structure. Out of 145 proteins from 9 decoy sets, the new scoring function is shown to be more useful for about 75 % of those proteins. From the results, we suggest a better approach to distinguish the native structure from decoy sets.
ACKNOWLEDGMENTS
This work was supported by the Ministry of Science and Technology (Jung & Moon) and by grant No. R01-2003-000-11595-0 (Lee) from the Basic Research Program of the Korean Science & Engineering Foundation.
REFERENCES
[1] B. Park and M. Levitt, J. Mol. Biol. 258, 367 (1996).
[2] C. Anfinsen, Science 181, 223 (1973).
[3] C. Branden and J. Tooze, Introduction to protein structure (New York, Freedman, 1991).
[4] S. Tanaka and H. Scheraga, Macromolecules 9, 945 (1976).
[5] S. Miyazawa and R. L. Jernigan, Macromolecules 18, 534 (1985).
[6] S. Miyazawa and R. L. Jernigan, J. Mol. Biol. 256, 623 (1996).
[7] S. Miyazawa and R. L. Jernigan, Proteins: Struct. Funct. Genet. 34, 49 (1999).
[8] D. Hinds and M. Levitt, Proc. Natl. Acad. Sci. USA 89, 2536 (1992).
[9] D. Tobi, G. Shafran, N. Linial and R. Elber, Proteins: Struct. Funct. Genet. 40, 71 (2000).
[10] J. Skolnick, A. Kolinski and A. Ortiz, Proteins: Struct. Funct. Genet. 38, 3 (2000).
[11] I. Bahar and R. L. Jernigan, J. Mol. Biol. 266, 195 (1996).
[12] E. Huang, S. Subbiah and M. Levitt, J. Mol. Biol. 252, 709 (1995).
[13] E. Huang, S. Subbiah, J. Tsai and M. Levitt, J. Mol. Biol. 257, 716 (1996).
[14] L. Mirny and E. Shakhnovich, J. Mol. Biol. 264, 1164 (1996).
[15] B. Park and M. Levitt, J. Mol. Biol. 266, 831 (1997).
[16] D. Mohanty, B. N. Dominy, A. Kolinski, C. L. Brooks and J. Skolnick, Proteins: Struct. Funct. Genet. 35, 447 (1999).
[17] E. I. Shakhnovich, Phys. Rev. Lett. 72, 3907 (1994).
[18] J. Lee, S. Y. Kim and J. Lee, J. Korean Phys. Soc. 44, 594 (2004).
[19] J. Sim, S. Y. Kim, A. Yoo and J. Lee, J. Korean Phys. Soc. 44, 611 (2004).
[20] M. Heo, S. Kim, E. J. Moon, M. Cheon, K. Chung and I. Chang, J. Korean Phys. Soc. 44, 1571 (2004).
[21] M. Cheon, M. Heo, E. J. Moon, S. Kim, K. Chung, I. Chang and H. Kim, J. Korean Phys. Soc. 44, 550 (2004).