Journal of the Korean Physical Society, Vol. 46, No. 3, March 2005, pp. 625∼630

Identification of the Protein Native Structure by Using a Sequence-Dependent Feature in Contact Maps

Jaewoon Jung∗ and Hie-Tae Moon
Department of Physics, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701

Jooyoung Lee†
School of Computational Sciences, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722

(Received 12 October 2004)

We present a new approach for fold recognition to identify the native and the near-native protein structures among decoy structures by using pair-wise contact potentials between amino acid residues. For a given protein structure, a new scoring function is defined as the difference between the contact energy for its native sequence and the average contact energy for random sequences of the same contact map. We have tested the new scoring function for the various decoy sets available in the literature and have found that the new scoring function is more useful than the original contact energy, especially for decoy sets where the total number of contacts from the native structure is similar to those from the decoy conformations. From this observation, we conclude that the more native-like the structure is, the more likely that it distinguishes the native sequence from random sequences. We demonstrate that, for a given contact potential, a simple, but more efficient, new scoring function can be constructed.

PACS numbers: 87.14.Ee, 87.15.By, 87.15.Cc
Keywords: Protein folding, Native structure, Decoy structure

I. INTRODUCTION

Determination of the three-dimensional structure of a protein from its amino-acid sequence is a major unsolved problem in structural biology [2,3]. One way to achieve this goal is to develop a scoring function that can distinguish the native structure of a protein from a large number of decoy conformations. For this reason, many scoring functions, including empirical contact energies, have been investigated [4–15,18–21]. Traditional empirical contact energies are typically based on a residue-residue contact function obtained by using a quasi-chemical approximation [4–7]. The basic idea of this method is to count the pairing frequencies between two amino acids, as observed in various native structures in the Protein Data Bank, and to normalize them against those expected from random pairing. Other contact energy functions are obtained by optimizing interaction potentials so that the energy of the native structure becomes lower than the energies of competing decoy structures [9] and/or by maximizing the energy gap between the native state and the decoy states normalized by the energy variance of the decoy states [14]. Most of these energy functions are knowledge-based potentials, and indications are that they are correlated with atomic potentials [16].

These previous efforts focused on developing pair-wise contact potentials that are able to discriminate the native states of a set of proteins from many structural decoys. In this work, we propose a different kind of approach in which we adopt the pair-wise contact potential developed by Miyazawa and Jernigan (MJ) [6]. The MJ contact potential has been quite successful [6, 17]. The new scoring function is defined as the difference between the original energy function and the average energy obtained from random sequences. In the present work, we examine the performance of the new scoring function in fold recognition for various decoy sets. In particular, we investigate whether the new scoring function is more useful than the original MJ contact energy in identifying native structures from among many decoys. We find that the correlation between the new scoring function and the RMSD (root mean square deviation) measured from the native structure is stronger than the correlation between the original contact energy and the RMSD.

∗ Current address: Department of Chemistry, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701
† Corresponding author; E-mail: [email protected]

II. METHOD

1. Structure Comparison

To compare two structures, the RMSD (root mean square deviation) is used. The RMSD between two structures a and b is defined as

$$\mathrm{RMSD} = \sqrt{\frac{\sum_{1 \le i \le N} \left| \mathbf{r}_i^a - \mathbf{r}_i^b \right|^2}{N}}, \qquad (1)$$

where structures a and b are superimposed so that the value of the RMSD becomes a minimum, $\mathbf{r}_i^a$ and $\mathbf{r}_i^b$ are the coordinates of the alpha-carbon atoms, and N is the number of amino acids.
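As a minimal sketch of Eq. (1), the following Python function computes the RMSD between two alpha-carbon traces. It assumes that the two structures have already been optimally superimposed (for example, with the Kabsch algorithm, which is not shown here); the function name `rmsd` and the toy coordinates are illustrative and are not taken from the original work.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD of Eq. (1) for two already-superimposed C-alpha traces.

    Each argument is a sequence of (x, y, z) tuples, one per residue.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("structures must have the same, non-zero length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between equivalent residues i in a and b.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical three-residue example (coordinates in angstroms).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
decoy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 3.0, 0.0)]
print(round(rmsd(native, decoy), 3))
```

Only the last residue is displaced in this toy case, so the result is the square root of 9/3. For real decoys, the superposition step matters: skipping it would overestimate the RMSD.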

2. Contact Energy of a Protein

Before considering a new scoring function, we describe the definition of the contact energy of a protein. The contact energy of a protein is defined as

$$E_k = \sum_{i,j} \Delta_{i,j} B(a_i, a_j). \qquad (2)$$

In this equation, $\Delta_{i,j} = 1$ if the amino acids at the positions i and j are in contact; $\Delta_{i,j} = 0$, otherwise. The contact between amino acids i and j is defined to exist if their side-chain centroids are within 6.5 Å [5–7]. $B(a_i, a_j)$ is the pair-wise contact energy between amino acids of types $a_i$ and $a_j$.

3. Calculation of New Scoring Functions

We assume that the probability that a protein adopts structure X and sequence S follows the Boltzmann distribution

$$P(X, S) \propto \exp(-\beta E(X, S)). \qquad (3)$$

Then, the probability that a structure $X_k$ and a native sequence $S_N$ are selected together becomes

$$P(X_k, S_N) \propto \exp(-\beta E(X_k, S_N)) = \exp(-\beta (E(X_k, S_N) - \langle E \rangle_k)) \exp(-\beta \langle E \rangle_k), \qquad (4)$$

where $\langle E \rangle_k$ is the average contact energy when the native sequence is replaced by random sequences. It should be noted that the energy contribution $\langle E \rangle_k$ is independent of the sequence $S_N$ and depends only on the structure $X_k$. The probability $P(X_k, S_N)$ is considered to have two parts. One is the sequence- and structure-dependent $E - \langle E \rangle_k$, and the other is the sequence-independent and structure-only-dependent $\langle E \rangle_k$. Since the sequence-independent term $\langle E \rangle_k$ plays the role of estimating only the total number of contacts for the given structure $X_k$, we assume that it is not as important as the sequence- and structure-dependent $E - \langle E \rangle_k$. Finally, we assume that

$$\exp(-\beta (E(X_N, S_N) - \langle E \rangle_N)) > \exp(-\beta (E(X_k, S_N) - \langle E \rangle_k)), \quad k \neq N. \qquad (5)$$

Then, the following inequality is satisfied:

$$E(X_N, S_N) - \langle E \rangle_N < E(X_k, S_N) - \langle E \rangle_k. \qquad (6)$$

From this, the new scoring function is defined as E − ⟨E⟩.

III. RESULTS AND DISCUSSION

Fig. 1. Relationship between the original MJ contact energy E and the RMSD. The energy is the sum of pair-wise MJ contact potentials when each structure is mounted on a native sequence. Here, the correlation is 0.34.

Fig. 2. Relationship between E − ⟨E⟩ and the RMSD. ⟨E⟩ is the average of the sum of pair-wise contact potentials calculated from 1000 random sequences. The correlation is 0.58.

Figure 1 shows the relationship between the RMSD measured from the native structure and the original MJ scoring function for the protein 1ctf in the 4-state reduced decoys. Figure 2 corresponds to the results obtained by using the new scoring function. Each data
point represents a structure in the decoy set, and the point with RMSD = 0 corresponds to the native X-ray structure. From these figures, we observe that the new scoring function E − ⟨E⟩ has a higher correlation with the RMSD than the original contact energy E.

Table 1. Z-scores calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩. For each decoy set, each row of proteins is followed by the row of the corresponding values (a).

4-state reduced
  1ctf         1r69         1sn3         2cro         3icb         4pti         4rxn
  3.60/3.40    4.52/4.37    2.40/3.03    4.05/4.36    2.12/2.48    3.60/3.33    2.91/3.33
  average (a): 3.31/3.47; comparison (b): 3/4

fisa
  1fc2         1hdd-C       2cro         4icb
  1.59/−0.01   3.07/2.24    4.33/2.73    5.98/4.40
  average (a): 3.74/2.34; comparison (b): 4/0

fisa casp3
  1bg8-A       1bl0         1eh2         1jwe         smd3
  3.27/2.19    1.77/−0.17   2.92/2.32    4.77/2.01    4.09/2.43
  average (a): 3.36/1.76; comparison (b): 5/0

lattice ssfit
  1beo         1ctf         1fca         1nkl
  2.67/5.02    3.39/5.49    3.08/3.03    2.48/6.42
  average (a): 2.91/4.99; comparison (b): 1/3

lmds
  1b0n-B       1bba         1ctf         1dtk         1fc2         1igd         1shf-A
  1.91/−0.05   −0.16/−1.63  3.79/3.31    3.99/0.67    −3.64/−6.43  3.45/3.34    2.47/1.16
  2cro         2ovo         4pti
  7.37/4.25    3.06/2.12    4.19/0.98
  average (a): 2.64/0.77; comparison (b): 10/0

semfold
  1ctf         1eh2         1khm         1nkl         1pgb
  2.89/2.51    3.62/4.22    1.60/2.65    1.58/2.90    1.66/1.55
  average (a): 2.27/2.77; comparison (b): 2/3

hg structal
  1ash         1bab-B       1col-A       1cpc-A       1ecd         1emy         1flp
  2.96/3.22    1.45/1.71    4.79/4.84    3.55/3.94    1.81/1.78    0.99/1.42    2.40/2.42
  1gdm         1hbg         1hbh-A       1hbh-B       1hda-A       1hda-B       1hlb
  2.65/2.53    2.16/1.87    0.89/1.05    0.96/1.35    0.94/1.87    1.89/1.85    0.83/2.17
  1hlm         1hsy         1ith-A       1mba         1mbs         1myg-A       1myj-A
  −2.69/−0.73  0.44/1.54    1.71/1.92    2.36/2.40    −0.75/−0.26  1.34/1.81    1.38/1.71
  1myt         2dhb-A       2dhb-B       2lhb         2pgh-A       2pgh-B       4sdh-A
  1.81/2.15    1.53/1.81    0.36/1.06    1.50/1.54    1.31/1.56    0.45/1.45    2.93/1.93
  average (a): 1.50/1.85; comparison (b): 5/23

ig structal
  1acy         1baf         1bbd         1bbj         1dbb         1dfb         1dvf
  −0.74/0.69   −0.71/0.10   −0.38/0.77   −0.37/1.29   −1.30/0.30   −1.02/0.11   0.20/0.40
  1eap         1fai         1fbi         1fgv         1fig         1flr         1for
  −0.74/0.40   −0.16/0.44   −0.85/0.18   −0.82/0.20   −1.50/0.80   0.20/0.49    −2.57/0.02
  1fpt         1frg         1fvc         1fvd         1gaf         1ggi         1gig
  −1.31/0.22   1.13/1.23    −0.12/0.43   0.04/0.87    −1.85/−0.06  0.18/1.05    0.96/1.08
  1hil         1hkl         1iai         1ibg         1igc         1igf         1igi
  −0.16/0.37   −1.63/0.17   −1.45/0.38   0.22/0.17    −0.44/0.68   0.63/0.77    −0.36/0.44
  1igm         1ikf         1ind         1jel         1jhl         1kem         1mam
  −0.68/0.40   −0.58/0.76   −0.08/1.58   −1.02/−0.33  −0.01/0.79   −0.65/1.13   −0.40/0.81
  1mcp         1mlb         1mrd         1nbv         1ncb         1ngq         1nmb
  −0.26/0.44   −0.67/0.74   −0.56/0.39   0.29/0.77    −0.09/0.75   −0.42/0.75   −0.65/0.69
  1nsn         1opg         1plg         1rmf         1tet         1ucb         1vfa
  −2.31/0.38   −0.32/−0.01  0.08/0.89    −1.54/−0.27  −0.72/0.31   −0.69/0.55   −0.54/0.58
  1vge         1yuh         2cgr         2fb4         2fbj         2gfb         3hfl
  −0.67/0.40   −1.03/0.13   0.09/1.35    0.68/1.82    −0.60/0.80   0.11/0.35    −0.78/0.30
  3hfm         6fab         7fab
  0.38/1.04    −0.62/0.75   −0.06/1.74
  average (a): −0.50/0.61; comparison (b): 1/58

ig structal hires
  1dvf         1fgv         1flr         1fvc         1gaf         1hil         1ind
  0.24/0.37    −0.61/0.27   0.26/0.52    −0.03/0.46   −1.45/0.07   0.02/0.46    −0.04/1.23
  1kem         1mlb         1nbv         1opg         1vfa         1vge         2cgr
  −0.59/1.02   −0.44/0.77   0.23/0.65    −0.40/0.01   −0.38/0.56   −0.32/0.55   0.22/1.31
  2fb4         2fbj         6fab         7fab
  0.48/1.35    −0.52/0.78   −0.46/0.75   −0.07/1.70
  average (a): −0.21/0.71; comparison (b): 0/18

total comparison (b): 36/109

a For A/B, A and B correspond to the Z-scores of E and E − ⟨E⟩, respectively.
b For A/B, A and B correspond to the number of proteins for which E is superior and the number of proteins for which E − ⟨E⟩ is superior, respectively.

To compare the performances of E and E − ⟨E⟩ in more detail, we calculated the Z-scores of the native structures, the correlations between the RMSD and the scoring function, and the ranks of the native structures in various decoy sets. The results are summarized in
Tables 1-3.

Table 2. Correlations calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩. For each decoy set, each row of proteins is followed by the row of the corresponding values (a).

4-state reduced
  1ctf         1r69         1sn3         2cro         3icb         4pti         4rxn
  0.34/0.58    0.21/0.50    0.19/0.38    0.39/0.57    0.43/0.68    0.16/0.29    0.27/0.49
  average (a): 0.28/0.50; comparison (b): 0/7

fisa
  1fc2         1hdd-C       2cro         4icb
  0.22/0.35    0.17/0.31    0.18/0.20    0.17/0.12
  average (a): 0.19/0.25; comparison (b): 1/3

fisa casp3
  1bg8-A       1bl0         1eh2         1jwe         smd3
  0.26/0.19    0.38/0.40    0.26/0.23    −0.12/−0.22  0.19/−0.14
  average (a): 0.19/0.09; comparison (b): 4/1

lattice ssfit
  1beo         1ctf         1fca         1nkl
  0.04/0.02    0.06/0.04    −0.02/−0.03  0.01/0.04
  average (a): 0.02/0.02; comparison (b): 3/1

lmds
  1b0n-B       1bba         1ctf         1dtk         1fc2         1igd         1shf-A
  −0.13/−0.29  0.03/0.11    0.18/0.09    0.19/0.08    −0.03/−0.14  0.13/0.12    0.07/0.06
  2cro         2ovo         4pti
  0.11/0.02    0.17/0.21    0.06/−0.06
  average (a): 0.08/0.02; comparison (b): 8/2

semfold
  1ctf         1khm         1nkl         1pgb
  0.09/0.10    0.08/0.04    0.02/0.04    0.04/0.06
  average (a): 0.06/0.06; comparison (b): 1/3

hg structal
  1ash         1bab-B       1col-A       1cpc-A       1ecd         1emy         1flp
  0.50/0.51    0.82/0.85    0.67/0.53    0.69/0.60    0.63/0.67    0.62/0.74    0.55/0.71
  1gdm         1hbg         1hbh-A       1hbh-B       1hda-A       1hda-B       1hlb
  0.78/0.85    0.46/0.60    0.81/0.81    0.78/0.82    0.85/0.84    0.84/0.89    0.52/0.57
  1hlm         1hsy         1ith-A       1mba         1mbs         1myg-A       1myj-A
  0.05/0.12    0.62/0.70    0.60/0.72    0.67/0.78    0.57/0.58    0.71/0.77    0.73/0.81
  1myt         2dhb-A       2dhb-B       2lhb         2pgh-A       2pgh-B       4sdh-A
  0.65/0.71    0.89/0.86    0.77/0.89    0.47/0.56    0.92/0.90    0.83/0.86    0.60/0.81
  average (a): 0.66/0.72; comparison (b): 5/23

ig structal
  1acy         1baf         1bbd         1bbj         1dbb         1dfb         1dvf
  0.49/0.57    0.55/0.51    0.39/0.49    0.44/0.50    0.47/0.54    0.37/0.40    0.48/0.50
  1eap         1fai         1fbi         1fgv         1fig         1flr         1for
  0.33/0.38    0.44/0.51    0.36/0.44    0.44/0.49    0.31/0.42    0.43/0.50    0.32/0.49
  1fpt         1frg         1fvc         1fvd         1gaf         1ggi         1gig
  0.40/0.49    0.54/0.60    0.17/0.09    0.50/0.58    0.38/0.43    0.49/0.52    0.36/0.33
  1hil         1hkl         1iai         1ibg         1igc         1igf         1igi
  0.51/0.58    0.37/0.44    0.45/0.54    0.22/0.21    0.53/0.55    0.53/0.58    0.20/0.23
  1igm         1ikf         1ind         1jel         1jhl         1kem         1mam
  0.43/0.55    0.33/0.36    0.39/0.43    0.36/0.45    0.40/0.36    0.45/0.52    0.17/0.27
  1mcp         1mlb         1mrd         1nbv         1ncb         1ngq         1nmb
  0.42/0.57    0.41/0.46    0.19/0.26    0.42/0.49    0.53/0.54    0.34/0.42    0.05/−0.01
  1nsn         1opg         1plg         1rmf         1tet         1ucb         1vfa
  0.32/0.52    0.45/0.45    0.47/0.52    0.44/0.50    0.43/0.56    0.58/0.58    0.18/0.25
  1vge         1yuh         2cgr         2fb4         2fbj         2gfb         3hfl
  0.01/0.13    0.16/0.13    0.42/0.57    0.44/0.54    0.42/0.42    0.23/0.19    0.02/0.12
  3hfm         6fab         7fab
  0.45/0.50    0.45/0.52    0.47/0.48
  average (a): 0.38/0.44; comparison (b): 7/49

ig structal hires
  1dvf         1fgv         1flr         1fvc         1gaf         1hil         1ind
  0.47/0.55    0.45/0.60    0.60/0.61    0.11/0.05    0.29/0.46    0.53/0.65    0.23/0.47
  1kem         1mlb         1nbv         1opg         1vfa         1vge         2cgr
  0.47/0.65    0.38/0.53    0.36/0.49    0.35/0.41    0.09/0.25    −0.10/0.08   0.43/0.72
  2fb4         2fbj         6fab         7fab
  0.41/0.66    0.55/0.57    0.39/0.62    0.49/0.66
  average (a): 0.36/0.50; comparison (b): 1/17

total comparison (b): 30/106

a For A/B, A and B correspond to the correlations with the RMSD of E and E − ⟨E⟩, respectively.
b For A/B, A and B correspond to the number of proteins for which E is superior and the number of proteins for which E − ⟨E⟩ is superior, respectively.

First, the Z-scores are shown in Table 1. For each scoring function f (E and E − ⟨E⟩), the Z-score is defined as

$$Z = -\frac{f_N - \langle f \rangle}{\sigma},$$

where $f_N$ is the value of the scoring function for the native structure, $\langle f \rangle$ is the average of the scoring function measured from the decoy structures, and $\sigma$ is the standard deviation of the scoring function in the decoy structures. A large value of the Z-score indicates that the native structure can be distinguished well from the decoy structures. Thus,
we want a scoring function that has a large Z-score. In Table 1, E performs better than E − ⟨E⟩ for proteins in the fisa, the fisa casp3, and the lmds decoy sets, and E − ⟨E⟩ performs better than E for most proteins in the lattice ssfit, the hg structal, the ig structal, and the ig structal hires decoy sets (except for a few proteins).

Table 3. Rank of the native structure calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩. For each decoy set, each row of proteins is followed by the row of the corresponding values (a).

4-state reduced
  1ctf         1r69         1sn3         2cro         3icb         4pti         4rxn
  1/1          1/1          3/1          1/1          10/2         1/1          2/1
  comparison (b): 0/3

fisa
  1fc2         1hdd-C       2cro         4icb
  30/263       2/10         1/3          1/1
  comparison (b): 3/0

fisa casp3
  1bg8-A       1bl0         1eh2         1jwe         smd3
  1/17         42/552       2/28         1/33         1/10
  comparison (b): 5/0

lattice ssfit
  1beo         1ctf         1fca         1nkl
  13/1         2/1          1/1          13/1
  comparison (b): 0/3

lmds
  1b0n-B       1bba         1ctf         1dtk         1fc2         1igd         1shf-A
  17/261       281/482      1/1          1/57         501/501      1/1          4/52
  2cro         2ovo         4pti
  1/1          1/8          1/62
  comparison (b): 6/0

semfold
  1ctf         1eh2         1khm         1nkl         1pgb
  27/54        2/2          1245/68      654/24       562/696
  comparison (b): 2/2

hg structal
  1ash         1bab-B       1col-A       1cpc-A       1ecd         1emy         1flp
  1/1          3/1          1/1          1/1          1/1          6/4          1/1
  1gdm         1hbg         1hbh-A       1hbh-B       1hda-A       1hda-B       1hlb
  1/1          1/1          9/9          8/2          6/1          1/1          7/1
  1hlm         1hsy         1ith-A       1mba         1mbs         1myg-A       1myj-A
  30/22        9/3          1/1          1/1          25/19        5/2          5/2
  1myt         2dhb-A       2dhb-B       2lhb         2pgh-A       2pgh-B       4sdh-A
  2/1          3/1          15/2         2/2          3/3          13/1         1/1
  comparison (b): 0/14

ig structal
  1acy         1baf         1bbd         1bbj         1dbb         1dfb         1dvf
  51/10        52/34        46/7         46/1         58/22        55/36        28/18
  1eap         1fai         1fbi         1fgv         1fig         1flr         1for
  54/20        29/18        54/28        53/32        58/4         30/17        59/39
  1fpt         1frg         1fvc         1fvd         1gaf         1ggi         1gig
  58/29        7/2          42/20        35/5         59/40        29/1         8/4
  1hil         1hkl         1iai         1ibg         1igc         1igf         1igi
  41/24        58/35        58/18        27/2         50/12        14/10        41/17
  1igm         1ikf         1ind         1jel         1jhl         1kem         1mam
  51/22        46/8         33/1         56/51        36/4         51/1         45/4
  1mcp         1mlb         1mrd         1nbv         1ncb         1ngq         1nmb
  39/19        53/4         54/21        27/6         43/3         47/6         50/9
  1nsn         1opg         1plg         1rmf         1tet         1ucb         1vfa
  59/20        45/40        32/4         57/50        54/24        54/13        48/12
  1vge         1yuh         2cgr         2fb4         2fbj         2gfb         3hfl
  52/20        56/28        33/1         13/2         51/8         31/20        56/25
  3hfm         6fab         7fab
  23/2         50/6         36/1
  comparison (b): 0/59

ig structal hires
  1dvf         1fgv         1flr         1fvc         1gaf         1hil         1ind
  10/7         17/12        9/4          15/6         19/13        12/7         11/1
  1kem         1mlb         1nbv         1opg         1vfa         1vge         2cgr
  17/1         16/3         12/4         17/14        16/4         16/5         9/1
  2fb4         2fbj         6fab         7fab
  6/2          15/3         18/1         12/1
  comparison (b): 0/18

total comparison (b): 16/99

a For A/B, A and B correspond to the ranks of the native structure calculated by E and E − ⟨E⟩, respectively.
b For A/B, A and B correspond to the number of proteins for which E is superior and the number of proteins for which E − ⟨E⟩ is superior, respectively.

In Table 2, the correlations between each scoring function and the RMSD (from the native structure)
are shown. Even for the fisa, the fisa casp3, and the lmds proteins, E does not seem to have a higher correlation than E − ⟨E⟩, whereas E − ⟨E⟩ is superior to E for the 4-state reduced, hg structal, ig structal, and ig structal hires sets. This means that E − ⟨E⟩ is better than E in selecting native-like structures.

It should be noted that the contact map of the native structure does not determine the native structure in a unique fashion; i.e., the reconstruction of the native structure from its contact map is not straightforward. However, the contact maps considered here are constructed from predetermined decoy structures, and the task of identifying the correct contact map of the native structure from among these decoy structures is important. Therefore, a good scoring function should show a good correlation with the RMSD measured from the native structure. When the correlation is high, a native-like structure is more likely to be identified as a native fold.

Table 3 shows the ranks of the native structures. As in Table 1, E is superior to E − ⟨E⟩ for the fisa, the fisa casp3, and the lmds proteins, but for the other sets, E − ⟨E⟩ is superior. When Tables 1-3 are considered together, E seems to be better only for the fisa, the fisa casp3, and the lmds proteins, and E − ⟨E⟩ is better for the other sets. The reason that E is better for fisa, fisa casp3, and lmds is as follows: For most proteins where E performs better than E − ⟨E⟩, the native structures contain more contacts than the decoy structures do. That is, for these proteins, the native structures can be identified by considering only the total number of contacts, and the characteristics of E are not a discriminating factor. The remaining cases where E − ⟨E⟩ did not perform better than E are very small chains (fewer than 45 amino acids).
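The role of the contact count discussed above can be illustrated with a small sketch of how E and E − ⟨E⟩ might be computed for a given contact map. This is an illustration under stated assumptions rather than the authors' implementation: random sequences are generated here by shuffling the native sequence (the text does not specify how they were drawn), and the two-letter energy table `TOY_ENERGY` is hypothetical and merely stands in for the 20 × 20 MJ matrix. The names `contact_energy` and `new_score` are likewise illustrative.

```python
import random

# Hypothetical two-letter alphabet standing in for the MJ matrix B(a_i, a_j).
TOY_ENERGY = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "P"): 0.0}


def contact_energy(sequence, contacts, pair_energy):
    """Eq. (2): sum of B(a_i, a_j) over all contacting pairs (i, j)."""
    total = 0.0
    for i, j in contacts:
        a, b = sorted((sequence[i], sequence[j]))  # symmetric lookup key
        total += pair_energy[(a, b)]
    return total


def new_score(sequence, contacts, pair_energy, n_random=1000, seed=0):
    """E - <E>: native-sequence energy minus the random-sequence average.

    Assumption: random sequences are shuffles of the native sequence.
    """
    rng = random.Random(seed)
    native_energy = contact_energy(sequence, contacts, pair_energy)
    residues = list(sequence)
    total = 0.0
    for _ in range(n_random):
        rng.shuffle(residues)
        total += contact_energy(residues, contacts, pair_energy)
    return native_energy - total / n_random


# Two hypothetical structures with the same number of contacts.
sequence = "HPPH"
native_contacts = [(0, 3)]  # brings the two H residues together
decoy_contacts = [(1, 2)]   # brings the two P residues together
print(new_score(sequence, native_contacts, TOY_ENERGY))
print(new_score(sequence, decoy_contacts, TOY_ENERGY))
```

In this toy case both structures have a single contact, so ⟨E⟩ is approximately the same for both, and the difference in score comes only from which residues the native sequence places in contact.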
In summary, for decoy sets where the total number of contacts in the native structure is more or less similar to the total numbers of contacts in the decoy structures, E − ⟨E⟩ performs consistently better than E based on the Z-score, the correlation with the RMSD, and the rank of the native structure.

For the set of 4-state reduced decoys, we investigated the difference between E and E − ⟨E⟩. These decoys are generated by keeping most of the native conformation fixed in its native form [1]; therefore, their conformations have evenly distributed RMSD values. The set of 4-state reduced decoys has many near-native conformations, and the RMSDs are well distributed at low and high values. For these 4-state reduced decoys, the difference in performance between E and E − ⟨E⟩ in Table 1 and Table 3 is not significant, but Table 2 shows that E − ⟨E⟩ has a higher correlation with the RMSD than E does for all proteins. This indicates that E − ⟨E⟩ could be more useful in finding native-like structures.

IV. CONCLUSION

For a given contact energy, we introduce a new scoring function, the difference between the original contact energy and the average contact energy calculated from random sequences. The new scoring function is shown to perform better than the original contact energy for decoy sets where the decoy structures have total numbers of contacts similar to that of the native structure. Out of 145 proteins from 9 decoy sets, the new scoring function is shown to be more useful for about 75 % of those proteins. From the results, we suggest a better approach to distinguish the native structure from decoy sets.
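The three measures on which this comparison rests (Z-score, correlation with the RMSD, and rank of the native structure) can be sketched as follows. This is a minimal sketch, assuming that σ is the population standard deviation over the decoys, that the correlation is the Pearson coefficient, and that the rank counts decoys scoring strictly lower than the native structure; the text does not spell out these details, and the function names and toy numbers are hypothetical.

```python
import math


def z_score(native_score, decoy_scores):
    """Z = -(f_N - <f>) / sigma, with <f> and sigma taken over the decoys."""
    n = len(decoy_scores)
    mean = sum(decoy_scores) / n
    variance = sum((s - mean) ** 2 for s in decoy_scores) / n
    return -(native_score - mean) / math.sqrt(variance)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def native_rank(native_score, decoy_scores):
    """Rank 1 means no decoy scores lower (better) than the native."""
    return 1 + sum(1 for s in decoy_scores if s < native_score)


# Hypothetical decoy set: scores that rise linearly with the RMSD.
rmsds = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
native = -4.0

print(z_score(native, scores))    # large positive value: native well separated
print(pearson(rmsds, scores))     # close to 1 for this linear toy data
print(native_rank(native, scores))
```

A scoring function is counted as superior for a protein when it gives the larger Z-score, the larger correlation, or the smaller rank, which is how the A/B comparison entries in the tables are read.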

ACKNOWLEDGMENTS

This work was supported by the Ministry of Science and Technology (Jung & Moon) and by grant No. R012003-000-11595-0 (Lee) from the Basic Research Program of the Korean Science & Engineering Foundation.

REFERENCES

[1] B. Park and M. Levitt, J. Mol. Biol. 258, 367 (1996).
[2] C. Anfinsen, Science 181, 223 (1973).
[3] C. Branden and J. Tooze, Introduction to protein structure (New York, Freedman, 1991).
[4] S. Tanaka and H. Scheraga, Macromolecules 9, 945 (1976).
[5] S. Miyazawa and R. L. Jernigan, Macromolecules 18, 534 (1985).
[6] S. Miyazawa and R. L. Jernigan, J. Mol. Biol. 256, 623 (1996).
[7] S. Miyazawa and R. L. Jernigan, Proteins: Struct. Funct. Genet. 34, 49 (1999).
[8] D. Hinds and M. Levitt, Proc. Natl. Acad. Sci. USA 89, 2536 (1992).
[9] D. Tobi, G. Shafran, N. Linial and R. Elber, Proteins: Struct. Funct. Genet. 40, 71 (2000).
[10] J. Skolnick, A. Kolinski and A. Ortiz, Proteins: Struct. Funct. Genet. 38, 3 (2000).
[11] I. Bahar and R. L. Jernigan, J. Mol. Biol. 266, 195 (1996).
[12] E. Huang, S. Subbiah and M. Levitt, J. Mol. Biol. 252, 709 (1995).
[13] E. Huang, S. Subbiah, J. Tsai and M. Levitt, J. Mol. Biol. 257, 716 (1996).
[14] L. Mirny and E. Shakhnovich, J. Mol. Biol. 264, 1164 (1996).
[15] B. Park and M. Levitt, J. Mol. Biol. 266, 831 (1997).
[16] D. Mohanty, B. N. Dominy, A. Kolinski, C. L. Brooks and J. Skolnick, Proteins: Struct. Funct. Genet. 35, 447 (1999).
[17] E. I. Shakhnovich, Phys. Rev. Lett. 72, 3907 (1994).
[18] J. Lee, S. Y. Kim and J. Lee, J. Korean Phys. Soc. 44, 594 (2004).
[19] J. Sim, S. Y. Kim, A. Yoo and J. Lee, J. Korean Phys. Soc. 44, 611 (2004).
[20] M. Heo, S. Kim, E. J. Moon, M. Cheon, K. Chung and I. Chang, J. Korean Phys. Soc. 44, 1571 (2004).
[21] M. Cheon, M. Heo, E. J. Moon, S. Kim, K. Chung, I. Chang and H. Kim, J. Korean Phys. Soc. 44, 550 (2004).