Proteins-Graph Theoretical Model Using Beta Turn

Integrated Intelligent Research (IIR)
Volume: 03, February 2014, Pages: 638-641

International Journal of Computing Algorithm

Proteins-Graph Theoretical Model Using Beta Turn Structure

D. Vijayalakshmi1, K. Srinivasa Rao1, K. Sivakumar2
1 Department of Mathematics, Sri Chandrasekharendra Saraswathi Viswa Mahavidyalaya University, Kanchipuram - 631 561, Tamilnadu.
2 Department of Chemistry, Sri Chandrasekharendra Saraswathi Viswa Mahavidyalaya University, Kanchipuram - 631 561, Tamilnadu.
Email: [email protected]

Abstract

The first step in the study of proteins is the analysis of structural similarity. This is done using various concepts, especially graph theoretical methods. In this paper, a protein, the macromolecule of biology, is converted into a graph using the beta turn secondary structure. This graph emphasizes the homologous method of studying proteins and is helpful in the study of similarity.

Keywords: Graph model, Secondary structure, Beta turns.

1. Introduction

1.1 Importance of protein similarity

There are sequence databases containing literally millions of biosequences. Comparing these sequences is a central task in bioinformatics; that is, the levels of structural similarity that exist between these proteins are analysed. This is done because (i) this study is the fundamental step towards understanding the techniques used by biological organisms in constructing stable and functional proteins and (ii) structures are closer to function than sequences are. In particular, the functional similarity of proteins can be predicted appropriately using structural similarity. Secondary structures are frequently used in this study as they are strongly related to the functions of proteins.

Several methods exist to compare protein structures and to measure the degree of structural similarity between them. These methods are based on scalar distance comparison (Holm and Sander, 1993), the difference of vector difference plots (Orengo and Taylor, 1996), and minimizing the soap bubble surface area between two protein backbones (Falicov and Cohen, 1996) [9]. The algorithms VAST [1] and MWBM [2] are used for protein structure representation. In general, the graph based model of a protein uses the concept of isomorphism to identify the similarity of proteins. Some graph theoretical methods for graph models of proteins are (i) the method using contact maps [3], (ii) the maximum weighted bipartite matching algorithm [4], where proteins are compared by a bipartite graph, (iii) the comparison of helices and sheets [5] using bipartite graph matching techniques, and some more of a similar type. These methods also study the similarity of proteins.

1.2 Homologous method

This method analyzes the similarity of amino acid sequences using alignment. Examples include ways of computing the distance between sequences, graphical representations, and mathematical descriptors abstracted from matrices based on a graphical representation. Some good examples are representing the amino acids on the periphery of a magic circle [7], analysis of protein sequences using the spatial median [8], and graphical alignment of proteins [9]. In the current work we study the similarity of proteins by converting a protein to a graph. The protein is transformed into a graph using the beta turn structure. The 3D coordinates of the carbon alpha atom of each amino acid are considered as a parameter.

2. Graph representation of protein

Using information gathered from the PDB file, the protein is converted into a graph using one secondary structure element, the beta turn. Each amino acid is identified by the 3D coordinates of its carbon alpha atom. Each beta turn is adopted as a vertex. Two vertices are connected if the Euclidean distance between them is minimal among the group. We use the protein with PDB ID 2k61 to illustrate our method. The details of the secondary structure of 2k61 are given in Figure 2.1.
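As an illustration of how the carbon alpha coordinates might be gathered from a PDB file, the following is a minimal sketch and not the procedure used by the authors. It assumes the standard fixed-column ATOM record layout of the PDB format. The function name `read_ca_coordinates` and the default chain label "A" are hypothetical choices made for this example. Only the first model is read, which is an assumption that matters for entries containing several models.

```python
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def read_ca_coordinates(pdb_text: str, chain: str = "A") -> Dict[int, Coord]:
    """Return {residue number: (x, y, z)} for the C-alpha atoms of one chain.

    Hypothetical helper: reads only the first model of the file.
    """
    coords: Dict[int, Coord] = {}
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # stop after the first model
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()  # columns 13-16: atom name
        chain_id = line[21:22]           # column 22: chain identifier
        if atom_name != "CA" or chain_id != chain:
            continue
        res_seq = int(line[22:26])       # columns 23-26: residue number
        x = float(line[30:38])           # columns 31-38: x coordinate
        y = float(line[38:46])           # columns 39-46: y coordinate
        z = float(line[46:54])           # columns 47-54: z coordinate
        # Keep the first record seen for a residue (first alternate location).
        coords.setdefault(res_seq, (x, y, z))
    return coords
```

The returned dictionary can then be sliced by residue number to collect the residues that make up each beta turn.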

Figure 2.1: Details of protein 2k61

The beta turns are the vertices, and the amino acids corresponding to these turns are given in Table 2.1.

Name   Turn             Sequence
b1     Asp20-Gly23      DKDG
b2     Arg37-Gly40      RSLG
b3     Asn53-Asp56      NEVD
b4     Asp56-Gly59      DADG
b5     Lys77-Asp80      KDTD
b6     Asp93-Gly96      DKDG
b7     Asp129-Gly132    DIDG

Table 2.1 Beta turns

The average of the 3D coordinates of the carbon alpha atoms of the amino acids constituting a beta turn is the centroid of that structure. The distances between the centroids define the edges between the vertices. For instance, we consider the beta turn b2; the list of distances is given in Table 2.2.

Turn1   Turn2   Distance
b6      b2      12.46429
b1      b2      17.84959
b5      b2      18.92782
b3      b2      21.26323
b4      b2      26.3482
b7      b2      26.63654

Table 2.2 Distance between b2 and remaining vertices.
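The centroid and distance definitions can be written out explicitly. The notation below is an assumption introduced for illustration: $\mathbf{r}_k$ denotes the carbon alpha position of residue $k$, and each beta turn is taken to span four consecutive residues $i, \dots, i+3$, as the entries of Table 2.1 suggest.

```latex
% Centroid of the beta turn b spanning residues i, ..., i+3
\mathbf{c}_b = \frac{1}{4} \sum_{k=i}^{i+3} \mathbf{r}_k

% Euclidean distance between the centroids of two beta turns b_p and b_q
d(b_p, b_q) = \lVert \mathbf{c}_{b_p} - \mathbf{c}_{b_q} \rVert
            = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2 + (z_p - z_q)^2}
```

Here $(x_p, y_p, z_p)$ are the components of $\mathbf{c}_{b_p}$. Since PDB coordinates are given in ångström, the distances in Table 2.2 are presumably in the same unit.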

The two smallest distances are taken into account, and edges are drawn between b2 and b6 and between b2 and b1. The graph for the protein is obtained by treating the remaining vertices in the same way. The graph for the protein 2k61 is given in Figure 2.2.
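The edge rule described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `centroid`, `nearest` and `build_graph` are hypothetical, edges are assumed to be undirected, and an edge is kept when either endpoint selects the other among its two nearest neighbours. The only data taken from the paper are the b2 distances of Table 2.2.

```python
import math
from typing import Dict, FrozenSet, List, Set, Tuple

Coord = Tuple[float, float, float]


def centroid(points: List[Coord]) -> Coord:
    """Average of the C-alpha coordinates of one beta turn."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def nearest(distances: Dict[str, float], k: int = 2) -> List[str]:
    """Labels of the k vertices with the smallest distance."""
    return sorted(distances, key=distances.get)[:k]


def build_graph(centroids: Dict[str, Coord], k: int = 2) -> Set[FrozenSet[str]]:
    """Connect every vertex to its k nearest centroids (undirected edges)."""
    edges: Set[FrozenSet[str]] = set()
    for v, cv in centroids.items():
        # Distances from v to every other vertex, as in Table 2.2 for b2.
        row = {u: math.dist(cv, cu) for u, cu in centroids.items() if u != v}
        for u in nearest(row, k):
            edges.add(frozenset((v, u)))
    return edges


# Distances from b2 as listed in Table 2.2.
b2_row = {
    "b6": 12.46429, "b1": 17.84959, "b5": 18.92782,
    "b3": 21.26323, "b4": 26.3482, "b7": 26.63654,
}
print(nearest(b2_row))  # ['b6', 'b1']
```

Because an edge is added whenever one of its endpoints selects the other, a vertex can end up with degree greater than two. Whether the original construction behaves this way is not stated in the text, so this is a design choice of the sketch.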

Figure 2.2 Graph model of 2k61

3. Similarity / Dissimilarity Analysis

We have applied our approach to segments of the proteins 2k61, 1CHQ and 1CHP. Each protein has seven beta turn structures, and the graphs obtained are shown in Figure 3.1 and Figure 3.2.

Figure 3.1 Graph model of 1CHQ

Figure 3.2 Graph model of 1CHP

The similarity or dissimilarity can be measured using isomorphism theory, which is outlined here. A graph isomorphism $f: G \to H$ is a pair of bijections $f_V: V_G \to V_H$ and $f_E: E_G \to E_H$ such that for every edge $e \in E_G$, the function $f_V$ maps the endpoints of $e$ to the endpoints of $f_E(e)$. It is not always easy to establish whether two graphs are isomorphic. For simple graphs we just need to check whether there is a bijection $f: V_1 \to V_2$ which preserves adjacency (i.e. $v_1$, $v_2$ are adjacent in graph 1 if and only if $f(v_1)$, $f(v_2)$ are adjacent in graph 2). If the graphs are not simple, we need more sophisticated methods to check whether two graphs are isomorphic. However, it is often simple to show that two graphs are not isomorphic. We can do this by showing that any one of the following seven conditions holds.

1. The two graphs have different numbers of vertices.
2. The two graphs have different numbers of edges.
3. One graph has parallel edges and the other does not.
4. One graph has a loop and the other does not.
5. One graph has a vertex of degree k (for example) and the other does not.
6. One graph is connected and the other is not.
7. One graph has a cycle and the other does not.
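For graphs as small as the ones considered here (seven vertices), the isomorphism test for simple graphs can be sketched by brute force. The code below is an illustrative sketch, not the procedure used by the authors, and the names `degree_sequence` and `find_isomorphism` are hypothetical. It first applies three of the quick rejection checks (vertex count, edge count, vertex degrees) and then tries every bijection between the vertex sets. Both graphs are assumed to be simple, so that equal edge counts together with forward preservation of adjacency are enough to conclude isomorphism.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Iterable, List, Optional, Sequence, Set, Tuple

Edge = Tuple[str, str]


def _edge_set(edges: Iterable[Edge]) -> Set[FrozenSet[str]]:
    """Undirected edges as a set, so (u, v) and (v, u) count once."""
    return {frozenset(e) for e in edges}


def degree_sequence(vertices: Sequence[str], edges: Iterable[Edge]) -> List[int]:
    """Sorted list of vertex degrees of a simple graph."""
    deg = {v: 0 for v in vertices}
    for e in _edge_set(edges):
        for v in e:
            deg[v] += 1
    return sorted(deg.values())


def find_isomorphism(
    v1: Sequence[str], e1: Iterable[Edge],
    v2: Sequence[str], e2: Iterable[Edge],
) -> Optional[Dict[str, str]]:
    """Return an adjacency-preserving bijection v1 -> v2, or None."""
    s1, s2 = _edge_set(e1), _edge_set(e2)
    # Quick rejections: vertex count, edge count, degrees.
    if len(v1) != len(v2) or len(s1) != len(s2):
        return None
    if degree_sequence(v1, s1) != degree_sequence(v2, s2):
        return None
    # Exhaustive search: feasible only for small graphs (7! = 5040 maps).
    for perm in permutations(v2):
        f = dict(zip(v1, perm))
        if all(frozenset(f[v] for v in e) in s2 for e in s1):
            return f
    return None


# Two drawings of a path on three vertices: the middle vertex maps to "z".
mapping = find_isomorphism(
    ["a", "b", "c"], [("a", "b"), ("b", "c")],
    ["x", "y", "z"], [("x", "z"), ("z", "y")],
)
print(mapping is not None and mapping["b"] == "z")  # True
```

The quick checks are only necessary conditions: a six-cycle and two disjoint triangles agree in vertex count, edge count and degrees, yet the exhaustive search correctly reports no isomorphism, in line with the connectivity condition in the list above.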

4. Result and Discussion

Firstly, this approach is applied to three proteins. The graph models of the proteins are given in Figure 2.2, Figure 3.1 and Figure 3.2. The basic idea behind our method is "If the graphs of the proteins are isomorphic then the proteins can be similar". An adjacency-preserving bijection can readily be defined from the vertex set of the 1CHQ graph to the vertex set of the 1CHP graph. This shows that the graphs are isomorphic and hence the proteins are similar. On the other hand, the 2k61 protein graph is not isomorphic to the other two, as it satisfies conditions from the non-isomorphism list in the previous section. This implies that the 2k61 protein is dissimilar to the 1CHQ and 1CHP proteins. The obtained result was then compared with the BLAST two-sequence comparison software, and it agrees with the BLAST results.

Conclusion

In this paper, a novel method is introduced for the similarity/dissimilarity study of proteins, which has an important advantage over the existing methods in being very compact. Moreover, we constructed the graph for a protein in four different ways, and the graphs obtained are similar. This method can also be described as a sequence order independent method, and it makes the 3D comparison of proteins easy using the sequence of alpha carbon atoms of residues or secondary structure elements. This procedure avoids the need for precise knowledge about the protein.

References

[1] Gibrat, J.F., Madej, T., and Bryant, S.H., Surprising similarities in structure comparison, Curr. Opin. Struct. Biol., 6, 1996, 377-385.
[2] Wang, Y., Makedon, F., Ford, J., A Bipartite Graph Matching Framework for Finding Correspondences between Structural Elements in Two Proteins, in Proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Francisco, California, 2004, 2972-2975.
[3] Holm, L., Sander, C., Protein structure comparison by alignment of distance matrices, J. Mol. Biol., 233(1), 1993, 123-38 (PMID 8377180).
[4] Shindyalov, I.N., Bourne, P.E., Protein structure alignment by incremental combinatorial extension (CE) of the optimal path, Protein Engineering, 11(9), 1998, 739-747.
[5] Taylor, W., Protein Structure Comparison Using Bipartite Graph Matching and Its Application to Protein Structure Classification, Molecular & Cellular Proteomics, 1, 2002, 334-339.
[6] Milan Randic, Butina, D., Zupan, J., Novel 2D graphical representation of proteins, Chem. Phys. Lett., 419, 2006, 528.
[7] Mervat M. Abo-Elkhier, Similarity/dissimilarity analysis of protein sequences using the spatial median as a descriptor, J. Biophysical Chemistry, Vol. 3, No. 2, 2012, 142-148.
[8] Milan Randic, On a geometry-based approach to protein sequence alignment, J. Mathematical Chemistry, Vol. 43, No. 2, 2008.
[9] Amit P. Singh and Douglas L. Brutlag, Hierarchical protein structure superposition using both secondary structure and atomic representation, Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, 1997, 284-293.
