A Collaborative Environment for Protein Visualization

Tolga Can

Yujun Wang

Yuan Fang Wang

Jianwen Su

Department of Computer Science, University of California at Santa Barbara, Santa Barbara, CA 93106, U.S.A.

Abstract We have developed a framework for distributed visualization and collaboration to help biochemistry and genomics researchers understand the function and nature of proteins. Our visualization system allows users to examine different visual representations of a molecule and to share views among geographically distributed researchers. It provides the types of visual representations found in many of today’s protein visualization systems: backbone, space-fill, ball-stick, and ribbon models. Furthermore, we provide an interactive tree representation, for both primary and secondary structures, which helps users understand the underlying hierarchy. Our collaboration system provides a variety of mechanisms for annotating parts of interest. Annotation of the 3D representations of protein molecules is provided through a concept called sticky notes. These notes, which we believe are an invaluable tool for sharing ideas, can be associated with a whole protein molecule or with some part of the molecule in the molecular hierarchy. The notes in our system can convey multimedia information (e.g. audio, video) as well as textual information. Furthermore, their capabilities go beyond containing information to performing actions and monitoring the environment.

1 Introduction

Protein visualization has become an important research topic, especially in light of the completion of the Human Genome Project [1]. Proteins are made up of amino acids linearly chained together. However, they do not stay in a linear structure after they are created; they fold into various shapes. The problem of determining how a particular amino acid chain will fold is called the protein folding problem [18]. Finding the resulting 3D shape theoretically is considered one of the hardest problems in bio-informatics. Another way to find the 3D structures of proteins is to do so experimentally, using methods such as X-ray crystallography [20]. This method gives the relative 3D coordinates of the atoms of a single protein molecule. Whether protein structures are found theoretically or experimentally, this data should be examined with the aid of visualization software in order to understand protein function and nature. This is because the 3D structures of proteins are important for their interaction with other molecules. For example, hemoglobin, being in a cup shape, has the ability to carry oxygen molecules in the blood stream. There are many well-established ways of visualizing 3D protein structures. Each way of visualization highlights a different aspect of the protein structure, as mentioned by Clay Shirky [2].

Collaborative software environments make it possible for a group of people to work on the same task simultaneously by providing synchronized views of, and controlled access to, the common objects (see [3]). Such systems can overcome the barriers of physical distribution and even time separation of the users. A number of software systems have been developed for collaborative 3D modeling [4, 5, 6]. Clearly, collaborative 3D modeling systems specially tailored for protein visualization can enhance access to protein data and promote communication and collaboration among distributed researchers. The goal of our research is thus to provide such a distributed visualization environment. The novelty of our system, though, lies in its expanded functionality for distribution and collaboration. In short, distributed protein visualization and collaboration enables multiple geographically distributed users to visualize and annotate a shared 3D view of a protein in an interactive manner. This gives the users the ability to communicate ideas and cooperate us-

to view several types of annotations attached to a 3D model, since it uses the MIME (Multipurpose Internet Mail Extensions) standard for annotations, which is a framework for the interchange of a variety of multimedia content among different viewers. IRIS Annotator [7] is a tool for asynchronous collaboration via annotation of 3D models. It is a multimedia application designed for sharing ideas and information about 3D models. Its drawback is that it is not a distributed tool capable of synchronous collaboration. IRIS Annotator lets users annotate 3D models and then e-mail the annotated models to others for review.

On the bio-informatics side, many tools have also been developed to visualize proteins whose structures are known. One of the earliest of those tools is Roger Sayle’s RasMol [9]. RasMol is now being developed under the name of Protein Explorer. SwissPdbViewer [12], which is tightly linked to the automated protein modeling server Swiss-Model, provides a user-friendly interface to analyze several proteins at the same time. VMD [15] is a molecular graphics program designed for the display and analysis of molecular assemblies, in particular biopolymers such as proteins and nucleic acids. VMD can simultaneously display protein structures using a variety of rendering styles and coloring methods. MOLMOL [10] is yet another molecular graphics program for the display, analysis, and manipulation of the 3D structures of biological macromolecules, with special emphasis on nuclear magnetic resonance (NMR) solution structures of proteins and nucleic acids. There are a few protein visualization tools developed using Java. MOLVIE [11], a molecular visualization environment, is one of them. WebMol [13] is a protein structure viewing and analysis program, which has more functionality but limited 3D model types. JMVS [14] is another visualization program developed using the Java3D API.
The tools described above are stand-alone applications focusing on visualizing a wide range of proteins. Generally, they do not address collaboration. MICE [16], a collaborative interactive molecular visualization environment, is quite similar to our work. Its authors propose a scene interchange format, Molecular Scene Description Language (MSDL), for describing the appearance of a molecular structure, which encapsulates the methods and options re-

ing interactive 3D models of proteins. In our system, the structure information is obtained in the form of a Protein Data Bank (PDB) file [19]. The system was built using the Java3D API [22]. It enables users on different platforms and operating systems to join a collaboration session. The users in such a collaboration session can view the same protein molecule synchronously and share ideas by annotating sub-structures of the molecule being examined. We have developed an extensive annotation mechanism, which enables users to attach not only textual notes but also multimedia content (e.g. audio, video) and actions to parts of the protein molecules.

The remainder of this paper is organized as follows. The architecture of the distributed framework is introduced in Section 2. We explain how different 3D representations are visualized using PDB data in Section 3. In particular, in Subsection 3.3 we provide the details of the accompanying textual views, namely the Molecule Information Window and the Tree View Window. The tools provided for collaboration are explained in Section 4. Implementation details are presented in Section 5. Finally, we summarize our work and discuss future research directions in Section 6.

Related Work

There has been a lot of research on collaborative visualization. Most of it has not specifically targeted the area of protein visualization, but it could be applicable to this area. CEV [4], a collaborative environment for visualization, is one such study. The problems it addresses are mainly about efficiency. By separating the 2D and 3D components of a visualization task between the servers and the clients, and shifting the more processing-intensive tasks to high-powered servers, its authors make it possible for clients with low processing power to participate in visualization. CollIde [5] is a distributed computer-aided design (CAD) environment in which users create and view 3D models collaboratively.
It is developed as a plug-in to an existing stand-alone 3D modeling tool. Shastra [6] is a collaborative modeling and visualization tool, which allows the design, simulation, and prototyping of synthetic environments. I3D [8] is another example of a collaborative visualization application, which provides exploration of annotated 3D environments. The users of I3D are able

from a client, and broadcasts the view changes to all other clients. Hence, the session server acts as the centralized coordinator between the clients and as a message-forwarding agent. A proxy acts as a message-forwarding agent between a client and the session server. As the session server reports view updates, it sends notification messages to all of the clients via their proxies. Each proxy maintains a reliable unicast connection with the session server and with its client. In this way, messages can be sent reliably between the clients and the session server. Besides forwarding messages between the session server and the clients, the proxy also broadcasts messages over a multicast group. Thus the proxy can forward messages from the client either reliably, using reliable unicast, or in a more scalable manner, using multicast communication.

The need for both a reliable TCP/IP connection and an unreliable (best-effort) multicast communication channel among the clients and the server is due to the different reliability requirements of the messages. Messages such as requests to join and leave a collaboration session, to create and delete objects, or to obtain an exclusive write lock on an object are session critical (message loss in such cases can leave clients in inconsistent states). Hence, these messages, which are relatively few, are routed through the reliable TCP/IP channels. In collaborative 3D modeling and visualization, a user often wants to show other users a particular aspect of a design or an animation (e.g., a fly-through or a walk-through). An activity like this generates a large number of messages from a client to continuously update the pose and location of the viewer and/or the objects. These messages, which are many, can easily inundate a slow communication channel. However, they do tolerate a certain degree of loss (e.g., an animation will become jerky if a large number of fly-through poses are lost but is otherwise viewable).
Hence, these types of messages are sent via best-effort multicast channels. The system can operate in one of two modes: peer-to-peer and master-slave. The former allows multiple users to collaboratively visualize a 3D model. In this mode, an individual user who would like to edit a particular component of the shared model must first request an exclusive lock on that component from the server. The lock is released

quired to visualize a protein molecule. It promises to bridge the gap between the VRML format, intended for visualization, and the PDB format, intended for recording domain knowledge. However, it provides a simpler annotation mechanism than our system does. The main difference between our collaborative visualization system and the related work surveyed above lies in our distributed visualization model and the powerful annotation mechanism we provide. The users of our system are able to create a variety of annotations, which turn PDB files into active information sources. An annotation can contain either passive notes or active actions, say, to perform a particular folding algorithm on the molecule or to automatically invoke an appropriate display model. In summary, we focus on distributed visualization and collaboration targeted at protein visualization.

2 System Architecture

Our system comprises three major modules: the clients, the proxies, and the session server. A client is the 3D protein visualization application written in Java. This client can be downloaded via a Web server. Each client establishes a TCP/IP connection to a unique proxy. The proxy runs on the machine from which the application is downloaded. Each proxy establishes a TCP/IP connection to the session server as well as to a multicast group. The server accepts TCP/IP connections from each proxy and listens to the multicast group for messages broadcast between the proxies. The session server provides coordination between the clients through their proxies.

Figure 1: The system architecture.

Figure 1 shows the system architecture with two clients. The session server coordinates the message passing between the clients. It supports a locking protocol for ensuring mutually exclusive access to each primitive in the shared 3D model of the protein molecule. It also handles view update requests

about how amino acids are connected to each other, i.e. how the chain is formed. Below we describe the different 3D models provided by our visualization system, and explain their use and the way they are built.

3.2.1 Backbone Model

The backbone model is created by using the alpha carbon, carbon, and nitrogen atoms in the molecule. The positions of the atoms are used to place the spheres that represent them. The bonds within each amino acid and the peptide bonds (between amino acids) are also shown in the model. This model is useful for understanding the protein molecule as a chain and for seeing the positions of the amino acids in this chain.
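As an illustration of how the backbone atoms might be extracted, the following is a minimal sketch, not the system's actual code. It assumes the standard fixed-column layout of PDB ATOM records and keeps only atoms named N, CA, and C. The class `PdbBackbone`, its methods, and the demo helper `atomLine` are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Sketch only: all names here are hypothetical, not taken from the paper. */
class PdbBackbone {
    /** One atom read from a PDB ATOM record. */
    static final class Atom {
        final String name;     // atom name, e.g. "N", "CA", "C"
        final String residue;  // three-letter residue name, e.g. "GLY"
        final char chain;
        final int residueSeq;
        final double x, y, z;

        Atom(String name, String residue, char chain, int residueSeq,
             double x, double y, double z) {
            this.name = name;
            this.residue = residue;
            this.chain = chain;
            this.residueSeq = residueSeq;
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Parses one fixed-column ATOM record; returns null for any other line. */
    static Atom parseAtom(String line) {
        if (line.length() < 54 || !line.startsWith("ATOM  ")) {
            return null;
        }
        String name = line.substring(12, 16).trim();       // columns 13-16
        String residue = line.substring(17, 20).trim();    // columns 18-20
        char chain = line.charAt(21);                      // column 22
        int residueSeq = Integer.parseInt(line.substring(22, 26).trim());
        double x = Double.parseDouble(line.substring(30, 38).trim());
        double y = Double.parseDouble(line.substring(38, 46).trim());
        double z = Double.parseDouble(line.substring(46, 54).trim());
        return new Atom(name, residue, chain, residueSeq, x, y, z);
    }

    /** Backbone atoms as described in the text: nitrogen, alpha carbon, carbon. */
    static boolean isBackbone(Atom atom) {
        return atom.name.equals("N") || atom.name.equals("CA") || atom.name.equals("C");
    }

    /** Keeps only the backbone atoms found in the given PDB lines. */
    static List<Atom> backbone(List<String> lines) {
        List<Atom> result = new ArrayList<>();
        for (String line : lines) {
            Atom atom = parseAtom(line);
            if (atom != null && isBackbone(atom)) {
                result.add(atom);
            }
        }
        return result;
    }

    /** Demo helper: writes an ATOM record using the same fixed columns. */
    static String atomLine(int serial, String name4, String residue, char chain,
                           int residueSeq, double x, double y, double z) {
        return String.format(Locale.ROOT,
                "ATOM  %5d %-4s %3s %c%4d    %8.3f%8.3f%8.3f",
                serial, name4, residue, chain, residueSeq, x, y, z);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                atomLine(1, " N  ", "GLY", 'A', 1, -1.234, 5.678, 9.012),
                atomLine(2, " CA ", "GLY", 'A', 1, 0.5, 1.5, 2.5),
                atomLine(3, " O  ", "GLY", 'A', 1, 3.0, 4.0, 5.0));
        for (Atom atom : backbone(lines)) {
            System.out.println(atom.residue + atom.residueSeq + " " + atom.name
                    + " (" + atom.x + ", " + atom.y + ", " + atom.z + ")");
        }
    }
}
```

Each retained atom's coordinates would then give the position of one sphere in the backbone model; bond construction is left out of this sketch.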

after the operation is completed. In the master-slave mode, only a single master user is allowed to change the model; this mode is primarily intended for class teaching and remote learning applications.
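The split between reliable and best-effort delivery described in this section can be sketched as a simple routing decision. The example below is a minimal sketch under stated assumptions: the message types and the names `MessageRouter`, `MessageType`, and `Channel` are hypothetical, and no actual networking is performed.

```java
import java.util.EnumSet;
import java.util.Set;

/** Sketch only: chooses a delivery channel per message type (names are hypothetical). */
class MessageRouter {
    enum MessageType { JOIN, LEAVE, CREATE_OBJECT, DELETE_OBJECT, LOCK_REQUEST, VIEW_UPDATE }

    enum Channel { RELIABLE_UNICAST, BEST_EFFORT_MULTICAST }

    // Session-critical messages: losing one could leave clients in inconsistent states.
    private static final Set<MessageType> SESSION_CRITICAL = EnumSet.of(
            MessageType.JOIN, MessageType.LEAVE, MessageType.CREATE_OBJECT,
            MessageType.DELETE_OBJECT, MessageType.LOCK_REQUEST);

    /** Few, critical messages go over TCP/IP; frequent, loss-tolerant ones over multicast. */
    static Channel channelFor(MessageType type) {
        return SESSION_CRITICAL.contains(type)
                ? Channel.RELIABLE_UNICAST
                : Channel.BEST_EFFORT_MULTICAST;
    }

    public static void main(String[] args) {
        for (MessageType type : MessageType.values()) {
            System.out.println(type + " -> " + channelFor(type));
        }
    }
}
```

In a real proxy, the returned channel would select between a TCP socket and a multicast socket; that wiring is omitted here.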

3 Visualization

In this section, we briefly discuss the protein data currently used in our system and describe four different 3D models for the data. We also present two methods of viewing the hierarchical organization of the protein data.

3.1 Data

PDB files are obtained from the Protein Data Bank (PDB) [19], which is an archive of experimentally determined 3D structures of biological macromolecules, serving a global community of researchers, educators, and students. PDB files contain atomic coordinates, bibliographic citations, primary and secondary structure information, and X-ray and NMR experimental data. PDB files are divided into a number of sections, and each section contains different records. Our system uses some of these sections. The HEADER record in the Title section is used to obtain the molecule name. The Secondary Structure section describes helices, sheets, and turns in a protein, but this information may be missing in many PDB files. We implemented the Kabsch-Sander algorithm [21] to compute the secondary structure information when it is missing from a PDB file. The Coordinate section contains the atomic coordinates, which are key to our visualization system. The ATOM records in this section present the coordinates of the atoms making up the protein molecule.

We provide both 3D and 2D conceptualizations of the protein molecule being examined. Each emphasizes a different property of the protein molecule. Transition from one 3D representation to another is accomplished using a distance-based switching mechanism. The user is also able to select any particular 3D representation at any time, regardless of the distance between the view point and the 3D object.

3.2 3D Representations

Each representation of a protein molecule highlights a different aspect of the structure. They have advantages and disadvantages compared to each other.
For example, the space-fill model can be helpful in understanding the volume a protein molecule occupies, but it lacks information

Figure 2: Backbone model.

Figure 2 shows the backbone model of the protein molecule Acetylcholine Receptor (PDB code: 1A11). When we interact with the 3D model of the backbone of a molecule, we can easily see how the amino acid sequence forms a twisted chain (helix) shape. We used a different color for each amino acid to distinguish them in the molecule. The green orthogonal lines are the x, y, and z axes (the z axis pointing towards the viewer), which are helpful for positioning oneself in the 3D display. The gray sphere at the intersection of those axes indicates the origin.

3.2.2 Balls and Sticks Model

The balls & sticks model shows all of the existing bonds in the molecule as sticks and all the atoms as spheres.

Figure 3: Balls and Sticks model.

Figure 4: Space-fill model.

using the program called Molecular Scene Generator [17] in the form of a VRML file. It is loaded as a Java3D scene by using a VRML loader for Java3D [23]. Figure 5 shows the ribbon model of the same molecule, Acetylcholine Receptor. Here, different colors are used for different secondary structures.

3.3 Textual Information Windows

Having a textual representation of the protein molecule has many benefits. First of all, it shows the linearity of the protein structure. The names of the amino acids forming the chain are provided in a sequence view. Furthermore, the underlying hierarchy of the molecule can be captured when a tree view is used.

3.3.1 Molecule Information Window

The molecule information window contains the molecule’s name, the number of amino acids it contains, the amino acid chain, the secondary structure information, and information about the currently selected sub-structure. The amino acid chain is displayed using one-letter representations of the amino acids. The molecule name is read from the PDB file. Although it is also possible to read secondary structure information from the PDB file, most of the available PDB files do not contain it, so the secondary structure information is calculated using the prediction algorithm developed by Kabsch and Sander [21]. The information about the secondary structure

Figure 3 shows the balls & sticks model of the same molecule, Acetylcholine Receptor (PDB code: 1A11). Again, we used different colors for different amino acids to distinguish them in the molecule.

3.2.3 Space-fill Model

The space-fill model is useful in visualizing the volume a protein molecule occupies. It gives an overall view of the molecule and thus provides a good view of the tertiary structure. In this model each atom is modeled using its van der Waals radius, so that the viewer gets an idea of the relative sizes of the atoms making up the protein molecule. The atoms are represented by solid spheres centered at the corresponding atomic coordinates read from the PDB file. Figure 4 shows a space-fill model of the same molecule, Acetylcholine Receptor (PDB code: 1A11). We used different colors for atoms to distinguish them in the molecule.

3.2.4 Ribbon Model

The ribbon model is used to display the secondary structures in the protein molecule. The secondary structure is predicted from the atomic coordinates in the PDB file by using the algorithm developed by Kabsch and Sander [21]. In our current version, this secondary structure information is used only in the information window. The ribbon model is acquired by

protein molecules. A protein molecule is composed of one or more chains of amino acids. A chain may contain many amino acids, typically on the order of hundreds. Each amino acid has an eight-atom body and a side chain, i.e. residue, which may be made up of 1 to 18 atoms. We provide a tree view window that visualizes this hierarchical structure of a protein molecule.

Figure 7: Tree view of the primary structure.

Figure 5: Ribbon model.

Figure 7 shows the tree view window. In this snapshot the molecule has a very simple hierarchy, since it contains only one chain, but the view is still useful for understanding the linearity of the protein molecule. We provide two-way interaction between the tree view and the 3D view. The user can interact with the tree by selecting its nodes; the corresponding sub-structure is then highlighted in the 3D model. Conversely, when a selection is made on the 3D model, the corresponding tree node is highlighted.
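The molecule, chain, amino acid, and atom hierarchy behind the tree view can be sketched with a small tree structure. This is a minimal sketch, not the system's implementation: the class `ProteinTree`, the node labels, and the label-based lookup are assumptions made for illustration. The `path` method shows one way a selection in the 3D view could be mapped back to a tree row.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a plain tree mirroring molecule > chain > amino acid > atom. */
class ProteinTree {
    static final class Node {
        final String label;
        final Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node parent) {
            this.label = label;
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);
            }
        }

        /** Path from the root to this node, e.g. to locate the matching tree row. */
        String path() {
            return parent == null ? label : parent.path() + " > " + label;
        }
    }

    /** Depth-first search by label; returns null if no node matches. */
    static Node find(Node root, String label) {
        if (root.label.equals(label)) {
            return root;
        }
        for (Node child : root.children) {
            Node hit = find(child, label);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node molecule = new Node("1A11", null);   // labels are illustrative only
        Node chain = new Node("Chain A", molecule);
        Node gly = new Node("GLY 1", chain);
        new Node("N", gly);
        new Node("CA", gly);
        new Node("C", gly);

        // A pick on the alpha carbon in the 3D view maps to this tree path.
        System.out.println(find(molecule, "CA").path());
    }
}
```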

is also displayed using one-letter codes aligned with the amino acid codes (S: Sheet, H: Helix, T: Turn). When the user makes selections on the molecule while interacting with the 3D model, the corresponding part of the amino acid chain in the information window is highlighted. If the selection is at the level of atoms, information about the selected atom is also displayed in the information window.
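The one-letter sequence display can be illustrated with a short sketch. It assumes the standard three-letter to one-letter mapping for the 20 common amino acids; the class name `SequenceView`, the use of 'X' for unrecognized residues, and the sample secondary-structure string are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch only: converts residue names to the one-letter sequence line. */
class SequenceView {
    private static final Map<String, Character> ONE_LETTER = new HashMap<>();

    static {
        String[][] table = {
            {"ALA", "A"}, {"ARG", "R"}, {"ASN", "N"}, {"ASP", "D"}, {"CYS", "C"},
            {"GLN", "Q"}, {"GLU", "E"}, {"GLY", "G"}, {"HIS", "H"}, {"ILE", "I"},
            {"LEU", "L"}, {"LYS", "K"}, {"MET", "M"}, {"PHE", "F"}, {"PRO", "P"},
            {"SER", "S"}, {"THR", "T"}, {"TRP", "W"}, {"TYR", "Y"}, {"VAL", "V"}
        };
        for (String[] row : table) {
            ONE_LETTER.put(row[0], row[1].charAt(0));
        }
    }

    /** Builds the sequence string; unknown residue names become 'X' (an assumption). */
    static String toOneLetter(List<String> residueNames) {
        StringBuilder sequence = new StringBuilder();
        for (String name : residueNames) {
            Character code = ONE_LETTER.get(name.toUpperCase(Locale.ROOT));
            sequence.append(code == null ? 'X' : code);
        }
        return sequence.toString();
    }

    public static void main(String[] args) {
        List<String> residues = Arrays.asList("GLY", "SER", "VAL", "LEU", "ALA");
        String secondary = "TTHHH";  // illustrative S/H/T codes, one per residue
        System.out.println(toOneLetter(residues));
        System.out.println(secondary);  // printed underneath so the codes line up
    }
}
```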

Figure 6: Molecule Information Window.

Figure 6 shows the molecule information window during interaction with the Serine Protease Inhibitor (PDB ID: 1AML) protein. The currently selected amino acid is Valine, whose one-letter code is V, and it is the 12th amino acid in the first (and only) chain of the protein molecule. We see in the secondary structure information that this amino acid is part of a turn, and that the currently selected atom is carbon.

3.3.2 Tree View

Although a protein is a linear structure of amino acids, there is a hierarchy in the primary structure of

Figure 8: Tree view of the secondary structure.

The tree view window has another mode, which is for secondary structure information (Figure 8). The chain level in the primary structure hierarchy is replaced with the secondary structure information in this mode. Again, one-letter codes are used for dis-

they can be created and viewed in a distributed visualization process, and can be stored and sent to other people via e-mail. Our sticky notes system has the following functionalities: (1) a note can be used as an information container; (2) a note may contain an action (a piece of program) that can be executed by the user on demand; (3) a note may monitor the environment and automatically execute certain operations under user-specified conditions. In more detail, a note may contain text or multimedia information. Furthermore, action notes store operations that can be invoked by the user. Action notes can remember repetitive tasks and execute them on the user’s command. Active notes, on the other hand, provide a mechanism to store operations along with the conditions under which to invoke them. The conditions are monitored automatically by the active note without the user’s intervention, and can be external events or a particular configuration in the 3D protein model.

play of the secondary structures. This view is a helpful tool for making an association between the primary structure and the secondary structure.

3.4 Interaction

The user interacts with the molecule using the mouse. The user can manipulate and examine the molecule’s structure from any angle. This is a natural way to examine the structure. The ability to rotate, zoom, and pan the scene allows an observer to easily understand the 3D structure of the protein. Parts of the molecule can be selected from both the 3D view and the Tree View, and the appropriate information is shown in the Molecule Information Window. The user can annotate parts of the molecule, amino acids for example, by using the notes system to draw attention to that part or to share an idea about it. Since the 3D representation of the molecule is hierarchical, users can interact with the molecule at different granularities, by selecting atoms, groups, or chains.

4 Collaboration

We provide synchronous collaboration in the form of sharing views and annotating different parts of the molecule. Annotation also provides a mechanism for asynchronous collaboration, since annotations can be stored with the protein molecule and examined later. We explain the mechanisms for collaboration in the following sections.

4.1 Sharing the View

Several clients in distant locations are able to join a collaboration session and view a particular protein molecule synchronously. As mentioned before, the system can operate in one of two modes: peer-to-peer and master-slave. Peer-to-peer mode gives each user the ability to change the common view of the 3D model. In this mode, an individual user who would like to examine or annotate a particular component of the shared protein model must first request an exclusive lock on that component from the server. In the master-slave mode, only a single master user is allowed to change the common view of the protein model. However, other clients are still able to change their local views. This mode is primarily intended for class teaching and remote learning applications.

4.2 Sticky Notes

Sticky notes is an annotation system that provides a mechanism for collaboration. The collaboration is both synchronous and asynchronous, since

Figure 9: Two clients in a session.

In our system, sticky notes are globally visible in a collaboration session and can be updated or deleted by distributed clients, thus constituting a good communication tool. They can also serve as a reminder or a method of documentation for the protein molecule. The notes can be used to draw attention to a particular sub-structure in the molecule. A snapshot of a collaboration session with two clients is shown in Figure 9. The clients can annotate the protein molecules with sticky notes by selecting parts of the model. The notes are visible to both clients as icons attached to the 3D components. Ownership of atoms is indicated by different col-

cause people sometimes prefer saying things instead of writing them down.

4.2.3 Action Notes

Text and audio sticky notes contain data entered by the user, whereas an action note is a more advanced type of sticky note. A novel feature of an action sticky note is that it can also store programs (or actions to be performed). The actions can be easily activated on demand at a later time. This is convenient for executing certain repetitive tasks. In the current design, we provide several predefined actions. An example of a predefined action is to search for and display the summary information of a specified protein molecule. The action takes the PDB ID of the molecule as its input parameter, searches the Protein Data Bank web site, and displays the summary information in the extension panel, as shown in Figure 11.

ors. Below, we explain in detail the different types of notes that we have developed.

4.2.1 Text Notes

Text sticky notes allow the user to write his/her ideas in textual form. This information is saved in the sticky note together with other useful information, such as author, subject, and date, and is shown as a billboard icon attached to the atom. The graphical user interface for creating a text note is shown in Figure 10.

Figure 10: The GUI for creating a note.

The user interface has the following three panels: a text input panel on the left, used to fill in the subject, author, and message fields and to select an icon for display; a preview panel in the middle, which includes a graphical representation of the note and the 3D object it is attached to, showing how the note will be seen with the model; and a command panel on the right, which contains the control buttons (e.g., save, preview). An interesting issue in visualizing sticky notes in a 3D virtual environment is that they can be viewed from many different perspectives. In order to keep them legible at all times during interaction, we have designed the sticky notes as icons of an appropriate size that are always oriented towards the viewer. Text notes can be useful when the user wants to add information about the protein molecule beyond what is included in the PDB file. An example is information about the details of the protein exploration methods. These notes can also be used as chat boards during a collaboration session.

4.2.2 Audio Notes

The second type of sticky note in our system allows multimedia information to be stored. Currently, we have an audio note implementation; extending it to handle video data should be straightforward. Audio notes allow the user to record an audio message and attach it to a particular component. Other users can play the note and add their own messages to it. This is also a good way for collaboration be-

Figure 11: A predefined action for searching information on the PDB site.

Another useful predefined action note is a “best view” action note. This note, when activated, rotates the 3D model to a predefined pose, which best elucidates a certain property of the protein molecule. This action may be helpful because viewing a molecule from a certain direction may make some protein property explicit that would be hard to notice when the molecule is viewed from a different direction. Consider the Acetylcholine Receptor (PDB ID: 1A11) molecule in Figure 12. The distinguishing part of this molecule is the main helix structure. The helix is apparent only when the molecule is viewed from the side. With the best view note, the user can activate the note and be directed to a desired

a protein molecule has exceeded a certain threshold. This functionality may prove useful especially during a collaborative protein modeling session, in which the structure of a protein is being investigated by a group of users. Since the “modeling” functionality is among future work, the user-defined condition explained above is not yet implemented. However, we have implemented an example active note, which monitors the distance between the view point and the center of the protein molecule. Figure 13 shows such an active note, which updates the displayed subject of the note as the distance changes. The action for this note is defined as changing the 3D model from the balls-sticks model (left) to the backbone model (right). The action is triggered when the view point becomes far enough away, i.e. more than 90 units. The snapshot on the left shows the active note when the viewer is 10 units away from the protein. The one on the right shows the same molecule, this time viewed from a distance of 91 units, thus activating the note to switch the 3D model to the backbone model.
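The example active note above pairs a monitored condition with an action. A minimal sketch of that idea follows; it is not the system's code. The names `ActiveNoteDemo`, `Viewer`, `ActiveNote`, and `farSwitchNote` are hypothetical, and the viewer state is reduced to a single field so the sketch stays independent of Java3D. The 90-unit threshold and the 10 and 91 unit distances are taken from the text.

```java
import java.util.Locale;
import java.util.function.DoublePredicate;

/** Sketch only: an active note as a (condition, action) pair driven by view distance. */
class ActiveNoteDemo {
    enum DisplayModel { BALLS_AND_STICKS, BACKBONE }

    /** Stand-in for the viewer state that a note's action may change. */
    static final class Viewer {
        DisplayModel model = DisplayModel.BALLS_AND_STICKS;
    }

    static final class ActiveNote {
        private final DoublePredicate condition;
        private final Runnable action;
        private String subject = "";

        ActiveNote(DoublePredicate condition, Runnable action) {
            this.condition = condition;
            this.action = action;
        }

        /** Called whenever the view point moves; updates the subject and checks the condition. */
        void onDistanceChanged(double distance) {
            subject = String.format(Locale.ROOT, "Distance: %.1f units", distance);
            if (condition.test(distance)) {
                action.run();
            }
        }

        String subject() {
            return subject;
        }
    }

    /** Builds a note that switches to the backbone model beyond the given distance. */
    static ActiveNote farSwitchNote(Viewer viewer, double threshold) {
        return new ActiveNote(d -> d > threshold,
                () -> viewer.model = DisplayModel.BACKBONE);
    }

    /** Euclidean distance between the view point and the molecule center. */
    static double distance(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        double dz = a[2] - b[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    public static void main(String[] args) {
        Viewer viewer = new Viewer();
        ActiveNote note = farSwitchNote(viewer, 90.0);
        double[] center = {0.0, 0.0, 0.0};
        double[][] viewPoints = {{0.0, 0.0, 10.0}, {0.0, 0.0, 91.0}};
        for (double[] eye : viewPoints) {
            note.onDistanceChanged(distance(eye, center));
            System.out.println(note.subject() + " -> " + viewer.model);
        }
    }
}
```

This sketch only switches in one direction, as in the example described; a full level-of-detail note would presumably also switch back when the viewer approaches again.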

view easily. Figure 12 shows two different backbone views of the same molecule, Acetylcholine Receptor. The one on the right reveals the helix structure better.

Figure 12: “Best View” Action.

Other useful predefined actions include querying a certain Internet archive for relevant information and models, popping up a window for the user to input needed information, generating another note, invoking an external application, etc. We note here that user-defined action behavior of the kind provided in our system was not extensively studied in previous annotation systems.

4.2.4 Active Notes

An active sticky note is similar to an action note in that it stores an action to be performed. However, in an active note, a condition can be specified which triggers the action. The action is invoked automatically when the specified condition becomes true. To accomplish this, an active note monitors the environment for the specified condition. One type of active note we developed in our system is the automated level-of-detail display option. Such a level-of-detail active note automatically computes the distance between the camera and the 3D object with which the note is associated. During a walk- or fly-through scenario, where the distance to the object changes continuously, the note monitors the change in distance and alerts the underlying modeling system to select an appropriate display model. For example, the user can create an active note to change the protein model from space-fill to balls-sticks once it is within a certain distance from the camera. Other conditions for active notes can be time related or even user defined. Time-related conditions are similar to an alarm clock. They can be one-time events or events which repeat periodically. One example of a user-defined condition is whether the force fields in a particular area of

Figure 13: An Example Active Note

4.3 Locking Mechanism

A locking mechanism is needed to synchronize users’ requests for creating and attaching annotations to parts of proteins. We use a very simple locking protocol. A user who wants ownership clicks on the desired part (atom or residue) of the molecule. If it is not owned by anyone, the server grants the user ownership of the sub-structure. Ownership is displayed using colors: a sub-structure owned by the user is colored red, one owned by some other client is colored blue, and one not owned by anyone keeps the original color of the molecule. If the user wants to take ownership of a component owned by some other client, he/she clicks on it, and the component starts to blink on the owner’s screen. If the owner is still interested in that component, he/she has to click on it to retain ownership; if he/she is not interested, the component is released to be

note, and use the Java3D Billboard behavior to solve the rotation problem. The Billboard behavior operates on a TransformGroup node to cause the local +z axis of the TransformGroup to point at the viewer, thus keeping the note legible. It is important to isolate the note from the rotation of the underlying object. We have tested our system on our local area network. The performance of Java3D is acceptable when the protein molecules contain 50 to 150 amino acids. This range covers only a very small subset of the proteins in the Protein Data Bank; however, it can be expanded by using more capable hardware.
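To illustrate the geometry behind the Billboard behavior mentioned above, the following plain-Java sketch computes the rotation about the y axis that turns a note's local +z axis toward the viewer. It is not Java3D code and does not reproduce Java3D's implementation; the class `BillboardMath` and its methods are hypothetical, and only rotation about a vertical axis is considered.

```java
/** Hypothetical helper illustrating billboard geometry; not part of Java3D. */
class BillboardMath {

    /**
     * Yaw angle (radians, about the y axis) that makes the local +z axis
     * point from the note position toward the eye position in the xz-plane.
     */
    static double yawToward(double[] notePos, double[] eyePos) {
        double dx = eyePos[0] - notePos[0];
        double dz = eyePos[2] - notePos[2];
        return Math.atan2(dx, dz);
    }

    /** Direction of the +z unit vector after a rotation by yaw about the y axis. */
    static double[] rotatedZAxis(double yaw) {
        return new double[] { Math.sin(yaw), 0.0, Math.cos(yaw) };
    }

    public static void main(String[] args) {
        double yaw = yawToward(new double[] {0, 0, 0}, new double[] {5, 0, 0});
        double[] z = rotatedZAxis(yaw);
        System.out.printf("yaw=%.4f z=(%.2f, %.2f, %.2f)%n", yaw, z[0], z[1], z[2]);
    }
}
```

Because the angle depends only on the note and eye positions, recomputing it each frame keeps the note facing the viewer regardless of how the underlying molecule is rotated.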

owned by others after a timer expires. The user who wants ownership can then obtain it by clicking again on the released sub-structure.
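The click-to-own protocol of Section 4.3 can be sketched as a small server-side lock table. This is a hedged illustration only: the class `LockTable`, its method names, and the explicit `now` timestamps are assumptions made for the sketch, and the text does not state the timer length, so the timeout is a constructor parameter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lock table for the ownership protocol; names are illustrative. */
class LockTable {
    enum Color { RED, BLUE, ORIGINAL }

    private static class Entry {
        final String owner;
        long challengeDeadline = -1; // -1: no pending challenge (not blinking)
        Entry(String owner) { this.owner = owner; }
    }

    private final Map<String, Entry> locks = new HashMap<>();
    private final long timeoutMillis;

    LockTable(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    /** Handles a click by user on part (atom or residue id) at time now. */
    void click(String user, String part, long now) {
        expire(part, now);
        Entry e = locks.get(part);
        if (e == null) {
            locks.put(part, new Entry(user));          // free: grant ownership
        } else if (e.owner.equals(user)) {
            e.challengeDeadline = -1;                  // owner confirms interest
        } else if (e.challengeDeadline < 0) {
            e.challengeDeadline = now + timeoutMillis; // owner's copy starts blinking
        }
    }

    /** True while a challenge is pending on the part. */
    boolean isBlinking(String part, long now) {
        expire(part, now);
        Entry e = locks.get(part);
        return e != null && e.challengeDeadline >= 0;
    }

    /** Current owner, or null if the part is free. */
    String ownerOf(String part, long now) {
        expire(part, now);
        Entry e = locks.get(part);
        return e == null ? null : e.owner;
    }

    /** Display color of the part as seen by the given user. */
    Color colorFor(String user, String part, long now) {
        String owner = ownerOf(part, now);
        if (owner == null) return Color.ORIGINAL;
        return owner.equals(user) ? Color.RED : Color.BLUE;
    }

    /** Releases the part if a challenge went unanswered past its deadline. */
    private void expire(String part, long now) {
        Entry e = locks.get(part);
        if (e != null && e.challengeDeadline >= 0 && now >= e.challengeDeadline) {
            locks.remove(part);
        }
    }
}
```

As in the text, an expired challenge only releases the part; the challenger still has to click again to become the owner.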

5 Implementation

We have chosen Java3D for the representation and visualization of 3D models of protein molecules. Java3D is a general 3D graphics rendering and modeling API [22]. It provides a high-level object-oriented programming paradigm that greatly reduces implementation time. The Java3D API also has several layers of support for 3D model construction, from low-level primitives to basic shapes. Using Java3D, the structure of a protein molecule can be constructed easily by reading it from a protein description file such as a PDB (Protein Data Bank) file. Alternatively, another 3D representation of the same molecule can be converted to Java3D if an appropriate loader for that 3D format is provided (e.g., a VRML loader).

A scene graph is created after reading the molecule information from the PDB file. A scene graph consists of Java3D objects, called nodes, arranged in a tree structure. The user creates one or more scene subgraphs and attaches them to a virtual universe. The individual connections between Java3D nodes always represent a directed relationship: parent to child. Java3D restricts scene graphs in one major way: scene graphs may not contain cycles. Thus, a Java3D scene graph is a directed acyclic graph (DAG). Java3D refines the Node object class into two subclasses: Group and Leaf node objects. Group node objects group together one or more child nodes. A group node can point to zero or more children but can have only one parent. Leaf node objects contain the actual definitions of shapes (geometry), lights, sounds, and so forth. A leaf node has no children and only one parent. In our implementation, amino acids and chains are represented by group nodes, and atoms are represented by leaf nodes. The scene graph is built so that it preserves the hierarchy present in the protein molecule data.

We have built a custom package that includes classes to construct, manipulate, and manage sticky notes.
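The chain, amino acid, and atom hierarchy described above can be sketched without Java3D as follows. The classes `SgNode`, `SgGroup`, `AtomLeaf`, and `PdbHierarchy` are hypothetical plain-Java stand-ins, not the Java3D Group and Leaf classes. The parser assumes the standard fixed-column layout of PDB ATOM records and deliberately ignores HETATM records, alternate locations, and insertion codes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for a scene-graph node; NOT a Java3D class. */
abstract class SgNode {
    final String label;
    SgNode(String label) { this.label = label; }
}

/** Group node: holds children (used here for the molecule, chains, and residues). */
class SgGroup extends SgNode {
    final List<SgNode> children = new ArrayList<>();
    SgGroup(String label) { super(label); }
}

/** Leaf node: one atom with its coordinates. */
class AtomLeaf extends SgNode {
    final double x, y, z;
    AtomLeaf(String name, double x, double y, double z) {
        super(name);
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

class PdbHierarchy {
    /** Builds molecule -> chain -> residue -> atom from PDB ATOM records. */
    static SgGroup build(List<String> pdbLines) {
        SgGroup root = new SgGroup("molecule");
        Map<String, SgGroup> chains = new LinkedHashMap<>();
        Map<String, SgGroup> residues = new LinkedHashMap<>();
        for (String line : pdbLines) {
            // Coordinates end at column 54; skip other or truncated records.
            if (!line.startsWith("ATOM  ") || line.length() < 54) continue;
            String atomName = line.substring(12, 16).trim(); // columns 13-16
            String resName = line.substring(17, 20).trim();  // columns 18-20
            String chainId = line.substring(21, 22);         // column 22
            String resSeq = line.substring(22, 26).trim();   // columns 23-26
            double x = Double.parseDouble(line.substring(30, 38).trim());
            double y = Double.parseDouble(line.substring(38, 46).trim());
            double z = Double.parseDouble(line.substring(46, 54).trim());

            SgGroup chain = chains.get(chainId);
            if (chain == null) {
                chain = new SgGroup("chain " + chainId);
                chains.put(chainId, chain);
                root.children.add(chain);
            }
            String resKey = chainId + ":" + resSeq;
            SgGroup residue = residues.get(resKey);
            if (residue == null) {
                residue = new SgGroup(resName + " " + resSeq);
                residues.put(resKey, residue);
                chain.children.add(residue);
            }
            residue.children.add(new AtomLeaf(atomName, x, y, z));
        }
        return root;
    }
}
```

Because every node is created once and added to exactly one parent, the result is a tree, which satisfies the acyclic, single-parent constraints stated above.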
The classes provide the following functions: (a) create, delete, and update sticky notes; (b) send and receive sticky notes over the network; (c) handle events; (d) handle plug-in actions. We have created a Java3D BranchGroup object for each sticky

6 Conclusions and Future Work

We have developed a collaborative protein visualization application. Geographically distributed clients are able to view the same protein molecule interactively at the same time, and they can share their ideas by using four types of sticky notes. The protein structure is read from a PDB (Protein Data Bank) file, and the corresponding 3D structures for the protein molecule are created as Java3D scene graphs. Our major contribution lies in moving beyond stand-alone visualization to provide distributed visualization with both peer-to-peer and master-slave interaction, together with extensive collaboration and annotation facilities. We have observed that the introduction of an advanced annotation tool greatly facilitated collaboration.

The protein molecule can also be displayed using other types of 3D representations, such as electron density maps and solid surface representations. We are currently adding these models to our system. Incorporating a protein folding algorithm into our visualization system will enable users not only to visualize proteins of unknown structure, but also to model and create new proteins on the fly by changing the amino acid sequence.

Acknowledgements

We thank Ming Li and Huandong Sun for their collaboration. This work is supported in part by NSF grants IIS-9817432, IIS-9908441, and IIS-0101134, and by a grant from the Center of Information Technology and Society at the University of California at Santa Barbara.

References

[1] Human Genome Project, http://www.ornl.gov/hgmis/.
[2] Clay Shirky, “Seven Ways of Looking at a Protein”, FEED Magazine, After Darwin Column, 23 Oct, 2000.
[3] S. Greenberg, M. Roseman, “Groupware Toolkits for Synchronous Work”, Technical Report 96-58909, Department of Computer Science, University of Calgary, 1996.
[4] R. Raje, M. Boyles, and S. Fang, “CEV: Collaborative Environment for Visualization Using JavaRMI”, ACM Workshop on Java for Science and Engineering Computation, 1998.
[5] T.J. Nam, D.K. Wright, “CollIDE: A shared 3D workspace for CAD”, In Proceedings of the 4th International Conference on Networking Entities, NETIES ’98, Leeds, UK, 1998.
[6] C. Bajaj, S. Cutchin, “Web Based Collaboration Aware Synthetic Environments”, Proceedings of TeamCAD Gvu/Nist Workshop on Collaborative Design, pp. 143-150, May 1997.
[7] Iris Annotator, http://www.sgi.com/software/annotator/.
[8] J. F. Balaguer, E. Gobbetti, “i3D: An interactive system for exploring annotated 3D environments”, International Symposium on Scientific Visualization, Chia, Italy, August 1995.
[9] Protein Explorer, RasMol and Chime, http://www.umass.edu/microbio/rasmol/.
[10] R. Koradi, M. Billeter, K. Wüthrich, “MOLMOL: a program for display and analysis of macromolecular structures”, J Mol Graphics, 14, 51-55, 1996.
[11] Molvie (Molecule Visual and Interactive Environment), http://guanine.cs.ucsb.edu/Molvie/.
[12] N. Guex and M. C. Peitsch, “SWISS-MODEL and Swiss-PdbViewer: an environment for comparative modeling”, Electrophoresis, pages 2714-2723, 1997.
[13] D. Walther, “WebMol - a Java based PDB viewer”, Trends Biochem Sci, 22: 274-275, 1997, http://www.embl-heidelberg.de/cgi/viewer.pl.
[14] Java3D Molecular Visualisation System, http://www.adcworks.com/projects/jmvs/.
[15] W. F. Humphrey, A. Dalke, and K. Schulten, “VMD - Visual Molecular Dynamics”, Journal of Molecular Graphics, 14:33-38, 1996.
[16] P.E. Bourne, M. Gribskov, G. Johnson, J. Moreland, and H. Weissig, “A Prototype Molecular Interactive Collaborative Environment (MICE)”, Pacific Symposium on Biocomputing, 1998, pp. 118-129.
[17] J.G. Tate, J. Moreland, P.E. Bourne, “MSG (Molecular Scene Generator): a Web-based application for the visualization of macromolecular structures”, Journal of Applied Crystallography, 1999, 32, pp. 1027-1028.
[18] D. Brown, “Deciphering The Message of Life’s Assembly”, Washington Post, 1997, http://www.people.virginia.edu/rjh9u/protfold.html.
[19] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne, “The Protein Data Bank”, Nucleic Acids Research, 28, pp. 235-242, 2000.
[20] Protein Data Bank, Nature of 3D Structural Data, http://www.rcsb.org/pdb/experimental methods.html
[21] W. Kabsch and C. Sander, “Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features”, Biopolymers, 22:2577, 1983.
[22] Java3D Application Programming Interface, http://java.sun.com/products/java-media/3D/.
[23] VRML97 Specification, ISO/IEC 14772-1:1997, http://www.web3d.org/Specifications/VRML97/.
