MDSA - An interactive analysis tool for Protein Molecular Dynamic Simulations: Preliminary Study

Nurul Adilah Abu Bakar and Siti Zaiton Mohd Hashim¹, Mohd Shahir Shamsir Omar²

¹ Software Engineering Department, Fac. of Computer Science & Info. Systems, Universiti Teknologi Malaysia, 81310 UTM Skudai, Johor, Malaysia
e-mail: [email protected]
² Biology Science Department, Faculty of Biosciences and Bioengineering, Universiti Teknologi Malaysia, 81310 UTM Skudai, Johor, Malaysia
e-mail: [email protected]

Abstract

Molecular dynamics simulation has become an important tool for predicting and studying the behavior of real materials. Using molecular dynamics simulations enables researchers not only to reduce the time needed to complete a project but also, potentially, to reduce the cost of research. One of the most exciting and difficult challenges in biology is to understand the interactions of complex, intractable biological systems. Hence, computational simulations have become increasingly important in enabling rapid progress in biological research. Traditionally, most molecular dynamics simulation software has offered only a command line interface. Adding a graphical user interface as a front end to such programs can simplify and facilitate molecular dynamics research compared with command line input, which becomes progressively more difficult for pure biologists as simulation complexity increases. This motivated us to develop MDSA (Molecular Dynamics Simulation and Analysis), a new program dedicated strictly to molecular dynamics simulation, written in the Java programming language for the Linux operating system. It offers three-dimensional interaction and perception. In this paper, we present the preliminary work of this research, namely our completed review of existing software. At the end of the paper, we also present the future direction of our research.
Keywords: Molecular Dynamics Simulation and Analysis (MDSA), bioinformatics, Graphical User Interface (GUI)

1. Introduction

One way to understand motion in materials is to use a computer simulation method called molecular dynamics. Molecular dynamics (MD) is the science of simulating the motions of a system of particles and their development over time [1]. These motions are crucial both in providing useful insights into motion-dependent phenomena and in the determination and refinement of macromolecular structures [2]. MD is now one of the principal tools in the theoretical study of biological molecules. This computational method calculates the time-dependent behavior of a molecular system [3].

The use of computer simulations to explore the molecular details of biochemical reactions began in the mid-1970s with simulations of the initial step in vision and of the high-frequency motions of small proteins in vacuo. With advances in the understanding of protein structure, more realistic simulations were achieved by solvating the system. MD allows the study of the dynamic behavior of a protein, as opposed to examining a snapshot of the protein at a specific time. Atoms are constantly in motion because of thermal vibration, while the structure of the macromolecule keeps the atoms in place and restricts their motion. With MD techniques, protein harmonics can be analysed and changes in protein conformation calculated.

The increasing need to obtain results faster has led to the development of numerous algorithms for simulating molecular dynamics. However, users often find the widely used programs too complex and not intuitive enough to allow quick training or to generate ample interest in the field.

2. Literature Review

Over the last ten years, much molecular dynamics simulation software for biological macromolecules has been developed. In this section, we first focus on software that lacks a graphical interface.

2.1 GROMACS: Groningen Machine for Chemical Simulations

GROMACS is one of the fastest molecular dynamics (MD) software packages available. It is open-source software developed by Lindahl and co-workers [4] and supports all the usual algorithms expected of a modern molecular dynamics implementation. It is primarily designed for, and mainly used to simulate, biological macromolecules with complicated bonded and non-bonded interactions, such as proteins, lipids and carbohydrates. GROMACS is widely used because it is a comprehensive suite of molecular dynamics simulation and analysis programs.

GROMACS was developed to run on UNIX-based operating systems, and the user runs its programs from the UNIX shell. It is considered one of the fastest programs for molecular dynamics simulation when benchmarked against AMBER and CHARMM. However, GROMACS is difficult for untrained biologists to use intuitively. The command line input hinders uninitiated biologists who are familiar only with the ubiquitous point-and-click graphical interface.

2.2 GUIMACS: A Java based front-end for GROMACS

GUIMACS is a Java-based front-end program for the Linux version of GROMACS. It runs as a standalone application with a Multiple Document Interface (MDI) and enables its user to run or analyze multiple molecular dynamics simulations simultaneously. The programs provided by GROMACS are divided into two groups [5]. The first is GUISim, a graphical interface whose window runs and manages a basic molecular dynamics simulation. All six programs in GUISim are represented as tabs in the graphical user interface (GUI) window, and the user can select the tabs in any order. The second is GUINalyzer, an interactive front end for the programs used for MD analysis.
Although these graphical features are available, the graphical user interface is very complex and difficult for untrained biologists to use effectively. Users must contend with a blend of graphical and command line input. GUIMACS therefore provides a relatively simple interface for the novice Linux user, but it is not friendly or intuitive for a biologist with little computing experience.

2.3 FPV: Fast Protein Visualization Using Java 3D

Protein visualization methods have also become an important research area, because visualizing structures is important in biological research. FPV (Fast Protein Visualization Using Java 3D) is designed to cater for this need. The software is a protein visualization system built on the Java 3D API, and it allows applications to be run remotely through web browsers.

Java 3D is a scene graph-based 3D application programming interface (API) for the Java platform [7]. Its scene graph-based programming model provides a simple and flexible mechanism for representing and rendering scenes. A scene graph consists of Java 3D objects, called nodes, arranged in a tree structure. The scene graph contains a complete description of the entire scene, or virtual universe, including the geometric data, the attribute information, and the viewing information needed to render the scene from a particular point of view.

FPV proposes techniques for creating efficient scene graph structures, which allow large molecules (more than 4000 amino acids) to be loaded and rendered at an acceptable interactive speed. Using Java 3D as the graphics engine is advantageous because the Java 3D API incorporates a high-level scene graph that lets developers focus on the objects and the scene composition. The FPV authors also evaluate these techniques by comparing the visualization component of their system with those of two other Java 3D based molecular simulation tools. In the van der Waals display mode in particular, FPV benefits from the efficiency of its scene graph.
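
As an illustration of the tree structure of a scene graph, the following is a minimal sketch in plain Java. It deliberately does not use the Java 3D API: the class names (`SceneGraphSketch`, `Node`, `Group`, `AtomSphere`), the two-residue "protein" and the atom coordinates are all hypothetical and chosen only for illustration, and the radii are commonly quoted van der Waals radii. Interior nodes group their children, leaf nodes carry geometry (one sphere per atom, as in a van der Waals display), and a depth-first traversal visits every node much as a renderer would.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of a scene graph; this is NOT the Java 3D API.
// All class names, coordinates and the tiny "protein" are hypothetical.
class SceneGraphSketch {

    // Every element of the scene graph is a node.
    static abstract class Node {
        final String name;
        Node(String name) { this.name = name; }
        abstract int countLeaves();                            // leaf geometries below this node
        abstract void describe(int depth, StringBuilder out);  // depth-first listing
    }

    // Interior node: groups child nodes (a chain groups residues, a residue groups atoms).
    static class Group extends Node {
        final List<Node> children = new ArrayList<>();
        Group(String name) { super(name); }
        Group add(Node child) { children.add(child); return this; }
        @Override int countLeaves() {
            int total = 0;
            for (Node child : children) total += child.countLeaves();
            return total;
        }
        @Override void describe(int depth, StringBuilder out) {
            appendLine(depth, name, out);
            for (Node child : children) child.describe(depth + 1, out);
        }
    }

    // Leaf node: one sphere per atom, as in a van der Waals display.
    static class AtomSphere extends Node {
        final double x, y, z, radius; // position and sphere radius in angstroms
        AtomSphere(String name, double x, double y, double z, double radius) {
            super(name);
            this.x = x; this.y = y; this.z = z; this.radius = radius;
        }
        @Override int countLeaves() { return 1; }
        @Override void describe(int depth, StringBuilder out) {
            appendLine(depth, name + " (r=" + radius + ")", out);
        }
    }

    // Writes one indented line per node; two spaces per tree level.
    static void appendLine(int depth, String text, StringBuilder out) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(text).append('\n');
    }

    // Builds protein -> chain -> residues -> atoms; the coordinates are made up.
    static Group buildExample() {
        Group gly = new Group("GLY 1")
            .add(new AtomSphere("N", 0.0, 0.0, 0.0, 1.55))
            .add(new AtomSphere("CA", 1.5, 0.0, 0.0, 1.70))
            .add(new AtomSphere("C", 2.0, 1.4, 0.0, 1.70))
            .add(new AtomSphere("O", 1.3, 2.4, 0.0, 1.52));
        Group ala = new Group("ALA 2")
            .add(new AtomSphere("N", 3.3, 1.5, 0.0, 1.55))
            .add(new AtomSphere("CA", 4.0, 2.8, 0.0, 1.70))
            .add(new AtomSphere("C", 5.5, 2.6, 0.0, 1.70))
            .add(new AtomSphere("O", 6.1, 1.5, 0.0, 1.52))
            .add(new AtomSphere("CB", 3.6, 3.6, 1.2, 1.70));
        Group chain = new Group("chain A").add(gly).add(ala);
        return new Group("protein").add(chain);
    }

    public static void main(String[] args) {
        Group root = buildExample();
        StringBuilder listing = new StringBuilder();
        root.describe(0, listing);
        System.out.print(listing);
        System.out.println("atom spheres to render: " + root.countLeaves());
    }
}
```

Grouping atoms under residue and chain nodes means an operation such as hiding or recolouring one residue touches a single subtree instead of every atom in the molecule. In Java 3D itself the grouping and leaf roles are played by the library's own node classes.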
FPV could achieve up to an eightfold improvement in rendering speed and could load molecules three times as large as the previous system could [8].

2.4 Java technology in the fields of bioinformatics

There is a growing trend of adopting Java technology in the fields of bioinformatics and computational biology [6]. Java has allowed bioinformaticians to rapidly develop user-friendly, cross-platform applications that are accessible to users at all levels of computational ability.

Traditionally, the language of choice for bioinformaticians has been Perl. Perl allows the rapid collection and analysis of data to answer directed questions: Perl developers can quickly leverage the power of regular expressions and the large collection of bioinformatics modules. Furthermore, Perl allows users to rapidly prototype Internet-based methods for delivering data. However, standalone bioinformatics applications created with Perl are of limited value to research biologists. Perl scripts usually require prerequisite dependencies to be installed, and they lack the dynamic GUI interactions inherent in Java. For this reason, bioinformaticians have been using Java to deliver applications to researchers at all levels of computational ability, most of whom want to use computational approaches quickly to supplement other types of biological work [9].

Java also features cross-platform compatibility. Its applet technology, which provides a means of mass viewing of any simulation, widens distribution even further. Applets are components that can be added to a web page for easy viewing across the Internet by means of a web browser. In this way, biology researchers or students will be able to conduct simulations even if they are in another country.

2.5 iMolTalk: an interactive, internet-based protein structure analysis server

iMolTalk is a new and interactive web server for protein structure analysis.
It addresses the need to identify and highlight biochemically important regions in protein structures. As input, the server requires only the four-digit Protein Data Bank (PDB) identifier of an experimentally determined structure, or a structure file in PDB format stemming, for example, from comparative modelling. iMolTalk offers a wide range of implemented tools (i) to extract general information from PDB files, such as generic header information or the sequence derived from three-dimensional co-ordinates; (ii) to map corresponding residues from sequence to structure; (iii) to search for contacts of residues (amino or nucleic acids) or heterogeneous groups to the protein, present cofactors and substrates; and (iv) to identify protein-protein interfaces between chains in a structure. The server provides results as user-friendly two-dimensional graphical representations and in textual format, ideal for further processing. At any time during the analysis, the user can choose the tool for the following step from the set of implemented tools, or submit his or her own script to the server to extend the functionality of iMolTalk [10].

3. Discussion

Molecular dynamics simulations are relatively inexpensive and offer powerful capabilities compared with other computational biology setups or instrumentation. However, most MD programs are difficult to use and offer only limited capabilities to the untrained biologist. The obstacle is the lack of an easy-to-use interface that would facilitate quick training and the use of MD in biological research.

We chose GROMACS because it is open source, fast and cluster friendly. All programs in GROMACS use command line options to specify input and output files. Effective use of command line based programs requires basic knowledge of the UNIX shell and of at least one UNIX-based text editor. The user must type the commands in sequence, step by step, before obtaining the final analysis output.
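
To give a feel for this sequential workflow, which is what a graphical front end has to hide, the Java fragment below is a rough sketch only. It assembles a typical sequence of command lines and can run them one after another with `ProcessBuilder`, stopping at the first failure. The class and method names are hypothetical. The tool names follow the `gmx` wrapper of later GROMACS releases (older releases shipped the tools as separate programs), and all file names such as `protein.pdb` and `md.mdp` are placeholders. The exact steps and options depend on the system and the GROMACS version, and a real run needs further steps such as solvation and energy minimisation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the command sequence a front end would drive.
// File names are placeholders; the step list is illustrative, not complete.
class MdPipelineSketch {

    // Builds the command lines in the order a user would type them.
    static List<List<String>> buildSteps(String pdbFile) {
        List<List<String>> steps = new ArrayList<>();
        // 1. Convert the PDB structure into GROMACS coordinates and a topology.
        //    (Without further options this tool prompts for a force field.)
        steps.add(Arrays.asList("gmx", "pdb2gmx", "-f", pdbFile,
                "-o", "conf.gro", "-p", "topol.top"));
        // 2. Place the molecule in a simulation box.
        steps.add(Arrays.asList("gmx", "editconf", "-f", "conf.gro",
                "-o", "boxed.gro", "-d", "1.0"));
        // 3. Preprocess run parameters, coordinates and topology into one run input file.
        steps.add(Arrays.asList("gmx", "grompp", "-f", "md.mdp",
                "-c", "boxed.gro", "-p", "topol.top", "-o", "md.tpr"));
        // 4. Run the simulation; input and output files share the prefix "md".
        steps.add(Arrays.asList("gmx", "mdrun", "-deffnm", "md"));
        // 5. Analyse the trajectory (RMSD against the reference structure).
        //    The trajectory file name depends on the output settings in md.mdp.
        steps.add(Arrays.asList("gmx", "rms", "-s", "md.tpr",
                "-f", "md.xtc", "-o", "rmsd.xvg"));
        return steps;
    }

    // Runs the steps in order and stops at the first non-zero exit code.
    // Returns the number of steps that completed successfully.
    static int runAll(List<List<String>> steps) throws IOException, InterruptedException {
        int completed = 0;
        for (List<String> step : steps) {
            Process process = new ProcessBuilder(step).inheritIO().start();
            if (process.waitFor() != 0) {
                break; // later steps depend on the files this one should have written
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        // Dry run: only print the commands, since GROMACS may not be installed.
        for (List<String> step : buildSteps("protein.pdb")) {
            System.out.println(String.join(" ", step));
        }
    }
}
```

Each step consumes files written by the one before it, so a mistyped file name early on only surfaces as a failure later. This is the kind of bookkeeping a graphical interface can take over from the user.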
Although GROMACS is a very fast program for molecular dynamics simulation, it is still difficult for pure biologists, researchers and novice Linux users to use efficiently. To solve this problem, we must develop a graphical user interface for novice Linux users, especially pure biologists and researchers.

The creation of GUIMACS as a Java-based front end for GROMACS did not solve the interface problem. GUIMACS eliminates the need for command line input but replaces it with a plethora of interactive checkboxes and radio buttons. These controls are confusing because the explanation of their respective roles is extremely insufficient. To address this, we intend to reduce the number of these controls and replace the radio buttons and checkboxes with a user-customizable pop-up menu in the graphics display window, with an adequate description of the options available.

We would also like to implement a sliding bar for stepping through the timesteps of the simulation. This would allow graphical visualization of the temporal evolution of the protein during the simulation. A 3D plot of the residue-time evolution with the root mean square deviation/fluctuation calculation is also desirable and would be implemented. Such features would be beneficial to biologists.

4. Conclusions

MDSA (Molecular Dynamics Simulation and Analysis) is designed to act as a graphical front end for molecular dynamics simulation. It provides an intuitive interface for untrained biologists and researchers. We expect that providing a graphical interface will enhance interest and accelerate research in the area of molecular dynamics.

5. Future Work

We will continue this study to further develop MDSA. Based on the analysis, a working prototype would be developed, designed around the three components of the user interface: the graphics display window, the graphical user interface windows and the MDSA command prompt.
In the graphics display window, molecules are rendered and can be interactively rotated, translated and scaled via mouse controls. A pop-up menu would also be available.

The graphical user interface (GUI) in MDSA would include a toolbar that provides access to task-specific forms, for example for changing the current molecular display characteristics. The GUI windows would also provide useful textual information. First, a sequence view shows the linear structure of the protein by listing the names of the amino acids forming the chain. The textual information window also contains the molecule's name and the number of amino acid chains. Each amino acid chain is displayed using the one-letter representations of the amino acids. When the user selects part of the molecule during interaction, the corresponding part of the amino acid chain in the information window is highlighted.

We plan to include an MDSA command prompt to provide keyboard input and control for advanced users.

6. References

[1] Karplus, M. and McCammon, J.A. (2002). Molecular dynamics simulation of biomolecules. Natl. Acad. Sci. 85, 7557-7561.
[2] Karplus, M. and Petsko, G.A. (1990). Molecular dynamics simulations in biology. Nature 347, 631-639.
[3] Elber, R. and Karplus, M. (1987). Multiple conformational states of proteins: a molecular dynamics analysis of myoglobin. Science 235, 318-321.
[4] http://www.ch.embnet.org/MD_tutorial/pages/MD.Part1.html
[5] David van der Spoel, Erik Lindahl, Berk Hess, Gerrit Groenhof, Alan E. Mark, Herman J.C. Berendsen. GROMACS: Fast, Flexible, and Free, 2005.
[6] Pradeep Kota. GUIMACS - A Java based front end for GROMACS. In Silico Biology 7, 0008, 2006.
[7] http://en.wikipedia.org/wiki/Java_3D
[8] C. Tolga, W. Yujun, W. Yuan-Fang, S. Jianwen. FPV: Fast Protein Visualization Using Java 3D, 2003.
[9] Mohd Shahir Shamsir, Huszalina Hussein, Siti Zaiton Mohd Hashim and Naomie Salim. Educating the educators: Incorporating bioinformatics into biological science education in Malaysia.
[10] Diemand, A.V. and Scheib, H. iMolTalk: an interactive, internet-based protein structure analysis server. Nucl. Acids Res.
[11] S. Meloan, “Exploring The New Frontier: Java™ Technology Powers the “Post-Genomic” Era”, Feature Stories, java.sun.com, September 28, 2001.
[12] http://www.onjava.com/pub/a/onjava/2003/09/24/java_bioinformatics.html
[13] http://www.jdocs.com/extradocs/126/javax.media.j3d/doc-files/intro.html
[14] Stephen Smith, Java and Bioinformatics (for programmers and non-programmers), 2005.
[15] R. A. Sayle and E. J. Milner-White, “RASMOL: biomolecular graphics for all”, Trends in Biochemical Sciences, 20(9):374, Sep 1995.
[16] P. J. Kraulis, “MOLSCRIPT: A Program to Produce Both Detailed and Schematic Plots of Protein Structures”, Journal of Applied Crystallography, vol. 24, pp. 946-950, 1991.
[17] R. Koradi, M. Billeter, and K. Wüthrich, “MOLMOL: a program for display and analysis of macromolecular structures”, J Mol Graphics, 14, 51-55, 1996.
[18] W. F. Humphrey, A. Dalke, and K. Schulten, “VMD – Visual Molecular Dynamics”, Journal of Molecular Graphics, 14:33-38, 1996.