BIOINFORMATICS
Editorial

Grand challenges in bioinformatics

The protein folding problem has been one of the grand challenges in computational molecular biology. The problem is to predict the native three-dimensional structure of a protein from its amino acid sequence. It is widely believed that the amino acid sequence contains all the information necessary to form the correct three-dimensional structure, since protein folding is apparently thermodynamically determined; namely, given a proper environment, a protein would fold up spontaneously. This is called Anfinsen’s thermodynamic principle. While this principle is well established for selected proteins under in vitro experimental conditions, protein folding in vivo is a more complex and dynamic process involving a number of other molecules such as chaperones. The environment has to be considered as a collection of various interactions with molecules rather than a smooth thermodynamic environment. It is not unreasonable to expect that the protein folding problem cannot be solved for the majority of proteins in nature without considering specific molecular interactions. This is reminiscent of the problem of secondary structure prediction in proteins. However good the algorithms developed for secondary structure prediction are, the success rate will be limited as long as only the short-range interactions are considered. Similarly, however good the algorithms developed for three-dimensional structure prediction are, the success rate will be limited as long as only the information of a single molecule is examined.

In the era of whole-genome sequencing, we are faced with another grand challenge problem, which may be called the organism reconstruction problem. Given a complete genome sequence, the problem is to predict computationally the development of the adult from a single cell and its continual function as a biological organism.
Here again, a traditional view is that the genome is a blueprint of life containing all the necessary information that would make up an organism. A clone can be made by replacing the nucleus, which is the localized area containing all genetic information. Thus, this might be called Dolly’s cloning principle. According to this genetic determinism principle, we should eventually be able to predict the function of every gene in the genome from its sequence information alone. Implicitly, this assumes that the environment of each gene is also computable from the complete genome sequence, because the function of a molecule can only become meaningful in relation to its environment. Therefore, the entire molecular architecture and the molecular reaction pathways in a germ cell, for example, may be computable from the genomic sequence. We thus end up asserting that the form and function of an organism are represented in the nucleus. In an alternative view, the genome is simply a warehouse of parts, or building blocks of life, and a real blueprint of life is written in the entire cell, perhaps as a network of molecular interactions.

Whichever view one takes, it is impossible in practice to make full sense of the sequence data without additional information, including the time and localization of expression and, especially, information on molecular interactions. In fact, in order to obtain any functional clue about the hypothetical proteins that still form one-third to one-half of the genes in every genome that has been sequenced, new systematic experiments are being designed to observe, for example, gene–gene interactions by disruption experiments and protein–protein interactions by yeast two-hybrid system experiments.

Bioinformatics has emerged as a major discipline due to the rapid increase in sequence information, developing new databases and computational technologies that help us to understand the biological meaning encoded in the sequence data. In a post-genomic era of systematic functional analysis, the basis of bioinformatics is not only the complete catalogue of building blocks, but also the complete catalogue of their interactions. With this new level of information, the grand challenge problems in bioinformatics, both old and new, and both structural and functional, may one day be elucidated, although not in the manner in which they were originally formulated.

Minoru Kanehisa

Oxford University Press