COMPUTATIONAL PROTEOMICS

DEVELOPMENT OF A STRATEGY FOR COMPUTER-ASSISTED SEARCHING FOR FUNCTIONALLY SIMILAR PROTEINS IN EVOLUTIONARILY DISTANT ORGANISMS

*Bogdanov Yu.F., Dadashev S.Ya., Grishaeva T.M.
Vavilov Institute of General Genetics, RAS, Moscow, e-mail: [email protected]
*Corresponding author

Key words: databases, knowledge bases, computer analysis, functional proteomics, virtual cell

Abstract

Motivation: The synaptonemal complex (SC), a universal ultrastructure that ensures the successful pairing and recombination of homologous chromosomes during meiosis in evolutionarily distant organisms, is built of non-homologous proteins. We aimed to develop a method of searching databases for genes that code for such non-homologous but functionally analogous proteins.

Results: We took advantage of the ultrastructural parameters of the SC and of the conformation of the SC proteins responsible for them. Using data from the literature, we found a highly significant correlation (r = 0.97; P < 0.001) between the width of the SC central space and the length of the alpha-helix in the central domain of normal and deleted yeast Zip1p and mammalian SCP1, intermediate proteins that form transversal filaments in the SC central space. On this basis, we found the Drosophila melanogaster gene CG17604, whose virtual protein meets the correlation requirement. Our finding has received experimental support from another laboratory. With the same strategy, we showed that the Arabidopsis thaliana and Caenorhabditis elegans genomes contain unique genes coding for proteins that also fit the above requirements.

Availability: Bogdanov et al., 2002a, b.

Introduction

Genome databases have accumulated and processed the data on the complete genome sequences of model eukaryotic organisms: the yeast S. cerevisiae, the nematode C. elegans, the fruit fly D. melanogaster, and the plant A. thaliana.
These organisms have been found to possess several hundred orthologous genes and proteins, which are similar in primary structure and play a common role. However, evolutionarily distant organisms also have organelles such as kinetochores, cell centers, and synaptonemal complexes (SCs) that differ partly or completely in ultrastructure despite their common function. In many cases, these structures are built of different structural proteins. To search for such functional analogs, we developed a strategy that combines computer analysis of the conformation and other physical-chemical properties of proteins with the ultrastructural parameters of cellular organelles obtained in situ by electron microscopy.

The transversal filaments (TFs) in the central space of the SC are responsible for chromosome synapsis (Zickler, Kleckner, 1999). The mammalian and yeast proteins that form TFs, SCP1 and Zip1p respectively, have been isolated and studied (Heyting, 1996; Dong, Roeder, 2000). These proteins, although non-homologous, are similarly organized: each includes three domains, and the central domain possesses an extended alpha-helix. Both proteins are classed with intermediate proteins. In vitro, each protein forms rod-shaped dimers of two parallel molecules with the same orientation (Heyting, 1996). The dimers resemble the tooth-like halves of the zipper-like connections in the SC central space, i.e., the TFs (Fig.). In addition, SCP1 and Zip1p share other physical-chemical properties of the entire molecule and of its individual domains. Their analogy extends to the ultrastructural level, since the SC is structurally similar in yeast and in mammals, with a central space width of about 100 nm. On the basis of these data, we carried out a computer search for proteins forming TFs in D. melanogaster, A. thaliana, and C. elegans.

Fig.
Scheme of the synaptonemal complex in mammals and yeast. Ch - chromatin loops; LE - lateral element; TF - transversal filament; C and N - terminal domains of SCP1 or Zip1p, respectively; CE - central element; dotted line - crossover DNA.

84 BGRS’ 2002

Method

As resources, we used the databases of known and putative genes and proteins of S. cerevisiae, D. melanogaster, A. thaliana, and C. elegans provided by NCBI (http://www.ncbi.nlm.nih.gov/). Additionally, the TAIR AGI Information database (http://www.arabidopsis.org/home.html) was used for A. thaliana and WormBase (http://www.wormbase.org) for C. elegans. Protein domain structure was analyzed, and structural/functional analogs were sought, with CDART (Conserved Domain Architecture Retrieval Tool) (http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps). The analysis was extended by predicting the physical-chemical properties of proteins with the ProtParam tool of the ExPASy Molecular Biology Server (Expert Protein Analysis System) (http://www.expasy.ch/tools/proprotparam.html) and by predicting protein secondary structure (ISREC) with the BCM Search Launcher: Protein Secondary Structure Prediction (http://dot.imgen.bcm.tmc.edu:9331/seq-search/struc-predict.html/).

Results and Discussion

Using the experimental data of Dong & Roeder (2000), we found that, in zip1 mutants of S. cerevisiae carrying different deletions in the central domain of Zip1p, the TF length and the central space width correlate well with the length of the normal or partially deleted Zip1p alpha-helix (r = 0.97, P < 0.001). Along with certain protein features (the domain organization, the deduced conformation of the central domain, etc.), this correlation was used as a criterion in the search for analogous proteins of D. melanogaster and other organisms. In effect, we sought the genes that potentially code for candidate SC TF proteins in these organisms.

The c(3)G mutation in D.
melanogaster causes the same ultrastructural alterations in the SC as zip1 does in S. cerevisiae. Hence the virtual protein product of c(3)G+ might be a good candidate for a D. melanogaster TF protein. We analyzed the virtual protein products of 78 D. melanogaster genes from the region covering the c(3)G locus (250 kb in section 88E-89B of chromosome 3R). The genes have been annotated by Celera Genomics Inc. (NCBI database). We found only one gene, CG17604, whose virtual protein product was similar to Zip1p and SCP1 by all the criteria used. The length of its alpha-helical region proved to correspond to the central space width in D. melanogaster. We identified the gene CG17604 as the gene c(3)G. Simultaneously, Page & Hawley (2001) successfully used a construct of c(3)G+ and the green fluorescent protein gene to transform mutant c(3)G flies and demonstrated localization of the C(3)G protein within the synaptic space of pachytene bivalents. Thus, our strategy of searching for a D. melanogaster TF protein proved to be justified.

Since no mutations affecting TFs are known for A. thaliana and C. elegans, the entire genome must be searched. Therefore, in the A. thaliana genome, we sought genes that code for proteins similar to Zip1p and SCP1 in domain structure and in the length of the alpha-helix in the central domain. The other criteria of protein similarity to Zip1p and SCP1 were then applied. We found only one annotated A. thaliana gene coding for a protein (AAD10695) with the necessary features (Table). The C. elegans genome contains several such genes (according to the information presented in WormBase and by Proteome, Inc., and to our results). On the evidence of in silico analysis of structure and putative properties, we chose two proteins, Q11102 and Z81586 (Table), which are potentially able to form TFs according to two structural models of the SC in C. elegans.

Table. Characteristics of experimentally studied and deduced (*) proteins and of SC parameters.
Biological species         Protein (domain) size     SC central    Isoelectric point (pI)
and SC protein             (amino-acid residues)     space width
                           whole      alpha-helix    (nm)          N-terminal  central  C-terminal  whole
                           molecule                                domain      domain   domain      molecule
M. musculus SCP1           993        713            100           5.9         ?        9.7         5.8
S. cerevisiae Zip1p        875        632            115           ?           ?        10.1        6.4
D. melanogaster CG17604*   744        495            109           10.0        4.9      9.7         5.9
A. thaliana AAD10695*      991        476            100-120       5.3         5.4      9.0         5.6
C. elegans Q11102*         1132       938            70-85         11.9        5.1      11.0        5.5
C. elegans Z81586*         484        460            70-85         4.9         9.5      10.0        9.4

? - the source gives the values 4.8, 5.3 and 6.1 for these cells; their assignment cannot be recovered from the flattened table.

Thus, our strategy allows in silico identification of structural proteins that fit the physical parameters and biological properties of subcellular entities with a strictly defined spatial organization. The strategy is best suited to organisms with known mutations affecting these subcellular structures. When such mutations are unknown, the entire genome must be searched.

Acknowledgements

This work was supported by the Russian Foundation for Basic Research (projects nos. 99-04-48182 and 02-04-48761).

References

1. Bogdanov Yu.F., Grishaeva T.M., Dadashev S.Ya. (2002a) Gene CG17604 of Drosophila melanogaster may be a functional homolog of yeast gene ZIP1 and mammalian gene SCP1 (SYCP1) encoding proteins of the synaptonemal complex. Russ. J. Genet. 38, 90-94.
2. Bogdanov Yu.F., Dadashev S.Ya., Grishaeva T.M. (2002b) Comparative genomics and proteomics of Drosophila, Brenner's Nematode, and Arabidopsis. Identification of functionally similar synaptic genes and proteins. Russ. J. Genet. 38 (no. 8, in press).
3. Heyting C. (1996) Synaptonemal complex: structure and function. Curr. Opin. Cell Biol. 8, 389-396.
4. Page S.L., Hawley R.S. (2001) c(3)G encodes a Drosophila synaptonemal complex protein. Genes Dev. 15, 3130-3143.
5. Dong H., Roeder G.S. (2000) Organization of the yeast Zip1 protein within the central region of the synaptonemal complex. J. Cell Biol. 148, 417-426.
6. Zickler D., Kleckner N.
(1999) Meiotic chromosomes: integrating structure and function. Annu. Rev. Genet. 33, 663-754.
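
The comparison of alpha-helix length with SC central space width described in Results and Discussion can be made concrete with a small sketch. It is an illustration only, not the authors' procedure: the rise of 0.15 nm per residue is a textbook value for an ideal alpha-helix and is an assumption not stated in the paper, and the names Candidate, helix_length_nm and helix_to_width_ratio are hypothetical. The residue counts and widths are those given in the Table; a width range is represented by its midpoint.

```python
from dataclasses import dataclass

# Assumption (not from the paper): axial rise of an ideal alpha-helix,
# in nanometres per amino-acid residue.
RISE_NM_PER_RESIDUE = 0.15


@dataclass(frozen=True)
class Candidate:
    """One row of the Table: protein name, helix size, SC central space width."""
    name: str
    helix_residues: int
    width_nm: tuple  # (min, max) central space width in nm


# Values taken from the Table; CG17604, AAD10695, Q11102 and Z81586
# are the deduced (*) proteins.
CANDIDATES = [
    Candidate("M. musculus SCP1", 713, (100, 100)),
    Candidate("S. cerevisiae Zip1p", 632, (115, 115)),
    Candidate("D. melanogaster CG17604", 495, (109, 109)),
    Candidate("A. thaliana AAD10695", 476, (100, 120)),
    Candidate("C. elegans Q11102", 938, (70, 85)),
    Candidate("C. elegans Z81586", 460, (70, 85)),
]


def helix_length_nm(residues, rise=RISE_NM_PER_RESIDUE):
    """Estimated length of a fully extended alpha-helix of the given size."""
    return residues * rise


def helix_to_width_ratio(candidate):
    """Estimated helix length divided by the midpoint of the width range."""
    low, high = candidate.width_nm
    return helix_length_nm(candidate.helix_residues) / ((low + high) / 2)


if __name__ == "__main__":
    for c in CANDIDATES:
        length = helix_length_nm(c.helix_residues)
        ratio = helix_to_width_ratio(c)
        print(f"{c.name:26s} helix ~{length:6.1f} nm   helix/width {ratio:.2f}")
```

The ratio is a descriptive number only. The criterion used in the paper is the correlation reported by the authors together with domain organization and the other protein properties, so this conversion should be read as a rough plausibility check under the stated assumption.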