COMPUTATIONAL PROTEOMICS
DEVELOPMENT OF A STRATEGY FOR COMPUTER-ASSISTED SEARCHING FOR FUNCTIONALLY SIMILAR PROTEINS IN EVOLUTIONARILY DISTANT ORGANISMS
* Bogdanov Yu.F., Dadashev S.Ya., Grishaeva T.M.
Vavilov Institute of General Genetics, RAS, Moscow, e-mail: [email protected]
*Corresponding author
Key words: databases, knowledge bases, computer analysis, functional proteomics, virtual cell
Abstract
Motivation: The synaptonemal complex (SC), a universal ultrastructure that ensures the successful pairing and recombination
of homologous chromosomes during meiosis in evolutionarily distant organisms, is built of non-homologous proteins. We
aimed to develop a method of searching databases for genes that code for such non-homologous but functionally
analogous proteins.
Results: We took advantage of the ultrastructural parameters of the SC and of the conformation of the SC proteins responsible for
them. Using data from the literature, we found a highly significant correlation (r = 0.97; P < 0.001) between the width of the SC
central space and the length of the alpha-helix in the central domain of normal and deletion-bearing yeast Zip1p and of mammalian
SCP1, intermediate proteins that form transversal filaments in the SC central space. On this basis, we identified the Drosophila
melanogaster gene CG17604, whose virtual protein meets the correlation requirement. Our finding has received
experimental support from another laboratory. With the same strategy, we showed that the Arabidopsis thaliana and Caenorhabditis
elegans genomes contain unique genes coding for proteins that also fit the above requirements.
Availability: Bogdanov et al., 2002a, b.
Introduction
Genome databases have accumulated and processed the complete genome sequences of model eukaryotic
organisms: the yeast S. cerevisiae, the nematode C. elegans, the fruit fly D. melanogaster, and the plant A. thaliana. These organisms have
been found to possess several hundred orthologous genes and proteins, which are similar in primary structure and play a
common role. However, evolutionarily distant organisms also have organelles such as kinetochores, cell centers, and synaptonemal
complexes (SCs), which differ partly or completely in ultrastructure despite their common function. In many
cases, these structures are built of different structural proteins. To search for such functional analogs, we developed a
strategy that combines computer analysis of the conformation and other physical-chemical properties of proteins with
ultrastructural parameters of cellular organelles obtained in situ by electron microscopy.
The transversal filaments (TFs) in the central space of the SC are responsible for chromosome synapsis (Zickler, Kleckner,
1999). The mammalian and yeast proteins that form TFs, SCP1 and Zip1p respectively, have been isolated and studied
(Heyting, 1996; Dong, Roeder, 2000). These proteins, although non-homologous, are similarly organized and include three
domains, the central one possessing an extended alpha-helix. Both are classed as intermediate proteins. In
vitro, each protein forms rod-shaped dimers of two parallel molecules with the same orientation (Heyting, 1996). The dimers
resemble the tooth-like halves of the zipper-like connections in the SC central space, i.e., the TFs (Fig.). In addition, SCP1 and Zip1p share
other physical-chemical properties of the entire molecule and of its individual domains. Their analogy extends to
the ultrastructural level, since the SC is structurally similar in yeast and in mammals, with a central space width of about 100 nm.
On the basis of these data, we carried out a computer search for proteins forming TFs in D. melanogaster, A. thaliana, and C.
elegans.
Fig. Scheme of the synaptonemal complex in mammals and yeast.
Ch - chromatin loops; LE - lateral element; TF - transversal
filament; C and N - C- and N-terminal domains of SCP1 or Zip1p,
respectively; CE - central element; dotted line - crossover DNA.
BGRS’ 2002
Method
As resources, we used databases of the known and putative genes and proteins of S. cerevisiae, D. melanogaster,
A. thaliana, and C. elegans provided by NCBI (http://www.ncbi.nlm.nih.gov/). Additionally, for A. thaliana and
C. elegans, the TAIR AGI Information database (http://www.arabidopsis.org/home.html) and WormBase
(http://www.wormbase.org), respectively, were used.
The analysis of protein domain structure and the search for structural/functional analogs were performed with
CDART (Conserved Domain Architecture Retrieval Tool;
http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps). The analysis was extended by predicting the physical-chemical
properties of the proteins with the ProtParam tool of the ExPASy Molecular Biology Server (Expert Protein Analysis
System; http://www.expasy.ch/tools/proprotparam.html) and by predicting the protein secondary structure
(ISREC) via the BCM Search Launcher: Protein Secondary Structure Prediction
(http://dot.imgen.bcm.tmc.edu:9331/seq-search/struc-predict.html/).
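The per-domain isoelectric points used in this work were obtained with ProtParam. As a rough illustration of what such a prediction involves, the following minimal sketch estimates pI by finding the pH at which the Henderson-Hasselbalch net charge is zero. The pKa values, the function names, and the domain boundaries passed to domain_pis are assumptions made for this example; they are not ProtParam's parameters, so the numbers it returns will differ somewhat from those of the server.

```python
# Illustrative pKa set (an assumption for this sketch, not ProtParam's table).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERMINUS = 8.6
PKA_C_TERMINUS = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a polypeptide at the given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))    # free amino end
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))   # free carboxyl end
    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 0.001) -> float:
    """Bisect on pH: the net charge falls monotonically from pH 0 to pH 14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return round((low + high) / 2.0, 2)


def domain_pis(sequence: str, n_end: int, c_start: int) -> dict:
    """pI of the N-terminal, central, and C-terminal parts and of the whole chain.

    n_end and c_start are hypothetical domain boundaries (0-based indices)
    that would come from a separate domain or secondary-structure analysis.
    """
    return {
        "N-terminal": isoelectric_point(sequence[:n_end]),
        "central": isoelectric_point(sequence[n_end:c_start]),
        "C-terminal": isoelectric_point(sequence[c_start:]),
        "whole": isoelectric_point(sequence),
    }
```

For a real protein, the sequence would be taken from the database record and the boundaries from the predicted extent of the central alpha-helix.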
Results and Discussion
Using the experimental data of Dong & Roeder (2000), we found that, in zip1 mutants of S. cerevisiae carrying different
deletions in the central domain of Zip1p, the TF length and the central space width correlate well with the length of the
normal or partially deleted alpha-helix of Zip1p (r = 0.97, P < 0.001). Along with certain protein features (the domain
organization, the deduced conformation of the central domain, etc.), this correlation was used as a criterion to search for
analogous proteins in D. melanogaster and other organisms. In effect, we sought genes that potentially code for candidate
SC TF proteins in these organisms.
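The correlation coefficient quoted above is a Pearson r between alpha-helix length and central space width. A minimal sketch of that calculation is given below. The function name is an assumption of this example, and the numbers in the usage lines are made-up placeholders chosen only to exercise the function; they are not the measurements of Dong & Roeder (2000).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only (hypothetical, perfectly linear by construction).
helix_lengths = [100, 200, 300, 400]   # alpha-helix length, residues
space_widths = [20, 40, 60, 80]        # central space width, nm
r = pearson_r(helix_lengths, space_widths)
```

With real data, xs would hold the alpha-helix lengths of the normal and deletion variants and ys the corresponding measured central space widths.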
The c(3)G mutation in D. melanogaster causes the same ultrastructural alterations in the SC as zip1 does in S. cerevisiae. Hence
the virtual protein product of c(3)G+ might be a good candidate for a D. melanogaster TF protein. We analyzed the virtual
protein products of 78 D. melanogaster genes from the region covering the c(3)G locus (250 kb in section 88E-89B of
chromosome 3R). The genes have been annotated by Celera Genomics Inc. (NCBI database). We found only one gene,
CG17604, whose virtual protein product was similar to Zip1p and SCP1 by all the criteria used. The length of its
alpha-helical region proved to correspond to the central space width in D. melanogaster. We identified the gene CG17604 as the
gene c(3)G. Simultaneously, Page & Hawley (2001) successfully used a construct combining c(3)G+ with the green
fluorescent protein gene to transform mutant c(3)G flies and demonstrated that the C(3)G protein localizes within the synaptic
space of pachytene bivalents. Thus, our strategy of searching for a D. melanogaster TF protein proved to be justified.
Since no mutations affecting TFs are known for A. thaliana and C. elegans, the entire genome must be searched.
Therefore, in the A. thaliana genome, we sought genes that code for proteins similar to Zip1p and SCP1 in domain structure
and in the length of the alpha-helix in the central domain. The other criteria of protein similarity to Zip1p and SCP1 were
then applied. We found only one annotated A. thaliana gene coding for a protein (AAD10695) with the necessary features
(Table). The C. elegans genome contains several such genes (according to the information presented in WormBase and by
Proteome, Inc., and to our results). On the evidence of in silico analysis of structure and putative properties, we chose two
proteins, Q11102 and Z81586 (Table), which are potentially able to form TFs according to two structural models of the SC in C.
elegans.
Table. Characteristics of experimentally studied and deduced (*) proteins and of SC parameters.
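Protein (domain) size is given in amino-acid residues; SC central space width in nm; pI - isoelectric point.

Biological species and SC protein | Size, whole molecule | Size, alpha-helix | SC central space width (nm) | pI, N-terminal domain | pI, central domain | pI, C-terminal domain | pI, whole molecule
--------------------------------- | -------------------- | ----------------- | --------------------------- | --------------------- | ------------------ | --------------------- | ------------------
M. musculus SCP1                  | 993                  | 713               | 100                         | 5.9                   | 5.3                | 9.7                   | 5.8
S. cerevisiae Zip1p               | 875                  | 632               | 115                         | 4.8                   | 6.1                | 10.1                  | 6.4
D. melanogaster CG17604 *         | 744                  | 495               | 109                         | 10.0                  | 4.9                | 9.7                   | 5.9
A. thaliana AAD10695 *            | 991                  | 476               | 100-120                     | 5.3                   | 5.4                | 9.0                   | 5.6
C. elegans Q11102 *               | 1132                 | 938               | 70-85                       | 11.9                  | 5.1                | 11.0                  | 5.5
C. elegans Z81586 *               | 484                  | 460               | 70-85                       | 4.9                   | 9.5                | 10.0                  | 9.4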
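The screening step described above amounts to filtering candidate proteins by a set of structural criteria. The sketch below shows one way such a filter could be expressed, using the sizes and C-terminal pI values of the deduced proteins listed in the Table. The two thresholds (a minimum alpha-helical fraction of 0.4 and a C-terminal pI above 7) and the last, failing record are assumptions introduced purely for illustration; they are not the criteria or cut-offs actually applied in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    length: int           # whole molecule, amino-acid residues
    helix_length: int     # alpha-helix, amino-acid residues
    pi_c_terminal: float  # isoelectric point of the C-terminal domain


# Deduced proteins from the Table, plus one hypothetical negative control.
CANDIDATES = [
    Candidate("D. melanogaster CG17604", 744, 495, 9.7),
    Candidate("A. thaliana AAD10695", 991, 476, 9.0),
    Candidate("C. elegans Q11102", 1132, 938, 11.0),
    Candidate("C. elegans Z81586", 484, 460, 10.0),
    Candidate("hypothetical globular protein", 600, 50, 5.0),  # made up
]


def resembles_tf_protein(protein: Candidate,
                         min_helix_fraction: float = 0.4,
                         min_c_terminal_pi: float = 7.0) -> bool:
    """Assumed criteria: a long alpha-helix and a basic C-terminal domain."""
    helix_fraction = protein.helix_length / protein.length
    return (helix_fraction >= min_helix_fraction
            and protein.pi_c_terminal > min_c_terminal_pi)


selected = [p.name for p in CANDIDATES if resembles_tf_protein(p)]
```

In practice the filter would also have to encode the domain organization and the match between alpha-helix length and the measured central space width, which this sketch leaves out.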
Thus, our strategy allows in silico identification of structural proteins that fit the physical parameters and biological
properties of subcellular entities with a strictly defined spatial organization. The strategy is best suited to organisms
with known mutations affecting these subcellular structures. When such mutations are unknown, the entire genome must be
searched.
Acknowledgements
This work was supported by the Russian Foundation for Basic Research (projects № 99-04-48182 and 02-04-48761).
References
1. Bogdanov Yu.F., Grishaeva T.M., Dadashev S.Ya. (2002a) Gene CG17604 of Drosophila melanogaster may be a functional homolog
of yeast gene ZIP1 and mammalian gene SCP1 (SYCP1) encoding proteins of the synaptonemal complex. Russ. J. Genet. 38, 90-94.
2. Bogdanov Yu.F., Dadashev S.Ya., Grishaeva T.M. (2002b) Comparative genomics and proteomics of Drosophila, Brenner's
Nematode, and Arabidopsis. Identification of functionally similar synaptic genes and proteins. Russ. J. Genet. 38, (№ 8, in press).
3. Heyting C. (1996) Synaptonemal complex: structure and function. Curr. Opin. Cell Biol. 8, 389-396.
4. Page S.L., Hawley R.S. (2001) c(3)G encodes a Drosophila synaptonemal complex protein. Genes Dev. 15, 3130-3143.
5. Dong H., Roeder G.S. (2000) Organization of the yeast Zip1 protein within the central region of the synaptonemal complex. J. Cell
Biol. 148, 417-426.
6. Zickler D., Kleckner N. (1999) Meiotic chromosomes: integrating structure and function. Annu. Rev. Genet. 33, 663-754.