Sequence Similarities of EST Clusters

Supplementary Materials

Sequence Similarities Identified in the Parasite ESTs
To provide a first overview of gene identities in the two parasites, all available EST
clusters were translated and queried against three phylogenetically specific sequence
groups covering all coding sequences currently available in public databases (Figure S1).
In total, 53% (A. suum) and 75% (H. contortus) of EST clusters contained similarities to
known genes in other organisms. The higher percentage for H. contortus was likely because
most nematode sequences currently available were generated from clade V species, the
clade to which H. contortus belongs. The remaining clusters without similarities (47% and
25%, respectively) include novel genes that are either lineage- or species-specific.
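
The percentages quoted above follow directly from the cluster counts reported in the Figure S1 legend (5,303/9,947 for A. suum and 3,792/5,058 for H. contortus). A minimal sketch of that arithmetic, where the dictionary layout and function name are illustrative and not part of the original analysis:

```python
# Counts as reported in the Figure S1 legend:
# (clusters with similarities, total clusters).
cluster_counts = {
    "A. suum": (5303, 9947),
    "H. contortus": (3792, 5058),
}

def similarity_summary(with_hits, total):
    """Return (percent with similarities, percent without) as whole percents."""
    pct_with = round(100 * with_hits / total)
    pct_without = round(100 * (total - with_hits) / total)
    return pct_with, pct_without

summary = {
    species: similarity_summary(with_hits, total)
    for species, (with_hits, total) in cluster_counts.items()
}

for species, (pct_with, pct_without) in summary.items():
    print(f"{species}: {pct_with}% with similarities, {pct_without}% without")
```

Rounding to whole percents reproduces the 53%/47% and 75%/25% splits given in the text.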

Distributions of the identified sequence similarities to the different sequence groups
were nearly identical in the two parasites (Figure S1; Figure S2). About 60% of all
homologous EST clusters (i.e., EST clusters found to be similar to known sequences) had
putative homologs in all three sequence groups, implying that they are likely to be
involved in common molecular and cellular processes conserved across metazoans. In
contrast, 14-15% of the homologous EST clusters contained similarities only to coding
sequences from Caenorhabditis spp. and other nematodes, making them candidates for
nematode-specific genes. Furthermore, small subsets of genes (31 A. suum and 5 H.
contortus EST clusters) showed similarities restricted to non-nematode coding sequences,
suggesting species-specific gene acquisition in their genomes, gene-loss events or
accelerated changes in other nematodes, or contamination by host genes. Incomplete
genomes and the lack of representation for many nematode species could also contribute to
this pattern. This group had matches to enzymes such as cobyric acid synthase
(AS15280.cl) and alpha-mannosidase (AS16547.cl), an amino acid transporter (AS09071.cl),
and an ion transport protein (AS11163.cl). Finally, 21% and 23% of A. suum and H.
contortus EST clusters, respectively, were similar only to non-Caenorhabditis nematode
sequences, the majority of which (> 90%) originated from parasitic nematodes. In fact,
among the genes in this category, only 54 A. suum and 24 H. contortus EST clusters showed
similarities to any of the ~14,000 EST sequences from Pristionchus pacificus and Zeldia
punctata, the only other free-living nematodes with sequence information available (data
not shown). This group is therefore interesting because it may contain broadly conserved
genes that are important to parasitism.
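
The grouping described above amounts to sorting each EST cluster by the set of sequence groups in which it found a match. The sketch below illustrates one way to express that bookkeeping. The category rules, group labels, and toy cluster IDs are assumptions made for illustration; the source does not specify how combinations such as "Caenorhabditis only" were assigned.

```python
from collections import Counter

# Labels for the three sequence groups named in the Figure S1 legend.
CAENO, OTHER_NEM, NON_NEM = "Caenorhabditis", "Other Nematoda", "Non-Nematoda"

def classify(groups_hit):
    """Map the set of groups a cluster matched to a category (assumed rules)."""
    hit = frozenset(groups_hit)
    if not hit:
        return "no similarity"
    if hit == {CAENO, OTHER_NEM, NON_NEM}:
        return "all three groups"
    if hit == {NON_NEM}:
        return "non-nematode only"
    if hit == {OTHER_NEM}:
        return "non-Caenorhabditis nematodes only"
    if NON_NEM not in hit:
        return "nematode-restricted (includes Caenorhabditis)"
    return "other combination"

# Toy input, invented for illustration: cluster ID -> groups with a match.
toy_hits = {
    "toy_1": {CAENO, OTHER_NEM, NON_NEM},
    "toy_2": {CAENO, OTHER_NEM},
    "toy_3": {OTHER_NEM},
    "toy_4": {NON_NEM},
    "toy_5": set(),
    "toy_6": {CAENO, OTHER_NEM, NON_NEM},
}

tally = Counter(classify(groups) for groups in toy_hits.values())
for category, n in tally.items():
    print(f"{category}: {n}")
```

Dividing each tally by the number of clusters with at least one match would give proportions comparable in form to the percentages reported in the text.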

Gene Ontology Analysis on the IntFam Groups Other Than IntFam-241
As for the IntFam-241 group, statistically enriched gene ontologies were identified for
the non-IntFam-241 groups (Table S6). These ontologies may include protein functions
specific to different nematode lineages and species. However, because the protein
families were built on partial transcriptomes, some of them may become new members of the
conserved “core” intestinal transcriptome, which contains homologous genes from all three
nematodes, when more intestinal sequences become available. This makes it difficult to
identify true lineage- or species-specific characteristics. For example, IntFam-47 (the
47 protein families containing sequences from only A. suum and H. contortus; Figure 4)
had 37 genes identified as electron transporters that are likely involved in energy
generation (GO:0006118) (Table S6). All of them were annotated according to their strong
similarities to essential members of the canonical electron transport-coupled ATP
synthesis pathway, such as NADH dehydrogenase subunit I or cytochrome c oxidase subunit
III (data not shown). Even though these IntFam-47 families do not currently contain C.
elegans intestinal members, it is difficult to imagine that C. elegans intestinal cells
lack those energy generation components. In fact, orthologous genes for all of them have
been identified and annotated in the C. elegans genome (www.wormbase.org), suggesting
that this result was probably caused by the incompleteness of the intestinal
transcriptome in C. elegans.
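
The text does not state which statistical test was used to call a gene ontology term "enriched". A common choice for this kind of analysis is a one-sided hypergeometric test, sketched below purely as an illustration of the idea. The function name and parameters are hypothetical and this is not necessarily the method behind Table S6.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, observed):
    """Upper-tail hypergeometric probability P(X >= observed).

    population: genes in the background set
    annotated:  background genes carrying the GO term
    sample:     genes in the group being tested
    observed:   genes in the group carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Toy example with invented numbers: 10 background genes, 5 annotated,
# a group of 5 genes in which all 5 carry the term.
p_value = hypergeom_enrichment_p(population=10, annotated=5, sample=5, observed=5)
print(f"p = {p_value:.5f}")
```

A small p-value means the term appears in the group more often than expected by chance given the background. In practice, p-values across many terms would also need a multiple-testing correction.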

Supporting Figure Legends

Figure S1. Distribution of Sequence Similarities Identified in A. suum and H. contortus
EST Clusters. The three phylogenetically specific sequence groups used to identify
sequence similarities of the intestinal genes were: i) Caenorhabditis spp., amino acid
sequences from the complete genomes of C. elegans, C. briggsae, and C. remanei; ii) Other
Nematoda, non-Caenorhabditis nematode nucleic acid sequences, excluding those from A.
suum or H. contortus when sequences from A. suum or H. contortus, respectively, were
queried; and iii) Non-Nematoda, non-nematode amino acid sequences from the non-redundant
protein database NR. In total, 53% (5,303/9,947) of A. suum and 75% (3,792/5,058) of H.
contortus EST clusters contained primary sequence similarities to known genes from other
species, but the distributions of the identified matches across the species groups were
similar in the two parasites.

Figure S2. Homologous Pairs between the Intestine and Gonad Gene Groups from A. suum and
C. elegans. A significantly larger number of genes in the A. suum intestine group had
homologous counterparts in the C. elegans intestine group than in the C. elegans gonad
group at a BLAST bit-score cutoff of either 50 or 100, indicating that intestinal
expression of homologous genes tends to be maintained across nematodes. However, the
number of homologous pairs detected between the two gonad groups did not differ from that
between the gonad and intestine groups.
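
The comparison in Figure S2 rests on counting genes that have a homologous counterpart at a given BLAST bit-score cutoff. A minimal sketch of that counting step follows. The hit tuples are invented toy data, and counting per query gene with a "greater than or equal to" threshold is an assumption, since the legend does not spell out either detail.

```python
# Toy BLAST results, invented for illustration: (query gene, subject gene, bit score).
toy_blast_hits = [
    ("q1", "s1", 120.0),
    ("q1", "s2", 60.0),
    ("q2", "s3", 55.0),
    ("q3", "s4", 40.0),
]

def queries_with_homolog(hits, cutoff):
    """Return the set of query genes with at least one hit scoring >= cutoff."""
    return {query for query, _subject, score in hits if score >= cutoff}

# Count query genes with a homolog at the two cutoffs named in the legend.
homolog_counts = {
    cutoff: len(queries_with_homolog(toy_blast_hits, cutoff))
    for cutoff in (50, 100)
}

for cutoff, n in homolog_counts.items():
    print(f"bit-score cutoff {cutoff}: {n} query genes with a homolog")
```

Running the same count against two different subject groups (for example, intestine versus gonad) gives the pair of numbers that a significance test would then compare.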