Supplemental Text 1. Sequence annotation of 10 BAC clones from wheat chromosome
3B: Gene prediction, description and synteny with rice.
We conducted gene prediction analysis on the remaining 18.5% of the sequence, which consists
of non-TE and non-repeated DNA, using different search programs (see Supplemental Method 1
for the detailed annotation method). Genes of known or unknown function, and putative genes,
were defined based on predictions and on the existence of rice or other Triticeae homologs.
Hypothetical genes were identified based on prediction programs only. Pseudogenes
were not well predicted, and frameshifts needed to be introduced within the CDS structure to
better fit a putative function based on BLASTX (mainly with rice). Truncated
pseudogenes (genes disrupted by a large insertion or deletion) and highly degenerated CDS
sequences were considered gene-relics.
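The category definitions above can be read as a small decision procedure. The sketch below is one illustrative reading only, not the annotation pipeline used for these BAC clones: the field names and the order in which the checks are applied are assumptions introduced here, and the text does not say how a "gene" is distinguished from a "putative gene", so the two are kept together.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of the evidence for one candidate sequence."""
    predicted: bool                         # found by a gene prediction program
    has_homolog: bool                       # rice or other Triticeae homolog exists
    needs_frameshift: bool = False          # CDS must be frameshifted to fit a BLASTX hit
    truncated_or_degenerated: bool = False  # large indel or highly degenerated CDS


def classify(c: Candidate) -> str:
    """Assign a category following the definitions in the text (assumed check order)."""
    if c.truncated_or_degenerated:
        return "gene-relic"
    if c.needs_frameshift:
        return "pseudogene"
    if c.predicted and c.has_homolog:
        return "gene or putative gene"
    if c.predicted:
        return "hypothetical gene"
    return "no gene sequence information"


print(classify(Candidate(predicted=True, has_homolog=False)))
```

Checking the most damaged categories first is a design choice of this sketch: a truncated sequence may still produce a partial prediction or a homology hit, and it should not be counted as an intact gene.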
Combined, all these types of gene sequence information (GSI) account for only
1.0% of the sequence and are present in seven BAC clones (one or two genes per clone),
while the remaining three BAC clones (TA3B95C9, TA3B95G2, TA3B63N2) contain no
genes (indicated in Figure 1A and detailed in Supplemental Text 1, Supplemental Table 3
and Supplemental Table 4).
Six genes (of known or unknown function) and two putative genes were detected on five of
the BAC clones (indicated in Figure 1A and detailed in Supplemental Table 3): BAC
clone TA3B63B13 contains two genes of known function, one of which was
incompletely sequenced (located at the end of the BAC clone); BAC clone TA3B81B7,
one putative gene; BAC clone TA3B95F5, one putative gene and two other genes of unknown
function; BAC clone TA3B63C11, one known gene; and BAC clone TA3B63E4, one
incompletely sequenced gene of unknown function.
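As a quick consistency check, the per-clone counts listed above can be tallied against the stated totals. The snippet below is a minimal sketch: the counts are transcribed from the paragraph, while the dictionary layout and variable names are illustrative assumptions.

```python
# Counts transcribed from the paragraph above:
# "gene" covers genes of known or unknown function.
genes_per_bac = {
    "TA3B63B13": {"gene": 2, "putative": 0},
    "TA3B81B7":  {"gene": 0, "putative": 1},
    "TA3B95F5":  {"gene": 2, "putative": 1},
    "TA3B63C11": {"gene": 1, "putative": 0},
    "TA3B63E4":  {"gene": 1, "putative": 0},
}

total_genes = sum(counts["gene"] for counts in genes_per_bac.values())
total_putative = sum(counts["putative"] for counts in genes_per_bac.values())

print(f"{total_genes} genes and {total_putative} putative genes "
      f"on {len(genes_per_bac)} BAC clones")
```

The totals agree with the six genes and two putative genes on five BAC clones reported in the text.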
Charles et al. Supplemental_Text-1
In addition to genes (of known or unknown function) and putative genes, the search for
sequence homologies between the whole 18.5% of non-TE and non-repeated DNA
and the rice genome sequence (http://www.tigr.org/tdb/e2k1/osa1/) allowed us
to detect several sequences conserved between wheat and rice. In summary, one
pseudogene and four gene-relics were detected in the BAC clones TA3B54F7
(one pseudogene), TA3B63B7 (two gene-relics), TA3B81B7 (one gene-relic) and
TA3B63C11 (one gene-relic) (Supplemental Table 3). They could not be predicted with the
CDS prediction program (FGENESH), as they show frameshifts, stop mutations, TE
insertions and/or large indels, and are probably no longer functional (Supplemental Table
2). Three of these five truncated genes (pseudogenes and gene-relics) resulted from
TE insertions (Supplemental Table 3).
The wheat chromosome 3B is homologous to the rice chromosome 1. For orthology and
synteny analysis, we considered the rice chromosome 1 and its duplicated segments that
are found on other chromosomes (GUYOT et al. 2004 and TIGR site
http://www.tigr.org/tdb/e2k1/osa1/segmental_dup/). Three BAC clones (TA3B63B13,
TA3B81B7, TA3B95F5) have one or two orthologous rice genes that map to rice
chromosome 1, and their synteny was therefore considered confirmed
(Table 1). Notably, the two genes of known function, separated by
88,114 bp on BAC clone TA3B63B13 (Figure 1A), have their respective orthologs
separated by 22,816 bp on rice chromosome 1. Thus, this intergenic region is
approximately four-fold larger in wheat than in rice, a difference that has arisen since
their divergence from a common ancestor. Three other BAC clones (TA3B54F7, TA3B63C11 and TA3B63E4) also have
homologs on rice chromosome 1, but the best match was observed with genes mapped on
other rice chromosomes (Supplemental Table 3). For its putative gene and pseudogene,
BAC clone TA3B63B7 shows homologies with rice genes located on rice chromosomes
other than chromosome 1 (Supplemental Table 3).
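The size difference reported for the intergenic region on BAC clone TA3B63B13 follows directly from the two distances given above. A minimal sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Distances between the two genes of known function, as given in the text.
wheat_intergenic_bp = 88_114   # on wheat BAC clone TA3B63B13
rice_intergenic_bp = 22_816    # between the orthologs on rice chromosome 1

ratio = wheat_intergenic_bp / rice_intergenic_bp
extra_bp = wheat_intergenic_bp - rice_intergenic_bp

print(f"wheat/rice ratio: {ratio:.2f}")   # prints "wheat/rice ratio: 3.86"
print(f"additional sequence in wheat: {extra_bp} bp")
```

The ratio of about 3.86 is what the text rounds to a four-fold difference.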
No GSI or orthologous rice regions could be assigned to the three remaining BAC clones
(TA3B95C9, TA3B95G2, TA3B63N2).
Finally, 10 hypothetical genes were identified, based on gene prediction only, in the BAC
clones TA3B54F7 (one), TA3B63B13 (two), TA3B81B7 (one), TA3B95F5 (four) and
TA3B63C11 (two) (Supplemental Table 4).
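The per-clone counts of hypothetical genes can be checked against the stated total in the same way. This is a sketch under the assumption that the counts transcribed below are complete; the variable names are illustrative.

```python
# Hypothetical genes per BAC clone, transcribed from the text.
hypothetical_per_bac = {
    "TA3B54F7": 1,
    "TA3B63B13": 2,
    "TA3B81B7": 1,
    "TA3B95F5": 4,
    "TA3B63C11": 2,
}

# BAC clones reported in the text as carrying no gene sequence information.
bacs_without_gsi = {"TA3B95C9", "TA3B95G2", "TA3B63N2"}

total_hypothetical = sum(hypothetical_per_bac.values())
overlap = bacs_without_gsi & set(hypothetical_per_bac)

print(f"{total_hypothetical} hypothetical genes on {len(hypothetical_per_bac)} BAC clones")
print(f"clones listed both with and without predictions: {sorted(overlap)}")
```

The sum matches the 10 hypothetical genes reported, and none of them falls on the three clones described as containing no genes.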
Sources:
GUYOT, R., and B. KELLER, 2004 Ancestral genome duplication in rice. Genome 47:
610–614.
JURKA, J., 2000 Repbase update: a database and an electronic journal of repetitive
elements. Trends Genet. 16: 418–420.
JURKA, J., P. KLONOWSKI, V. DAGMAN and P. PELTON, 1996 CENSOR: a program for
identification and elimination of repetitive elements from DNA sequences. Comput.
Chem. 20: 119–121.
MCCARTHY, E. M., and J. F. MCDONALD, 2003 LTR_STRUC: a novel search and
identification program for LTR retrotransposons. Bioinformatics 19: 362–367.
SONNHAMMER, E. L., and R. DURBIN, 1995 A dot-matrix program with dynamic
threshold control suited for genomic DNA and protein sequence analysis. Gene 167:
GC1–GC10.