Finding the wheat homologues of genes from model organisms
This tutorial introduces how to find the wheat homologues of genes from other model or non-model
plant species. This is especially useful in translational research or comparative genomic studies,
where the gene of interest may have been well characterised in a model species (such as
Arabidopsis) and the aim is to study and characterise the function of its wheat homologues.
a) Important Considerations
It is important to note that the genetic control of traits can vary between plant species. As such,
the genetic architecture underpinning traits in model species might not be representative of other
plant species. This implies that genes found in model species might not be present in wheat and
vice versa. Also, due to gene loss or duplication events, gene family size and/or gene copy number
can vary between species.
It is also important to bear in mind the ploidy level and homoeologous relationships of the wheat
species you are interested in (see “Introduction to Wheat Growth”). For instance, for a single
Arabidopsis gene, you should expect to find one, two or three gene models in diploid (2n = 2x;
einkorn), tetraploid (2n = 4x; durum wheat) or hexaploid (2n = 6x; bread wheat) wheat,
respectively. In this tutorial, however, we will focus on hexaploid wheat.
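The copy-number expectation above can be written down as a small lookup. This is only an illustrative sketch: the dictionary and function names are hypothetical, and the subgenome letters (A for einkorn, A and B for durum wheat, A, B and D for bread wheat) are the standard designations rather than something taken from this tutorial.

```python
# Expected homoeologous copies of a single-copy gene, by wheat ploidy level.
# The names below are illustrative and not part of any Ensembl tool.
SUBGENOMES = {
    "diploid": ("A",),             # einkorn
    "tetraploid": ("A", "B"),      # durum wheat
    "hexaploid": ("A", "B", "D"),  # bread wheat
}


def expected_copies(ploidy):
    """Number of gene models expected for a single-copy gene."""
    return len(SUBGENOMES[ploidy])


def missing_subgenomes(ploidy, found):
    """Subgenomes for which no homologue has been found yet."""
    found = set(found)
    return [g for g in SUBGENOMES[ploidy] if g not in found]


print(expected_copies("hexaploid"))
print(missing_subgenomes("hexaploid", ["A", "D"]))
```

A non-empty result from `missing_subgenomes` is not necessarily an error: as noted above, gene loss or duplication can change the copy number.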
b) Finding your wheat homologue through Ensembl Plants
Ensembl Plants hosts the genomes of most sequenced model and non-model plant species. This
makes it a convenient portal for comparing and linking different plant genomes (see “Ensembl
plants primer” for a quick introduction on how to use Ensembl Plants). We will be using the gene
tree and orthologue features of Ensembl. However, other genome database portals (e.g. URGI,
CerealsDB, Phytozome or even NCBI) can also be used, along with databases specific to your
model plant (e.g. TAIR for Arabidopsis).
1. To get started, visit the Ensembl Plants website at http://plants.ensembl.org. Figure 1 below
shows the homepage of the Ensembl Plants website, which has many useful features for
genomic analysis.
a. NB. To identify TILLING mutants in your gene of interest, you will need to
obtain the IWGSC CSS gene ID or scaffold (see “Selecting TILLING Mutants”). To
obtain this, you will need to use the archive site of Ensembl Plants
(http://archive.plants.ensembl.org/index.html).
2. You will need to first access the Ensembl page of your query gene of interest (GOI). There
are two ways to do this:
Finding the wheat homologues of genes from model organisms
www.wheat-training.com
a. For well-characterised genes with designated names and/or known chromosome coordinates,
use the search option (see box 1 in Figure 1): select the species of your GOI and enter the
gene name or chromosome coordinates. Alternatively, you can first select the species of your
GOI (e.g. Arabidopsis) from the list of popular genomes or from the genome dropdown list
(box 2 in Figure 1), and then search for the gene by name or chromosome position. Select the
Gene ID link in the search results to go to the gene summary page.
Figure 1. The Ensembl Plants home page.
Ensembl Plants is a one-stop portal for accessing genomic information for most
plant species with sequenced genomes.
b. For species with less well-annotated genomes, performing a BLAST search with the
sequence of your GOI is a good entry route. To do this, use the BLAST option on the
Ensembl Plants main menu bar to access the BLAST search page (Figure 2; see
“Ensembl plants primer” for information on BLAST searches in Ensembl). Enter the
amino acid or nucleotide sequence of your GOI and search against the DNA or protein
databases of your query species. From the BLAST results, select the Gene ID link of
the hit that best matches your GOI to access the Ensembl Plants page of your query gene.
Figure 2: BLAST page on Ensembl.
The Ensembl BLAST page is similar to any other BLAST page, with options to
enter or upload a query sequence (a), select the database (b) and BLAST
algorithm (c), and adjust the BLAST stringency (d).
3. The gene summary page has four view tabs: “Genome”, “Location”, “Gene” and “Transcript”
(Figure 3). If you are not already under the “Gene” tab, select it to explore the
gene-based features of your gene.
Figure 3: Ensembl Gene Summary Page.
(a) The Gene summary page in Ensembl has four view tabs just below the menu bar, which
can be used to view different features of the GOI. (b) The “Plant Compara” tool set in the
gene-based display allows for comparative analyses.
4. The “Plant Compara” section of the gene-based display box (b in Figure 3) contains tools for
comparative genomic analysis between and within plant species. These include tools for
genomic alignment, phylogenetic analysis (Gene Tree), paralogue search within a species
(Paralogues) and orthologue search between species (Orthologues). We will be using the
Gene Tree and Orthologue tools to search for wheat homologues.
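The orthologue data behind the Plant Compara tools can, in principle, also be requested programmatically. The sketch below is not part of the original tutorial. It assumes that the public Ensembl REST server (https://rest.ensembl.org) exposes a `homology/id/:species/:id` endpoint accepting `compara`, `type` and `target_species` parameters, and that the JSON reply nests results as `data -> homologies -> target -> id`. Check both assumptions against the current REST documentation. The function names are hypothetical, and AT1G65480 (Arabidopsis FT) is used purely as an example ID.

```python
from urllib.parse import urlencode

# Assumption: the public Ensembl REST server also serves Ensembl Plants data.
REST_SERVER = "https://rest.ensembl.org"


def homology_url(species, gene_id, target_species="triticum_aestivum"):
    """Build an orthologue query URL (endpoint and parameters are assumptions)."""
    params = urlencode({
        "compara": "plants",
        "type": "orthologues",
        "target_species": target_species,
        "content-type": "application/json",
    })
    return f"{REST_SERVER}/homology/id/{species}/{gene_id}?{params}"


def target_ids(response):
    """Collect target gene IDs from a decoded JSON reply.

    Assumes the layout {"data": [{"homologies": [{"target": {"id": ...}}]}]}.
    """
    return [
        homology["target"]["id"]
        for entry in response.get("data", [])
        for homology in entry.get("homologies", [])
    ]


print(homology_url("arabidopsis_thaliana", "AT1G65480"))
```

The URL can then be fetched with any HTTP client (for example `urllib.request.urlopen`) and the decoded JSON passed to `target_ids`. The web interface described in the following sections remains the reference route.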
Using the Gene Tree
The Gene Tree displays a phylogenetic tree of the homologues of any GOI across different plant
species. To use the Gene Tree tool, follow the steps below:
1. Click on the Gene Tree link under the Plant Compara tools on the summary page of your
GOI to access the gene tree (Figure 4).
The Gene Tree view is composed of a phylogenetic tree and a schematic view of the protein
alignment of homologous genes. The alignment is particularly useful for comparing the
conservation of gene structure across species. Your gene of interest is highlighted in
red.
Figure 4: The Gene Tree view on Ensembl Plants
The gene tree presents a comparative view of homologues across different plant species
and features a phylogenetic tree and a protein alignment. The phylogenetic tree is
made up of branches and nodes. Nodes represent speciation (blue), duplication (red) or
ambiguous duplication (cyan) events, and branches terminate in a gene. The triangles
on the gene tree represent collapsed portions of the tree (subtrees). For a fuller
description of the elements of the tree and alignment, see the legend at the bottom of the
page.
2. Starting from the gene node of your GOI, move gradually outward towards the base of the
tree and expand the collapsed subtree nodes (triangles) by clicking on each node and selecting
“expanding subtree”.
The subtrees are structured by taxonomic rank. Wheat genes would be expected to
be in the Triticeae, Pooideae, Poaceae, Commelinids, Magnoliophyta and Embryophyta
nodes. Alternatively, you can expand the whole tree, but this will make it more difficult to
trace the wheat genes closest to your GOI.
3. As you expand the subtrees above, identify the Triticum aestivum genes closest to your GOI
(highlighted box in Figure 5).
For query genes with homologues in bread wheat, the tree should contain gene nodes
corresponding to the bread wheat homologues on the A, B and D genomes. However, please
note that due to gene copy number differences between species (discussed previously), you
might find more or fewer wheat homologues than you expect.
Figure 5: Finding homologous wheat genes using the Gene Tree
The tree shows the phylogenetic relationship between the gene of interest (highlighted in red)
and its homologous wheat genes (in red box).
4. Click on the gene nodes of the identified wheat homologues and select the gene ID link in
the pop-up box to get to the Ensembl summary page of each identified wheat homologue.
5. To further confirm the homology of the identified wheat genes, you can perform a reciprocal
BLAST against the genome of your GOI's species, using the identified wheat homologues as
queries. For genes with a true homologous relationship, your GOI should be the most
significant hit in the BLAST results.
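The reciprocal BLAST check in step 5 boils down to a simple comparison, sketched below. This is an illustration rather than an Ensembl feature: it assumes you have already extracted (subject ID, bit score) pairs from each BLAST result, and all function names, gene IDs and scores shown are hypothetical.

```python
def best_hit(hits):
    """Return the subject ID with the highest bit score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def is_reciprocal_best_hit(goi_id, wheat_id, forward_hits, reverse_hits):
    """True if wheat_id is the GOI's best wheat hit and the GOI is wheat_id's best hit back."""
    return best_hit(forward_hits) == wheat_id and best_hit(reverse_hits) == goi_id


# Hypothetical (subject ID, bit score) pairs, for illustration only.
forward = [("wheat_gene_1", 410.0), ("wheat_gene_2", 395.5)]  # GOI vs wheat
reverse = [("GOI", 402.0), ("other_gene", 120.3)]             # wheat_gene_1 vs GOI species

print(is_reciprocal_best_hit("GOI", "wheat_gene_1", forward, reverse))
```

In hexaploid wheat the forward search can only rank one homoeologue first, so for the A, B and D copies it is the reverse direction (does each wheat gene return the GOI as its top hit?) that carries most of the information.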
Using the orthologue link
The Orthologues link under the Plant Compara tool set is an alternative route to finding the
homologous wheat genes for your GOI. It presents a list of all the homologues of a particular
gene in other species. The advantage of using the Orthologues link is that it avoids the visual
complexity of the gene tree. Note that the list presented in the orthologue view is extracted from
the gene tree data. Follow the steps below to use the orthologue feature of Ensembl.
1. Click on the “Orthologue” link under the Plant Compara tools of the gene-based display.
This will present a summary table at the top, in which species with homology to your GOI are
grouped into clades, followed by a list of all the homologues identified across plant species.
The orthologue list is arranged under the following fields:
Species: Plants with a gene homologous to your gene of interest
Type: This can be one of
  • 1-to-1 orthologues: only one copy is found in each species
  • 1-to-many orthologues: one gene in one species is orthologous to multiple genes in
    another species
  • Many-to-many orthologues: multiple orthologues are found in both species
dN/dS: A measure of the selection pressure on a gene (the ratio of non-synonymous to
synonymous substitution rates)
Ensembl ID & Gene Name: The Ensembl ID contains a direct link to the gene summary page of
the homologue
Compare: This contains links for one-to-one comparison (alignment and phylogeny) between
each homologue and the gene of interest
Location: The chromosome location of the homologue. Care should be taken with the
chromosome positions of the wheat homologues found. At the time of writing this tutorial, this
chromosome position does not represent the actual physical position of the gene but an
estimated position based on genetic mapping information.
Target % ID: Percentage of the homologous gene sequence matching the query gene of interest
Query % ID: Percentage of the query gene of interest sequence matching the homologous gene
sequence.
2. Scroll down the list until you find Triticum aestivum genes.
3. In cases where many Triticum aestivum genes are found, use the type (1-to-1, 1-to-many or
many-to-many), Target % ID and Query % ID information, as well as the Compare tool, to
determine the wheat gene with the closest homology to your gene of interest. The true wheat
homologue should have the highest Target % ID and Query % ID and should also be closest to
your GOI on the phylogenetic tree.
4. As with the gene tree, consider doing a reciprocal BLAST against the genome of your GOI's
species, using the identified wheat homologues as queries.
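When the orthologue list contains several Triticum aestivum genes, the comparison in step 3 can be made explicit by sorting the candidates. The sketch below shows one possible ordering, not an Ensembl rule: it ranks by the sum of Target % ID and Query % ID and uses the orthologue type only as a tie-breaker. The class, the function and all gene IDs and values are hypothetical, and the values would need to be copied from the orthologue table by hand.

```python
from dataclasses import dataclass

# Lower rank = simpler orthology relationship; used only to break ties.
TYPE_RANK = {"1-to-1": 0, "1-to-many": 1, "many-to-many": 2}


@dataclass
class Candidate:
    gene_id: str
    orth_type: str
    target_pct_id: float
    query_pct_id: float


def rank_candidates(candidates):
    """Sort candidates from most to least similar to the gene of interest."""
    return sorted(
        candidates,
        key=lambda c: (
            -(c.target_pct_id + c.query_pct_id),
            TYPE_RANK.get(c.orth_type, len(TYPE_RANK)),
        ),
    )


# Hypothetical values, for illustration only.
candidates = [
    Candidate("wheat_gene_A", "1-to-many", 62.0, 58.0),
    Candidate("wheat_gene_B", "1-to-many", 71.5, 69.0),
    Candidate("wheat_gene_C", "many-to-many", 55.0, 60.0),
]
for c in rank_candidates(candidates):
    print(c.gene_id, c.target_pct_id, c.query_pct_id)
```

The top-ranked candidates should still be checked against the gene tree and by reciprocal BLAST, as described in steps 3 and 4.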
c) Finding homologous wheat genes for genomes not hosted on Ensembl Plants
Finding homologous wheat genes for species whose genomes are not hosted on Ensembl might not
be as straightforward as described above. In such instances, one entry route to using Ensembl
is to BLAST the sequence of your GOI against all the genome databases available on Ensembl.
This will enable you to find the closest “intermediate” homologue, which you can then use as
the query to find the wheat homologue by following the steps described in the sections above.
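Choosing the “intermediate” homologue amounts to picking the best hit across all the species searched. A minimal sketch follows, assuming the BLAST results have already been reduced to (gene ID, bit score) pairs per species. The function name, gene IDs and scores are hypothetical.

```python
def pick_intermediate(hits_by_species):
    """Return (species, gene_id, bit_score) for the best hit across all species, or None."""
    all_hits = [
        (species, gene_id, score)
        for species, hits in hits_by_species.items()
        for gene_id, score in hits
    ]
    return max(all_hits, key=lambda hit: hit[2]) if all_hits else None


# Hypothetical BLAST summaries, for illustration only.
hits = {
    "brachypodium_distachyon": [("brachy_gene_1", 350.0)],
    "oryza_sativa": [("rice_gene_1", 310.5), ("rice_gene_2", 120.0)],
}
print(pick_intermediate(hits))
```

The selected gene then takes the place of the GOI in the Gene Tree or Orthologue steps above. If the best hit is already a Triticum aestivum gene, no intermediate is needed.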