Comparative Genomics
Overview of the Talk
• Comparing Genomes
• Homologies & Families
• Sequence Alignments
Comparative genomics
• Protein sequence (Compara & Pan-Compara): protein trees, homologues
• Nucleotide sequence (Compara): genome alignments, syntenic regions
How Does Ensembl Predict Homology?
• Uses all the species
• Uses a representative protein (the longest) for every gene
• Builds a gene tree
• EnsemblCompara GeneTrees: Analysis of complete, duplication aware phylogenetic trees in vertebrates. Vilella AJ, Severin J, Ureta-Vidal A, Durbin R, Heng L, Birney E. Genome Res. 2008 Nov 24.
Steps in Homology Prediction
1. Load the longest protein for every gene from all species (e.g. ..MEDPATA…)
2. WU Blastp + SmithWaterman: longest translation of every gene against every other (Blast Reciprocal Hit / Blast Score Ratio)
3. Protein clustering, build multiple alignments (MCoffee)
4. From each alignment, build a gene tree (TreeBest)
5. Reconcile each gene tree with the species tree to determine internal nodes (TreeBest)
6. Orthologues, paralogues…
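The two linking criteria named in the BLAST step can be illustrated on toy data. This is a minimal sketch, not the Ensembl pipeline: the score table, the gene labels (hsA, mmA, mmB) and the normalisation used for the score ratio (pair score divided by the larger of the two self-scores) are assumptions made for illustration, and the real pipeline considers best hits per species pair.

```python
def best_hits(scores):
    """Return each query's highest-scoring target, ignoring self-hits."""
    best = {}
    for (query, target), score in scores.items():
        if query == target:
            continue
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(scores):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    best = best_hits(scores)
    return {
        tuple(sorted((a, b)))
        for a, b in best.items()
        if best.get(b) == a
    }


def blast_score_ratio(scores, a, b):
    """Pair score normalised by the larger self-score (assumed definition)."""
    return scores[(a, b)] / max(scores[(a, a)], scores[(b, b)])


# Hypothetical bit scores for one human protein and two mouse proteins.
toy_scores = {
    ("hsA", "hsA"): 200, ("mmA", "mmA"): 180, ("mmB", "mmB"): 190,
    ("hsA", "mmA"): 150, ("mmA", "hsA"): 150,
    ("hsA", "mmB"): 60, ("mmB", "hsA"): 60,
    ("mmA", "mmB"): 50, ("mmB", "mmA"): 50,
}

print(reciprocal_best_hits(toy_scores))
print(blast_score_ratio(toy_scores, "hsA", "mmB"))
```

Here hsA and mmA pick each other as best hit, so they form a reciprocal pair, while mmB is linked to hsA only through a weaker, one-directional hit whose ratio can be compared against a cutoff.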
Viewing Trees in Ensembl
Types of Homologues
• Orthologues: any pairwise gene relation where the ancestor node is a speciation event
• Paralogues: any pairwise gene relation where the ancestor node is a duplication event
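These two definitions translate directly into a lookup on a reconciled gene tree: find the most recent common ancestor of the two genes and read its event type. The sketch below assumes a tiny hand-built tree; the `Node` class, the leaf labels and the tree shape are illustrative only and are not taken from Ensembl data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    event: Optional[str] = None  # "speciation" or "duplication" on internal nodes
    children: List["Node"] = field(default_factory=list)


def path_to(root, leaf_name):
    """Return the list of nodes from root down to the named leaf, or []."""
    if not root.children:
        return [root] if root.name == leaf_name else []
    for child in root.children:
        sub_path = path_to(child, leaf_name)
        if sub_path:
            return [root] + sub_path
    return []


def classify_pair(root, gene_a, gene_b):
    """Label a gene pair by the event at its most recent common ancestor."""
    if gene_a == gene_b:
        raise ValueError("need two different genes")
    path_a, path_b = path_to(root, gene_a), path_to(root, gene_b)
    if not path_a or not path_b:
        raise KeyError("gene not found in tree")
    ancestor = None
    for node_a, node_b in zip(path_a, path_b):
        if node_a is not node_b:
            break
        ancestor = node_a
    return "orthologue" if ancestor.event == "speciation" else "paralogue"


# Toy tree: one human gene, and two mouse genes that arose by duplication.
toy_tree = Node("root", "speciation", [
    Node("human_INS"),
    Node("mouse_dup", "duplication", [Node("mouse_Ins1"), Node("mouse_Ins2")]),
])

print(classify_pair(toy_tree, "human_INS", "mouse_Ins1"))
print(classify_pair(toy_tree, "mouse_Ins1", "mouse_Ins2"))
```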
The Gene Tree for INS (insulin precursor)
[Figure: gene tree for INS. A blue square is a speciation event (orthologues); a red square is a duplication event (paralogues).]
Orthologue Types
What is '1 to 1'?
What is '1 to many'?
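One way to make these categories concrete is to count, for each orthologue pair between two species, how many partners each gene has in the other species. The sketch below assumes the pairs are already known; the gene labels are hypothetical, and the third label, "many to many", is added here for completeness and does not appear on the slide.

```python
from collections import defaultdict


def orthologue_types(pairs):
    """Label each (gene_a, gene_b) orthologue pair by partner counts."""
    partners_in_b = defaultdict(set)  # species A gene -> its species B partners
    partners_in_a = defaultdict(set)  # species B gene -> its species A partners
    for gene_a, gene_b in pairs:
        partners_in_b[gene_a].add(gene_b)
        partners_in_a[gene_b].add(gene_a)

    types = {}
    for gene_a, gene_b in pairs:
        n_b = len(partners_in_b[gene_a])
        n_a = len(partners_in_a[gene_b])
        if n_a == 1 and n_b == 1:
            types[(gene_a, gene_b)] = "1 to 1"
        elif n_a == 1 or n_b == 1:
            types[(gene_a, gene_b)] = "1 to many"
        else:
            types[(gene_a, gene_b)] = "many to many"
    return types


# Hypothetical pairs: G1 has a single dog partner, G2 has two.
toy_pairs = [
    ("human_G1", "dog_G1"),
    ("human_G2", "dog_G2a"),
    ("human_G2", "dog_G2b"),
]

for pair, label in orthologue_types(toy_pairs).items():
    print(pair, label)
```

A '1 to many' relation therefore arises when a duplication on one lineage after the speciation leaves one gene in the first species facing several in the second.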
Quick exercise
MYO6 is a myosin that has been shown (when mutated) to be associated with deafness.
1. Does human MYO6 have a homologue in dog?
2. If so, in what location (chromosome and base pairs) is the dog homologue found?
3. Can you find the cDNA alignment between the human and dog homologues?
Pan-taxonomic compara
Anolis carolinensis
Ciona savignyi
Danio rerio
Equus caballus
Gallus gallus
Homo sapiens
Macaca mulatta
Monodelphis domestica
Mus musculus
Ornithorhynchus anatinus
Pan troglodytes
Pongo pygmaeus
Xenopus tropicalis
Anopheles gambiae
Caenorhabditis elegans
Drosophila melanogaster
Dictyostelium discoideum
Plasmodium falciparum
Plasmodium vivax
Arabidopsis thaliana
Oryza sativa
Vitis vinifera
B_aphidicola_Tokyo_1998
B_burgdorferi_DSM_4680
B_subtilis
E_coli_K12
M_tuberculosis_H37Rv
N_meningitidis_A
P_horikoshii
S_aureus_N315
S_pneumoniae_TIGR4
S_pyogenes_SF370
W_pipientis_wMel
Aspergillus nidulans
Neurospora crassa
Saccharomyces cerevisiae
Schizosaccharomyces pombe
www.ensemblgenomes.org
Protein Families
• How: cluster proteins for every isoform in every species + UniProt proteins.
• BLASTP comparison of:
– all Ensembl ENSP…
– all metazoan (animal) proteins in UniProt
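The idea of grouping proteins into families from all-against-all BLASTP hits can be sketched with a very simple clustering rule. The slide does not name the clustering algorithm, so single-linkage clustering (union-find over hits above a score cutoff) is used here purely as a stand-in; the identifiers, scores and cutoff are hypothetical.

```python
def cluster_families(hits, min_score):
    """Group proteins linked by any chain of hits scoring >= min_score."""
    parent = {}

    def find(protein):
        # Register unseen proteins as their own cluster, then follow links.
        parent.setdefault(protein, protein)
        while parent[protein] != protein:
            parent[protein] = parent[parent[protein]]  # path halving
            protein = parent[protein]
        return protein

    for a, b, score in hits:
        root_a, root_b = find(a), find(b)
        if score >= min_score and root_a != root_b:
            parent[root_a] = root_b

    families = {}
    for protein in list(parent):
        families.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(members) for members in families.values())


# Hypothetical BLASTP hits: (query, target, score).
toy_hits = [
    ("ENSP_a", "ENSP_b", 250),
    ("ENSP_b", "UNIPROT_c", 180),
    ("ENSP_d", "UNIPROT_e", 40),
    ("ENSP_d", "ENSP_f", 300),
]

for family in cluster_families(toy_hits, min_score=100):
    print(family)
```

Single linkage tends to chain unrelated proteins together through multi-domain intermediates, which is one reason production pipelines favour more robust graph clustering.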
Overview of the Talk
• Comparing Genomes
• Homologies and Families
• Sequence Alignments
Comparative genomics
• Nucleotide sequence (Compara): genome alignments, syntenic regions
• Protein sequence (Compara & Pan-Compara): protein trees, homologues
Whole Genome Alignments
• BLASTZ-net (nucleotide level): closer species, e.g. human – mouse
• Translated BLAT (amino acid level): more distant species, e.g. human – zebrafish
• EPO/PECAN multispecies alignments
• ORTHEUS used to determine ancestral alleles
Quick exercise
Go to the 'Genomic alignments' view for MYO6. Turn on the alignment with dog.
1. What do you see? Why are there so many regions for dog listed?
2. What are the red, highlighted base pairs?
3. Is there a dog gene in this region?
4. What multiple alignments are available for human?
Which Multispecies Alignments?
Mercator-Pecan
• 19 amniota vertebrates + constrained elements
Enredo-Pecan-Ortheus (EPO)
• 3 neognath birds
• For 6 primates
• For 5 teleost fish + constrained elements
• For 12 eutherian mammals
• For 35 eutherian mammals + constrained elements
What are Constrained Elements?
GERP scoring of every nucleotide in the alignment
(Cooper GM et al., Genome Res., 2005; 15:901-913)
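The general idea behind per-nucleotide conservation scoring can be sketched as follows: each alignment column gets a score equal to the number of substitutions expected under neutrality minus the number actually observed, and runs of high-scoring columns are reported as constrained elements. This is a simplified illustration, not the published GERP procedure: the expected and observed counts, the score threshold and the minimum run length are all made-up values, and the real method estimates these quantities on a phylogenetic tree and assesses elements statistically.

```python
def rejected_substitutions(expected, observed):
    """Per-column score: expected minus observed substitutions."""
    return [e - o for e, o in zip(expected, observed)]


def constrained_runs(scores, threshold, min_length):
    """Return half-open (start, end) runs of columns scoring >= threshold."""
    runs = []
    start = None
    # A sentinel below any threshold closes a run that reaches the last column.
    for index, score in enumerate(list(scores) + [float("-inf")]):
        if score >= threshold:
            if start is None:
                start = index
        elif start is not None:
            if index - start >= min_length:
                runs.append((start, index))
            start = None
    return runs


# Hypothetical values for seven alignment columns.
toy_expected = [2.0] * 7
toy_observed = [1.9, 0.1, 0.2, 0.0, 1.8, 0.1, 2.0]

toy_scores = rejected_substitutions(toy_expected, toy_observed)
print(constrained_runs(toy_scores, threshold=1.5, min_length=2))
```

In this toy input, columns 1 to 3 form a run of strongly conserved positions, while the isolated conserved column 5 is too short to be reported.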
Quick exercise
Go to the Location tab for human MYO6 (region in detail view). Turn on the 'conservation score' and 'constrained elements' for the 35 eutherian mammals.
1. Zoom in to one or a few exons. Do the constrained elements match up to the exons?
Non-Coding Regions
• “Phylogenetic Footprinting” – conserved noncoding regions can be functional
• Regulatory regions discovered in this way for genes: Hoxb-1, Hoxb4, PAX6, SOX9
Regulatory Features of the PDX1 gene
Region in Detail shows conservation of sequence in regions involved in PDX1 transcriptional regulation (1.6-2.8 kb upstream of the gene).
Comparative genomics
• Nucleotide sequence (Compara): pairwise alignments, syntenic regions
• Protein sequence (Compara & Pan-Compara): protein trees, homologues
Syntenic regions
Blastz/Lastz
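One plausible way to get from pairwise alignment blocks to larger syntenic regions is to merge neighbouring blocks that lie on the same pair of chromosomes and are close together in both genomes. The sketch below shows that idea only; the merging rule, the `max_gap` value, the coordinates and the chromosome names are assumptions, and strand orientation is ignored for brevity.

```python
def merge_into_syntenic_regions(blocks, max_gap):
    """Merge nearby alignment blocks on the same chromosome pair.

    Each block is (chr_a, start_a, end_a, chr_b, start_b, end_b).
    """
    regions = []
    # Sort so that blocks on the same chromosome pair are adjacent and
    # ordered by their position in genome A.
    for block in sorted(blocks, key=lambda b: (b[0], b[3], b[1])):
        chr_a, start_a, end_a, chr_b, start_b, end_b = block
        if regions:
            last = regions[-1]
            same_pair = last[0] == chr_a and last[3] == chr_b
            gap_a = start_a - last[2]
            # Distance between the two intervals in genome B (negative if
            # they overlap), whichever side the new block falls on.
            gap_b = max(start_b - last[5], last[4] - end_b)
            if same_pair and gap_a <= max_gap and gap_b <= max_gap:
                regions[-1] = (
                    chr_a, last[1], max(last[2], end_a),
                    chr_b, min(last[4], start_b), max(last[5], end_b),
                )
                continue
        regions.append(block)
    return regions


# Hypothetical blocks between a chromosome "6" and chromosomes "12" and "1".
toy_blocks = [
    ("6", 100, 200, "12", 1000, 1100),
    ("6", 250, 400, "12", 1150, 1300),
    ("6", 5000, 5100, "12", 9000, 9100),
    ("6", 450, 500, "1", 10, 60),
]

for region in merge_into_syntenic_regions(toy_blocks, max_gap=100):
    print(region)
```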
Synteny
Synteny exercise
Go to the Synteny view in the location tab.
1. How many chromosomes in dog have syntenic regions to human chromosome 6?
2. Click '15 downstream genes'. Are there missing dog homologues to the human gene list?
Advanced views
Go to the Synteny view in the location tab. Explore the Alignments (image), Alignments (text) and Multi-species view in the location tab.
1. View alignments between human and dog in these three views.
Acknowledgements
• Javier Herrero
• Kathryn Beal
• Stephen Fitzgerald
• Leo Gordon
• Matthieu Muffato
• Miguel Pignatelli