Summer Bioinformatics Workshop 2008
Comparative Genomics and
Phylogenetics
Chi-Cheng Lin, Ph.D., Professor
Department of Computer Science
Winona State University – Rochester Center
Outline
• Comparative Genomics
• Phylogenetics
• Phylogenetic Tree
• Phylogenetics Applications
• Gene Tree vs. Species Tree
Comparative Genomics
• Analysis and comparison of genomes from
different species
• Purposes
– to gain a better understanding of how species have
evolved
– to determine the function of genes and non-coding
regions of the genome
• The functions of human genes and other DNA
regions often are revealed by studying their
parallels in nonhumans.
– Researchers have learned a great deal about the
function of human genes by examining their
counterparts in simpler model organisms such as the
mouse.
Comparative Genomics
• Features looked at when comparing genomes:
– sequence similarity
– gene location
– length and number of coding regions within genes
– amount of non-coding DNA in each genome
– highly conserved regions maintained in organisms
• Computer programs that can line up multiple
genomes and look for regions of similarity
among them are used.
• Many of these sequence-similarity tools, such as
BLAST, are accessible to the public over the
Internet.
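
As a rough illustration of what "looking for regions of similarity" can mean, the sketch below lists every short word (k-mer) shared by two sequences. This is only the word-matching idea that seed-based tools start from, not BLAST itself; the function name `shared_kmers`, the word size, and the two toy sequences are assumptions made up for this example.

```python
from collections import defaultdict


def shared_kmers(seq_a, seq_b, k=4):
    """Return sorted (pos_a, pos_b, kmer) for every k-mer present in both sequences."""
    # Index every k-mer of the first sequence by its start position.
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)

    # Scan the second sequence and record each position where a k-mer matches.
    hits = []
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        for i in index.get(kmer, []):
            hits.append((i, j, kmer))
    return sorted(hits)


# Toy sequences (hypothetical, not real genomic data).
seq_1 = "ACGTTGCAAT"
seq_2 = "TTACGTTGGA"
hits = shared_kmers(seq_1, seq_2, k=4)
for pos_1, pos_2, word in hits:
    print(pos_1, pos_2, word)
```

Consecutive hits that lie on the same diagonal (here positions 0/2, 1/3, 2/4) point to one longer shared region, which a real tool would then extend and score.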
Of Mice and Men
• The full complement of human
chromosomes can be cut into about
150 pieces, then reassembled into a
reasonable approximation of the
mouse genome.
• The colors of the mouse
chromosomes and the numbers
alongside indicate the human
chromosomes containing
homologous segments.
• This piecewise similarity between the
mouse and human genomes means
that insights into mouse genetics are
likely to illuminate human genetics
as well.
Source: http://www.ornl.gov/sci/techresources/Human_Genome/publicat/tko/06_img.html
Phylogenetics
• Phylogenetics
– Study of evolutionary relationships
(sequences / species)
– Infer evolutionary relationship from
shared features
• Phylogeny
– Relationship between organisms
with common ancestor
• Phylogenetic tree
– Graph representing evolutionary
history of sequences / species
Source of image: http://superfrenchie.com/Pics/Blog/culture/evolution.jpg
Phylogenetics
• Premise
– Members sharing common evolutionary history
(i.e., common ancestor) are more related to each
other
– Can infer evolutionary relationship from shared
features
• Long history of phylogenetics
– Historically - based on analysis of observable features
(e.g., morphology, behavior, geographical distribution)
– Now - mostly analysis of DNA / RNA / amino acid
sequences
Phylogenetics
• Goals
– Understand relationship of sequence to similar sequences
– Construct phylogenetic tree representing evolutionary history
• Motivation / application
– Identify closely related families
• Use phylogenetic relationships to predict gene function
– Follow changes in rapidly evolving species (e.g., viruses)
• Analysis can reveal which genes are under selection
• Provide epidemiology for tracking infections & vectors
• Relationship to multiple sequence alignment (MSA)
– Alignment of sequences should take evolution into account
– More precise phylogenetic relationships → improved MSA
– ClustalW (http://www.ebi.ac.uk/clustalw/), a popular MSA
program, can produce an alignment that is then used to build
a phylogenetic tree.
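
One common way an alignment feeds into tree building is through pairwise distances. The sketch below computes a simple p-distance (the fraction of differing sites) for every pair of aligned sequences. It is a minimal illustration under stated assumptions, not what ClustalW does internally: the function names, the choice to skip gapped columns, and the toy alignment are all made up for this example.

```python
def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences, skipping gapped columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # assumption: columns with a gap in either sequence are ignored
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def distance_matrix(alignment):
    """Return {(name_1, name_2): p-distance} for every ordered pair of sequences."""
    names = list(alignment)
    return {(p, q): p_distance(alignment[p], alignment[q]) for p in names for q in names}


# Toy alignment (hypothetical sequences, already aligned to equal length).
toy_alignment = {
    "seqA": "ACGTACGT",
    "seqB": "ACGTACGA",
    "seqC": "ACG-TCGA",
}
matrix = distance_matrix(toy_alignment)
print(matrix[("seqA", "seqB")])
```

A distance-based tree method would take a matrix like this as its input; more realistic distances also correct for multiple substitutions at the same site.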
Phylogenetic Tree Terminology
• Leaf / terminal node / taxon
– Node with no children
– Original sequence
• Join / internal node
– Point of joining two leaves / clusters
– Inferred common ancestor
• Branches
– Represent change
– Length represents evolutionary distance
• Cluster / clade
– All sequences in subtree with common
ancestor (treated as single node)
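
The terms above map naturally onto a small tree data structure. The sketch below is one possible representation, with a hypothetical `Node` class and made-up branch lengths: leaves carry a taxon name, internal nodes stand for inferred ancestors, and `clade()` collects all leaves under a node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: Optional[str] = None      # set for leaves (taxa); None for inferred ancestors
    branch_length: float = 0.0      # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self):
        """A leaf / terminal node has no children."""
        return not self.children

    def clade(self):
        """Names of all leaves in the subtree rooted at this node."""
        if self.is_leaf():
            return [self.name]
        names = []
        for child in self.children:
            names.extend(child.clade())
        return names


# Branch lengths are illustrative values only.
human = Node("Human", 0.1)
chimp = Node("Chimpanzee", 0.1)
gorilla = Node("Gorilla", 0.3)
human_chimp = Node(branch_length=0.2, children=[human, chimp])  # internal node
root = Node(children=[human_chimp, gorilla])

print(human_chimp.clade())
print(root.clade())
```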
Phylogenetic Tree Terminology
• Binary tree
– Each internal node (split) has exactly two children
• Rooted tree
– Contains a single ancestor of all nodes
– Evolution proceeds from root to leaves of tree
• Unrooted tree
– No single ancestor node
– No direction of evolution
• Molecular clock assumption (rooted tree)
– Mutations occur at constant rate
– Distance from root to leaves same for each leaf
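
The molecular clock condition can be checked directly on a rooted tree with branch lengths: sum the branch lengths from the root to each leaf and compare. The sketch below does this for two toy trees. The tuple layout `(name, branch_length, children)`, the function names, and the branch lengths are assumptions for illustration.

```python
import math


def root_to_leaf_distances(node, so_far=0.0):
    """node is (name, branch_length, children); returns {leaf_name: distance from root}."""
    name, length, children = node
    total = so_far + length
    if not children:
        return {name: total}
    distances = {}
    for child in children:
        distances.update(root_to_leaf_distances(child, total))
    return distances


def is_clocklike(root, tol=1e-9):
    """True if every leaf is (within tol) equally far from the root."""
    depths = list(root_to_leaf_distances(root).values())
    return all(math.isclose(d, depths[0], abs_tol=tol) for d in depths)


# Illustrative branch lengths only.
clock_tree = ("root", 0.0, [
    ("anc", 0.2, [("Human", 0.1, []), ("Chimpanzee", 0.1, [])]),
    ("Gorilla", 0.3, []),
])
non_clock_tree = ("root", 0.0, [
    ("anc", 0.2, [("Human", 0.1, []), ("Chimpanzee", 0.4, [])]),
    ("Gorilla", 0.3, []),
])

print(is_clocklike(clock_tree))
print(is_clocklike(non_clock_tree))
```

In the second tree the Chimpanzee branch is longer than its sister branch, so the leaves are no longer equidistant from the root and the clock assumption does not hold.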
Rooted and Unrooted Trees
[Figure: a rooted tree and an unrooted tree of Orangutan, Gorilla, Chimpanzee, and Human. The rooted tree is labeled with its root and the direction of evolution.]
Possible Ways of Drawing Tree
Applications – Building Tree of Life
Source: http://gi.cebitec.uni-bielefeld.de/people/boecker/bilder/tree_of_life_new.gif
Applications – Mammal Systematics
Source: http://www.isem.univ-montp2.fr/PPP/PM/RES/Phylo/Mamm/PHYLMOL-Placentalia%7EEnglish.jpg
Application – Epidemiology (CSI!)
• Which patients are more likely to have been infected by the dentist?
Source: http://trc.ucdavis.edu/djbegun/Lect_12.1.html
Application – Modern Human Evolution
• Based on mtDNA genome
• Example
– Global mtDNA diversity analysis (Ingman et al., 2000, Nature, Volume 408: 708-713)
– Africans have twice as much diversity among them as do non-Africans
→ Africans have a longer genetic history
– More recent population expansion for non-Africans
– Africans and non-Africans diverged recently
→ Out of Africa
Source of image: Ingman et al., 2000, Nature. Volume 408: 708-713
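
To make "diversity among them" concrete, one simple measure is the average number of differing sites over all pairs of sequences in a group. The sketch below computes it for two toy groups. The sequences are invented for illustration and are not data from Ingman et al.; the function name and group labels are likewise hypothetical.

```python
from itertools import combinations


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of equal-length sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


# Invented toy sequences: group_1 is more variable than group_2.
group_1 = ["ACGTAC", "ACGTTC", "TCGTAA"]
group_2 = ["ACGTAC", "ACGTAC", "ACGTAA"]

diversity_1 = mean_pairwise_differences(group_1)
diversity_2 = mean_pairwise_differences(group_2)
print(diversity_1, diversity_2)
```

A group whose sequences have had longer to accumulate mutations will tend to show a larger value, which is the reasoning behind reading higher diversity as a longer genetic history.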
Gene Tree vs. Species Tree
• Gene typically diverges before speciation
• Phylogenetic tree based on divergence of one single homologous gene
– Evolutionary history of gene
– Gene tree rather than species tree
• More genes are needed to build species trees
Source of image: http://www.bioinf2.leeds.ac.uk/b/genomics.html