Welcome to Part 3 of Bio 219
Lecturer – David Ray
Contact info:
Office hours – 1:00-2:00 pm MTW
Office location – LSB 5102
Office phone – 293-5102 ext 31454
E-mail – [email protected]
Lectures are available online at
http://www.as.wvu.edu/~dray
go to ‘Teaching’ link
How Genes and Genomes
Evolve
Variation
• There is obviously variation among and
within taxa.
• How does the variation arise in genomes?
• Are there patterns to the variation?
• How is the variation propagated?
• What questions can be addressed using
the variation?
• What patterns exist in humans with regard
to genomic variability?
Generating Genetic Variation
• Somatic vs. germ line cells
– Somatic cells – “body” cells, no long term
descendants, live only to help germ cells
perform their function.
– Germ cells – reproductive cells, give rise to
descendants in the next generation of
organisms.
Generating Genetic Variation
• Somatic vs. germ line mutations
– Somatic mutations – occur in somatic cells
and will only affect those cells and their
progeny; cannot be passed on to
subsequent generations of organisms.
– Germ mutations – can be passed on to
subsequent generations.
Generating Genetic Variation
• Five types of change contribute to
evolution.
– Mutation within a gene
– Gene duplication
– Gene deletion
– Exon shuffling
– Horizontal transfer – rare in Eukaryotes
Generating Genetic Variation
• Most changes to a genome are caused by
mistakes in the normal process of copying
and maintaining genomic DNA.
Generating Genetic Variation
• Mutations within genes
– Point mutations – errors in replication at
individual nucleotide sites occur at a rate of
about 10^-10 in the human genome.
– Most point mutations have no effect on the
function of the genome – are selectively
neutral.
Generating Genetic Variation
• DNA duplications
– Slipped strand mispairing
– Unequal crossover during recombination
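A minimal sketch of unequal crossover, assuming each chromosome can be modeled as an ordered list of gene labels (the labels and break points are hypothetical, chosen only for illustration). When two homologous chromosomes misalign before exchanging segments, one product gains an extra copy of a gene and the other loses it.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the segments to the right of each break point.

    break_a == break_b models a normal, equal crossover;
    different break points model a misaligned (unequal) crossover.
    """
    product_1 = chrom_a[:break_a] + chrom_b[break_b:]
    product_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return product_1, product_2


# Hypothetical chromosome carrying three genes in order.
chromosome = ["geneA", "globin", "geneC"]

# Misalignment: the break falls after "globin" on one copy
# but before "globin" on the other copy.
duplicated, deleted = crossover(chromosome, chromosome, 2, 1)
print(duplicated)  # ['geneA', 'globin', 'globin', 'geneC']
print(deleted)     # ['geneA', 'geneC']
```

The duplicated product is the raw material for gene duplication; the deleted product corresponds to gene deletion.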
Generating Genetic Variation
• Gene duplication allows for the acquisition
of new functional genes in the genome
Generating Genetic Variation
• Gene Duplication: the globin family
– A classic example of gene duplication and evolution
– Globin molecules are involved in carrying oxygen in
multicellular organisms
– Ancestral globin gene (present in primitive animals)
was duplicated ~500 mya.
– Mutations accumulated in both genes to differentiate
them - α and β present in all higher vertebrates
– Further gene duplications produced alternative forms
in mammals and in primates
[Figure labels: Primates, Mammals]
Generating Genetic Variation
• Gene Duplication
– Almost every gene in the vertebrate genome
exists in multiple copies
– Gene duplication allows for new functions to
arise without having to start from scratch
– Studies suggest that early in vertebrate
evolution the entire genome was duplicated at
least twice
Generating Genetic Variation
• Exon Duplication
– Duplications are not limited to entire genes
– Proteins are often collections of distinct amino
acid domains that are encoded by individual
exons in a gene
– The separation of exons by introns facilitates
the duplication of exons and individual gene
evolution
Generating Genetic Variation
• Exon Shuffling
– The exons of genes can sometimes be
thought of as individual useful units that can
be mixed and matched through exon shuffling
to generate new, useful combinations
Review from last week
• Overall theme – There are lots of ways to create genetic variation.
Genetic variation is the basis of evolutionary change but the
variation must be introduced into the germ line to contribute to
evolutionary change.
• Two cell lines in multicellular organisms
– Somatic – short term genetic repository
– Germ line – long term genetic repository
• Variations that occur in the germ line are the only ones that can
contribute to evolutionary change
• Genetic variation can be accumulated through various events
– Mutations in genes – point mutations
– DNA duplications – microsatellites (small), unequal crossover (large)
– Gene and exon duplications are the major method for generating new
gene functions
– Exon shuffling can produce new gene functions by creating new
combinations of functional exons/protein domains
Generating Genetic Variation
• Mobile elements contribute to genome
evolution in several ways
– Exon shuffling
– Insertion mutagenesis
– Homologous and non-homologous
recombination
Generating Genetic Variation
• What are mobile elements and how do
they work?
– Fragments of DNA that can copy themselves and
insert those copies back into the genome
– Found in most eukaryotic genomes
– Humans – Alu (SINE); Ta, PreTa (LINEs);
SVA; plus several families that are no longer
active
Generating Genetic Variation:
Normal SINE mobilization
Reverse transcription
and insertion
Pol III transcription
1. Usually a single ‘master’ copy
2. Pol III transcription to an RNA intermediate
3. Target primed reverse transcription (TPRT) – enzymatic machinery
provided by LINEs
Generating Genetic Variation
• Mobile elements contribute to genome
evolution in several ways
– Exon shuffling
Generating Genetic Variation:
Exon shuffling via
SINE mobilization
exon 1
SINE
exon 2
intron
DNA copy of transcript
SINE
exon 2
SINE transcription can extend past the normal stop signal
Reverse transcription creates DNA copies of both the SINE and exon 2
Reinsertion occurs elsewhere in the genome
Generating Genetic Variation
• Mobile elements contribute to genome
evolution in several ways
– Exon shuffling
– Insertion mutagenesis
• The insertion of mobile elements can disrupt gene
structure and function
Generating Genetic Variation
ALU INSERTIONS AND DISEASE
LOCUS | DISTRIBUTION | SUBFAMILY | DISEASE | REFERENCE
BRCA2 | de novo | Y | Breast cancer | Miki et al, 1996
Mlvi-2 | de novo (somatic?) | Ya5 | Associated with leukemia | Economou-Pachnis and Tsichlis, 1985
NF1 | de novo | Ya5 | Neurofibromatosis | Wallace et al, 1991
APC | Familial | Yb8 | Hereditary desmoid disease | Halling et al, 1997
PROGINS | about 50% | Ya5 | Linked with ovarian carcinoma | Rowe et al, 1995
Btk | Familial | Y | X-linked agammaglobulinaemia | Lester et al, 1997
IL2RG | Familial | Ya5 | XSCID | Lester et al, 1997
Cholinesterase | one Japanese family | Yb8 | Cholinesterase deficiency | Muratani et al, 1991
CaR | familial | Ya4 | Hypocalciuric hypercalcemia and neonatal severe hyperparathyroidism | Janicic et al, 1995
C1 inhibitor | de novo | Y | Complement deficiency | Stoppa Lyonnet et al, 1990
ACE | about 50% | Ya5 | Linked with protection from heart disease | Cambien et al, 1992
Factor IX | a grandparent | Ya5 | Hemophilia | Vidaud et al, 1993
2 x FGFR2 | De novo | Ya5 | Apert’s Syndrome | Oldridge et al, 1997
GK | ? | Sx | Glycerol kinase deficiency | McCabe et al, (personal comm.)
Generating Genetic Variation
• Gene expression alteration via P-element mobilization in Drosophila
Generating Genetic Variation
• Mobile elements contribute to genome
evolution in several ways
– Exon shuffling
– Insertion mutagenesis
• The insertion of mobile elements can disrupt gene
structure and function
– Homologous and non homologous
recombination
• 10,000 – 1,000,000 + nearly identical DNA
fragments scattered throughout the genome
Generating Genetic Variation
Unequal crossover due to
non-homologous recombination
ALU/ALU RECOMBINATION AND GERM-LINE DISEASE
LOCUS | DISTRIBUTION | DISEASE | REFERENCE
8 x LDLR | Kindreds | Hypercholesterolemia | Lehrman et al, 1985, 1987; Yamakawa et al, 1989; Rudiger et al, 1991; Chae et al, 1997
5 x α-globin | Kindreds | α-thalassaemia | Nicholls et al, 1987; Flint et al, 1996; Harteveld et al, 1997; Ko et al, 1997
5 x C1 inhibitor | Kindreds | Angioneurotic edema | Stoppa-Lyonnet et al, 1990; Ariga et al, 1990
C3 | Kindred | C3 deficiency | Botto et al, 1992
HPRT | Individual | Lesch-Nyhan syndrome | Marcus et al, 1993
DMD | Kindred | Duchenne’s muscular dystrophy | Hu et al, 1991
ADA | Individual | ADA deficiency-SCID | Markert et al, 1988
Ins. Rec. | Individual | Insulin-independent diabetes | Shimada et al, 1990
Antithrombin | Individual | Thrombophilia | Olds et al, 1993
XY | Individual | XX male | Rouyer et al, 1987
Lysyl hydroxylase | Kindreds | Ehlers-Danlos syndrome | Pousi et al, 1994
ALU/ALU RECOMBINATION AND CANCER
LOCUS | DISTRIBUTION | MECHANISM | DISEASE | REFERENCE
10 x ALL-1 | Somatic | Alu-Alu recomb. (dup. intron 1-6) | Acute myelogenous leukemia | Strout et al, 1998; So et al, 1997; Schichman et al, 1994
7 x BCR/Abl | Somatic | X-Alu recomb. | CML | Jeffs et al, 1998; Chen et al, 1989; de Klein et al, 1986
All-1/AF9 | Somatic | Alu-Alu translocation | Acute myelogenous leukemia | Super et al, 1997
2 x BRCA1 | Somatic & a kindred | Alu-Alu recomb. (del. exon 17; del. promoter) | Breast cancer | Puget et al, 1997; Swensen et al, 1997
2 x MLH1 | 2 kindreds | Alu-Alu recomb. (del. exon 16) (exons 13-16) | HNPCC | Nystrom-Lahti et al, 1995; Mauillon et al, 1996
TRE | Somatic | Interchromosomal Alu-Alu recomb. | Ewing's sarcoma | Onno et al, 1992
RB | Common | Alu-Alu recomb. (799 bp del.) | Association with glioma | Rothberg et al, 1997
EWS | Subset of Africans | Alu-Alu recomb. (del. 2 kb) | Protective against Ewing sarcoma? | Zucman-Rossi et al, 1997
Generating Genetic Variation
• Gene transfer can move genes
between entire genomes
– Horizontal gene transfer
– Main problem with the development of
drug resistant strains of bacteria
Generating Genetic Variation
• Bacterial conjugation
Reconstructing Life’s Tree
• Evolutionary theory predicts that organisms that
are derived from a common ancestor will share
genetic signatures
• Organisms that shared an ancestor more
recently will be more similar than those that
shared a more distant common ancestor
• Similarity can include sequence composition,
genome organization, presence/absence of
mobile elements, presence/absence of gene
families, etc.
[Figures: phylogenetic trees; ancestral gene; genetic information; human and chimp chromosome 1]
Review from last time
• Overall themes: Genetic variation can be introduced due to the
activities and presence of mobile elements (MEs); Genetic
information can be introduced into organisms through horizontal
transfer.
• MEs are fragments of DNA that can make copies of themselves and
insert those copies back into the genome
– MEs can lead to variation through exon shuffling, insertion mutagenesis,
and recombination
– Many human diseases are the result of MEs
• Horizontal transfer can introduce genetic variation into bacteria via
the process of conjugation
• Introduction of concepts for discussion of “Reconstructing life’s tree”
– All sorts of variation provide information on the relationships among
organisms
– Homology – derived from the same ancestral source
– Phylogeny – a reconstruction of relationships based on observations
Reconstructing Life’s Tree
• Basic terms
– Homologous – derived from a common
ancestral source
– Phylogeny – a reconstruction of relationships
based on observed patterns
Reconstructing Life’s Tree
• Homologous genes can be recognized
over large amounts of evolutionary time
• Why?
– Selectively advantageous genes and
sequences tend to be conserved (preserved)
– Selectively disadvantageous genes and
sequences tend not to be passed on to
offspring
Reconstructing Life’s Tree
• Most DNA of most genomes is non-coding
– Changes to much of this DNA are selectively neutral –
cause no harm or good to the genome
– Different portions of the genome will therefore diverge
at different rates depending on their function
• The neutral regions tend to change in a clock-like
fashion
– We can estimate divergence times for certain groups
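The clock-like behavior of neutral regions can be turned into a rough divergence-time estimate. The sketch below assumes a strict molecular clock, in which both lineages accumulate neutral changes at the same constant rate; the function name and the input numbers are hypothetical and serve only to show the arithmetic.

```python
def divergence_time(differences_per_site, rate_per_site_per_year):
    """Years since two lineages split, under a strict molecular clock.

    Both lineages accumulate changes independently after the split,
    hence the factor of 2 in the denominator.
    """
    return differences_per_site / (2 * rate_per_site_per_year)


# Hypothetical inputs, not measurements.
d = 0.02      # fraction of neutral sites that differ between two species
rate = 1e-9   # neutral substitutions per site per year

print(f"{divergence_time(d, rate):,.0f} years")
```

In practice the rate itself has to be calibrated, for example against a split dated from the fossil record, so the estimate is only as good as that calibration.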
[Figure: human and mouse sequence comparison]
Reconstructing Life’s Tree
• The accumulation of changes can be
quantified by several logical methods
– Parsimony – the best hypothesis is the one
requiring the fewest steps (i.e. Occam’s razor)
– Distance – count the number of differences
between things, the ones with the fewest
numbers of differences are most closely
related
– Sequence based models – take into account
what we know about the ways sequences
change over time
Reconstructing Life’s Tree:
An example using distance
• These slides and the sequence files used
to produce them are available as a
supplement on the class website:
• DNA sequence from six taxa
Sumatran orang
Bornean orang
gorilla
bonobo chimp
common chimp
human
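A minimal sketch of the distance approach: count the differences between every pair of aligned sequences and treat the pair with the fewest differences as the most closely related. The sequences and taxon names below are hypothetical placeholders, not the class data set.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Number of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical aligned sequences for illustration only.
sequences = {
    "taxon_1": "ATGGCTAAGT",
    "taxon_2": "ATGGCTAAGA",
    "taxon_3": "ATGACTCAGA",
    "taxon_4": "CTGACTCAGG",
}

# Pairwise distance table keyed by (name, name).
distances = {
    pair: count_differences(sequences[pair[0]], sequences[pair[1]])
    for pair in combinations(sequences, 2)
}

for (a, b), d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{a} vs {b}: {d} differences")

closest = min(distances, key=distances.get)
print("Most similar pair:", closest)  # ('taxon_1', 'taxon_2')
```

A full distance analysis would go on to build a tree from this table (for example by clustering); the sketch stops at the table itself.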
Reconstructing Life’s Tree:
An example using parsimony
ATGGCT
CAGGCT
AAGACG
CAGGCT
AAGACT
A-C
T-G
G-A
G-A
A-C
T-A
6 steps
Reconstructing Life’s Tree:
An example using parsimony
ATGGCT
CAGGCT
AAGACG
AAGACT
CAGGCT
T-G
G-A
G-A
T-A
A-C
5 steps
Reconstructing Life’s Tree:
An example using parsimony
ATGGCT
AAGACT
AAGACG
CAGGCT
CAGGCT
T-G
G-A
T-A
A-C
4 steps
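The step counts on the three parsimony slides can be reproduced with a small-parsimony count (Fitch's method). This is a sketch under assumptions: the tree shapes on the slides do not survive in this transcript, so the three topologies below are hypothetical arrangements of the slide sequences, chosen so that they yield 6, 5, and 4 steps.

```python
def fitch_site(tree, site):
    """Return (possible ancestral states, minimum changes) for one alignment column.

    A tree is either a sequence string (a tip) or a pair (left, right).
    """
    if isinstance(tree, str):
        return {tree[site]}, 0
    left_states, left_cost = fitch_site(tree[0], site)
    right_states, right_cost = fitch_site(tree[1], site)
    shared = left_states & right_states
    if shared:
        return shared, left_cost + right_cost
    # No state in common: one change is needed on this branch pair.
    return left_states | right_states, left_cost + right_cost + 1


def first_tip(tree):
    """Follow left branches down to a tip sequence."""
    while not isinstance(tree, str):
        tree = tree[0]
    return tree


def parsimony_steps(tree):
    """Total minimum number of changes over all sites."""
    return sum(fitch_site(tree, i)[1] for i in range(len(first_tip(tree))))


# Hypothetical topologies over the five slide sequences.
tree_1 = (("ATGGCT", "CAGGCT"), ("AAGACG", ("CAGGCT", "AAGACT")))
tree_2 = (("ATGGCT", "AAGACG"), ("AAGACT", ("CAGGCT", "CAGGCT")))
tree_3 = ("ATGGCT", (("AAGACT", "AAGACG"), ("CAGGCT", "CAGGCT")))

for name, tree in [("tree 1", tree_1), ("tree 2", tree_2), ("tree 3", tree_3)]:
    print(name, parsimony_steps(tree), "steps")
# tree 1 6 steps
# tree 2 5 steps
# tree 3 4 steps
```

Under parsimony, the 4-step tree is the preferred hypothesis because it explains the same sequences with the fewest changes.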
Reconstructing Life’s Tree
• The accumulation of changes can be
quantified by several logical methods
• The accumulation of mobile elements
provides a nearly perfect record of
evolutionary relationships
Phylogenetic Inference Using SINEs
Phylogenetic Inference Using SINEs
Species A
Species B
Species C
Species D
Resolution of the Human:Chimp:Gorilla
Trichotomy
(H,C)G
(H,G)C
(C,G)H
(H,C,G)
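One way to picture how SINE insertions discriminate among these four hypotheses: an insertion shared by exactly two of the three species supports grouping those two together. The sketch below tallies such support; the presence/absence calls are hypothetical and are not the published data.

```python
from collections import Counter

# Hypothetical calls: each set lists the species in which an Alu
# insertion was detected at one locus.
loci = [
    {"human", "chimp"},
    {"human", "chimp"},
    {"human", "chimp", "gorilla"},  # predates all three splits: uninformative
    {"human"},                      # lineage-specific: uninformative
    {"chimp", "gorilla"},
]

# Which hypothesis each informative sharing pattern supports.
GROUPINGS = {
    frozenset({"human", "chimp"}): "(H,C)G",
    frozenset({"human", "gorilla"}): "(H,G)C",
    frozenset({"chimp", "gorilla"}): "(C,G)H",
}

support = Counter(
    GROUPINGS[frozenset(locus)]
    for locus in loci
    if frozenset(locus) in GROUPINGS
)

for hypothesis, count in support.most_common():
    print(hypothesis, count)
```

If no grouping collected clearly more support than the others, the unresolved trichotomy (H,C,G) would remain the best description.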
Phylogenetic Analysis
PCR of 133 Alu loci
– 117 Ye5
– 13 Yc1
– 1 Yi6
– 1 Yd3
– 1 undefined subfamily
PNAS (2003) 22:12787-91
Alu Elements and Hominid Phylogeny
PNAS (2003) 22:12787-91
Review from last time
• The variation that is present in genomes allows us to
make determinations about the relationships among
living things
• Different parts of the genome accumulate variation at
different rates depending on their function (or lack
thereof)
• The presence of different rates allows for different
questions to be addressed depending on the level of
divergence
• Several methods are available to analyze variation for
phylogenetic signal
– Parsimony, distance, sequence based models
• Patterns of mobile element insertion can be used to infer
relationships among taxa
Reconstructing Life’s Tree
• Much of the “junk” DNA is dispensable
– The Fugu (Takifugu rubripes) genome is
almost completely devoid of unnecessary
sequences
– Exon number and organization is similar to
mammals
– Compared to other vertebrates
• Intron size (not number) is reduced
• Intergenic regions are reduced in size
• No mobile elements
[Figure: Fugu introns]
Reconstructing Life’s Tree
• Using all of the available information, we
can reconstruct relationships between
organisms back to the earliest forms of life
Our Own Genome
• The human genome is large and complex
– 23 pairs of chromosomes
– ~3.2 x 10^9 (3.2 billion) nucleotide pairs
– Human genome composition
[Figures: noncoding DNA in the human genome; chromosome 22]
Our Own Genome
• Nuclear genome
–3300 Mb
–23 (XX) or 24 (XY) linear chromosomes
–30-35,000 genes
–1 gene/40kb
–Introns
–3% coding
–Repetitive DNA sequences (45%)
Our Own Genome
• The human genome is large and complex
– 23 pairs of chromosomes
– ~3.2 x 10^9 (3.2 billion) nucleotide pairs
– Human genome composition
– The human genome project was one of the
largest undertakings in human history
Our Own Genome
• Progress in human genome sequencing
– Hierarchical vs. whole genome shotgun
(WGS) sequencing
– Repetitive DNA represents a significant
problem for WGS sequencing in particular
[Figure: shotgun sequencing]
Our Own Genome
• Progress in human genome sequencing
– Hierarchical vs. whole genome shotgun
(WGS) sequencing
– Repetitive DNA represents a significant
problem for WGS sequencing in particular
[Figure: repetitive sequences and shotgun assembly]
Our Own Genome
• Progress in human genome sequencing
– Hierarchical vs whole genome shotgun
sequencing
– Repetitive DNA represents a significant
problem for WGS sequencing in particular
–Landmark papers in Nature and Science
(2001)
• Venter et al Science 16 February 2001; 291: 1304-1351
• Lander et al Nature 409 (6822): 860-921
Our Own Genome
• A typical high-throughput genomics
facility
Our Own Genome
• Exploring and exploiting the genome
sequences
• BLAST/BLAT and other tools
– BLAST - Basic local alignment search tool
• Input a sequence and find matches to human or
other organisms
– publication information
– DNA and protein sequence (if applicable)
Our Own Genome
• Exploring and exploiting the genome
sequences
• BLAST/BLAT and other tools
– BLAT – BLAST-like alignment tool
• A “genome browser”
• Genomes available :
– human, chimp, rhesus monkey, dog, cow, mouse,
opossum, rat, chicken, Xenopus, Zebrafish, Tetraodon,
Fugu, nematode (x3), Drosophila (x10), Apis (x3),
Saccharomyces (yeast), SARS
• example: chr6:121,387,504-121,720,836
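A position string like the example above can be split into a chromosome name and two coordinates. The sketch below is a hypothetical helper, not part of BLAT or any browser API; it assumes the string follows the `chrom:start-end` pattern with optional thousands separators, and that both endpoints are included in the region.

```python
import re


def parse_position(position):
    """Split a 'chrom:start-end' string into (chrom, start, end) with integer coordinates."""
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", position)
    if match is None:
        raise ValueError(f"unrecognized position string: {position!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


chrom, start, end = parse_position("chr6:121,387,504-121,720,836")

# Assumes 1-based coordinates with both endpoints included.
length = end - start + 1
print(chrom, start, end, length)  # chr6 121387504 121720836 333333
```

Under that assumption the example region spans 333,333 base pairs.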
Our Own Genome
• BLAT can be used to make direct comparisons between
our genome and others.
Query sequence - Callithrix
Human ortholog
Our Own Genome
• Comparisons with other genomes inform
us about our own
–Important genes and regulatory sequences
can easily be identified if they are conserved
between genomes
Our Own Genome
• Human variation
–~0.1% difference in nucleotide sequence
between any two individual humans
–Translates to about 3 million differences in the
genome
–Most of these differences are Single
Nucleotide Polymorphisms (SNPs)
–We can use these differences to investigate
human variation, population structure and
evolution
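The "about 3 million differences" figure follows directly from the two numbers on the slides. A quick check, using the genome size given earlier in this section (~3.2 x 10^9 nucleotide pairs); the variable names are illustrative.

```python
GENOME_SIZE = 3.2e9          # nucleotide pairs in the human genome
FRACTION_DIFFERENT = 0.001   # ~0.1% difference between two individuals

expected_differences = GENOME_SIZE * FRACTION_DIFFERENT
print(f"{expected_differences:,.0f} differing sites")  # 3,200,000 differing sites
```

The result, roughly 3.2 million, rounds to the 3 million quoted above.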
Our Own Genome
• Human evolution
–Coalescence analyses (mtDNA and Y
chromosome)
–Multiregional vs. Out of Africa
• Predictions of the Multiregional Hypothesis
– Equal diversity in human subpopulations
– No obvious root to the human tree
• Predictions of the Out of Africa Hypothesis
– Higher diversity in African subpopulations
– Root of the human tree in Africa
Population Relationships Based on 100 Autosomal
Alu Elements
Africa
Asia
Europe
S. India
Our Own Genome
• Human evolution
–Higher diversity in African subpopulations
• Insulin minisatellite Table 12.6 in text
• 22 divergent lineages exist in the human
population
• All are found in Africa. Only 3 are found outside of
Africa.
Our Own Genome
• Interpreting the information generated by
the human genome project
–The complexity of genome function makes
interpretation difficult
–Ex. What are the regulatory sequences?
–Ex. Exons can be spliced together in different
ways in different tissues
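To see why alternative splicing complicates interpretation, it helps to count how many transcripts one gene could yield. The sketch below uses a deliberately simplified, hypothetical model: the first and last exons are always kept, and each internal exon is either included or skipped, with exon order preserved. Real splicing is more constrained and more varied than this.

```python
from itertools import combinations


def splice_isoforms(exons):
    """All isoforms keeping the first and last exon plus any subset of internal exons, in order."""
    first, *internal, last = exons
    isoforms = []
    for k in range(len(internal) + 1):
        for kept in combinations(internal, k):
            isoforms.append([first, *kept, last])
    return isoforms


# Hypothetical four-exon gene.
isoforms = splice_isoforms(["exon1", "exon2", "exon3", "exon4"])
for isoform in isoforms:
    print("-".join(isoform))
print(len(isoforms), "possible isoforms")
```

With n internal exons this model allows 2^n isoforms, so even a modest gene can encode many distinct products.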
[Figure: alternative RNA splicing]