Evolution of gene function
1. Divergent evolution: the importance of gene duplication
   a. Ohno’s model
   b. Subfunctionalization
   c. Neofunctionalization
2. Introducing novelty: generation of entirely new proteins/functions
   a. Lateral gene transfer
   b. Domain fusion
   c. Intron junction evolution?
   d. New genes through TEs?
First, some key basic concepts:
Selection acts on phenotypes, based on their fitness cost/advantage, to affect
the population frequencies of the underlying genotypes.
In the case of DNA sequence:
• Neutral substitutions = no effect on fitness, so selection does not act on them
• Deleterious substitutions = fitness cost
* These are removed by purifying (negative) selection
• Advantageous substitutions = fitness advantage
* These alleles are enriched for through adaptive (positive) selection
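As a rough illustration of how selection shifts genotype frequencies, the sketch below iterates a simple deterministic haploid model. The model, the function names and the selection coefficients are assumptions made for illustration only; genetic drift is ignored.

```python
def next_frequency(p, s):
    """One generation of selection in a haploid population (no drift).

    p: current frequency of the variant allele (between 0 and 1)
    s: selection coefficient of the variant relative to the ancestral allele
       (s < 0 deleterious, s = 0 neutral, s > 0 advantageous)
    """
    mean_fitness = p * (1 + s) + (1 - p) * 1.0
    return p * (1 + s) / mean_fitness


def trajectory(p0, s, generations):
    """Allele frequency at each generation, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


if __name__ == "__main__":
    # Hypothetical selection coefficients chosen only to show the three cases.
    cases = [("deleterious", -0.05), ("neutral", 0.0), ("advantageous", 0.05)]
    for label, s in cases:
        final = trajectory(0.10, s, 100)[-1]
        print(f"{label:>12} (s={s:+.2f}): 0.10 -> {final:.3f} after 100 generations")
```

In this toy model the deleterious allele declines (purifying selection), the advantageous allele rises (positive selection), and the neutral allele stays at its starting frequency because drift is not modeled.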
The Neutral Theory
M. Kimura, 1968
Most DNA substitutions are likely to be neutral = no effect on fitness.
They arise through new mutations.
Given a ~constant mutation rate, the # of substitutions can be converted into
time of divergence since speciation = molecular clock theory.
Neutral changes evolve by genetic drift, not natural selection.
* Most are probably lost, some can become fixed in the population
Purifying selection to remove deleterious changes is pervasive,
while positive selection may be relatively rare.
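The molecular clock conversion can be sketched as a one-line calculation. This assumes a constant substitution rate and that both lineages accumulate changes independently since the split (hence the factor of 2). The function name and the example rate are hypothetical, not values from the lecture.

```python
def divergence_time(subs_per_site, rate_per_site_per_year):
    """Estimate time since two sequences diverged under a strict molecular clock.

    subs_per_site: substitutions per site separating the two sequences
    rate_per_site_per_year: substitution rate along ONE lineage (assumed constant)
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    # Both lineages accumulate substitutions, so divide by twice the rate.
    return subs_per_site / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Hypothetical numbers: 0.1 substitutions/site, 1e-9 substitutions/site/year.
    years = divergence_time(0.1, 1e-9)
    print(f"Estimated divergence: {years / 1e6:.1f} million years ago")
```

With these illustrative inputs the estimate is 50 million years. The result is only as good as the assumed rate and the assumption that the counted substitutions are neutral.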
The Nearly-Neutral Theory
T. Ohta, 1973
Many slightly deleterious (or slightly advantageous) substitutions are not
selected against (or for) efficiently when the effective population size is small.
* Small populations are more subject to drift (e.g. random events).
* Even in large populations selection takes time to act … therefore many weakly
deleterious substitutions have yet to be removed by selection.
Therefore, many substitutions that are nearly neutral can evolve mostly by drift.
** Practically what this means is that SOME substitutions found
in extant sequences can be slightly deleterious & have yet to be removed
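To make the interplay between drift and selection concrete, the sketch below uses Kimura's diffusion approximation for the fixation probability of a new mutation in a diploid population. It assumes the effective population size equals the census size N and that the mutation starts at frequency 1/(2N). The selection coefficient and population sizes are hypothetical.

```python
import math


def fixation_probability(s, n):
    """Probability that a new mutation eventually fixes (Kimura's approximation).

    s: selection coefficient (negative = deleterious, 0 = neutral)
    n: diploid population size (assumed equal to the effective size)
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: fixation by drift alone
    exponent = -4.0 * n * s
    if exponent > 700:
        # exp() would overflow; the probability is effectively zero here.
        return 0.0
    # (1 - e^(-2s)) / (1 - e^(-4Ns)), written with expm1 for numerical accuracy
    return math.expm1(-2.0 * s) / math.expm1(exponent)


if __name__ == "__main__":
    s = -1e-4  # hypothetical slightly deleterious mutation
    for n in (100, 10_000, 1_000_000):
        relative = fixation_probability(s, n) * 2 * n
        print(f"N={n:>9,}: fixation probability relative to neutral = {relative:.3g}")
```

Under these assumptions the slightly deleterious mutation fixes almost as often as a neutral one when N = 100, but essentially never when N = 1,000,000.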
Most genes are under constraint = many substitutions are deleterious and
therefore removed through purifying selection
Constraint can be due to maintaining:
* Protein function (e.g. catalytic site)
* Protein folding & stability
* Interactions with other proteins, molecules
* Other features like translation efficiency, RNA folding, etc.
Then how do new functions emerge? How can proteins evolve?
Most functions evolve through divergent evolution due to relaxed constraint
Susumu Ohno (1970): gene duplication is the main route to neofunctionalization,
where one copy is allowed to evolve an entirely new function.
1. Gene duplication
2. Brief period of complete redundancy
& relaxed constraint for both genes
3. Often one copy is lost as a pseudogene,
or one copy can evolve a new function
Force & Lynch (1999) formalized the concept of subfunctionalization,
where both copies evolve and the ancestral function becomes
split between the paralogs
Mechanisms of gene duplication:
1. Segmental (dispersed) duplication through recombination (homologous or illegitimate)
* Segments often flanked by repetitive sequence
2. Tandem duplication through replication slippage
3. Duplication through retrotransposition (= loss of introns & flanked by repeats)
4. Whole-genome duplication (WGD, covered in Lecture 5)
Once a gene has been duplicated, gene conversion through recombination can homogenize
the copies and obscure estimates of their age and rate of divergence
Lynch & Conery, Science 2000
* Identified gene duplicates (BLAST) in 9 taxa
* Dated duplicates based on # of silent substitutions (molecular clock)
Ks (sometimes called dS): # of silent (synonymous) substitutions per synonymous site,
i.e. codon changes that encode the SAME amino acid
* often these changes are ASSUMED to be neutral**
* given a constant rate of point mutations, Ks can be used to date a sequence
** now people realize that Ks can also be constrained by other things besides codon
Ka (sometimes called dN): # of nonsynonymous substitutions per nonsynonymous site,
i.e. codon changes that encode a DIFFERENT amino acid
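The distinction between synonymous and nonsynonymous changes can be illustrated by comparing two aligned coding sequences codon by codon. This is a deliberately simplified sketch: it reports raw counts of differing codons and does not normalize by the number of synonymous/nonsynonymous sites or correct for multiple hits, as real Ks/Ka estimators (for example the Nei-Gojobori method) do. The function name and example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Count differing codons that are synonymous vs. nonsynonymous.

    Both sequences must be aligned, in frame, gap-free and of equal length.
    Returns a tuple (synonymous, nonsynonymous).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must have equal length divisible by 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid: silent change
        else:
            nonsynonymous += 1   # different amino acid (or stop)
    return synonymous, nonsynonymous


if __name__ == "__main__":
    # GCT -> GCC keeps alanine (silent); AAA -> AGA changes lysine to arginine.
    print(count_codon_differences("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```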
The Ka/Ks ratio: a measure of constraint on coding sequences
If we assume that Ks reflects the underlying neutral rate of change:
Ka/Ks = 1 …. Rate of amino acid-changing substitutions is the SAME as the rate of silent changes
* taken to mean NO constraint on gene sequence
Ka/Ks < 1 …. Rate of amino acid-changing substitutions is LESS than the rate of silent changes
* implies deleterious codon changes were removed by purifying selection
* therefore implies constraint on gene sequence
Ka/Ks > 1 …. Rate of amino acid-changing substitutions is GREATER than the rate of silent changes
* implies codon changes have been selected for by positive selection
Ks can also be used to date the age of sequences according to the
‘molecular clock’ theory
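The three Ka/Ks cases above can be written as a small classification helper. This is a sketch only: the tolerance band around 1 is an arbitrary assumption for illustration, and real analyses use statistical tests rather than a fixed cutoff.

```python
def interpret_ka_ks(ka, ks, tolerance=0.1):
    """Return (ratio, interpretation) for a pair of Ka and Ks estimates.

    tolerance: hypothetical half-width of the band around 1 treated as neutral.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1 - tolerance:
        label = "constraint (purifying selection)"
    elif ratio > 1 + tolerance:
        label = "positive selection"
    else:
        label = "no apparent constraint (neutral)"
    return ratio, label


if __name__ == "__main__":
    # Hypothetical Ka, Ks pairs for three genes.
    for ka, ks in [(0.02, 0.40), (0.30, 0.31), (0.50, 0.20)]:
        ratio, label = interpret_ka_ks(ka, ks)
        print(f"Ka={ka:.2f} Ks={ks:.2f} Ka/Ks={ratio:.2f}: {label}")
```

The interpretation holds only under the stated assumption that Ks reflects the neutral rate.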
Lynch & Conery, Science 2000
* Identified gene duplicates (BLAST) in 9 taxa
* Dated duplicates based on # of silent substitutions (molecular clock)
* Measured several features over ‘time’ (# silent substitutions) to show:
• Duplicates experience a brief window of relaxed constraint before purifying selection is re-imposed
• Average half-life of gene duplicates is ~4 million years
• In yeast and Drosophila: rate of gene duplication is 0.002 - 0.02 per gene per million years,
depending on species (e.g. a genome of 13,000 genes gains ~31 new duplicates per million years)
… may be inflated if gene conversion makes ancient duplicates appear ‘young’
** The estimated rate of gene duplication is on the same order as the rate of new mutations!
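The arithmetic behind these figures can be sketched as follows. The per-genome rate is simply the per-gene rate times the number of genes. Treating the ~4 million year half-life as simple exponential decay is an assumption made here for illustration. Function names are hypothetical.

```python
def new_duplicates_per_genome(rate_per_gene_per_my, n_genes):
    """Expected number of new gene duplicates per genome per million years."""
    return rate_per_gene_per_my * n_genes


def fraction_surviving(age_my, half_life_my=4.0):
    """Fraction of duplicates still present after age_my million years,
    assuming exponential loss with the given half-life."""
    return 0.5 ** (age_my / half_life_my)


if __name__ == "__main__":
    n_genes = 13_000
    for rate in (0.002, 0.02):  # range quoted in the slide
        count = new_duplicates_per_genome(rate, n_genes)
        print(f"rate {rate}: {count:.0f} new duplicates per genome per My")
    # The quoted ~31 duplicates corresponds to this per-gene rate:
    print(f"implied rate for 31 duplicates: {31 / n_genes:.4f} per gene per My")
    for age in (4, 8, 40):
        print(f"after {age} My: {fraction_surviving(age):.4f} of duplicates remain")
```

For 13,000 genes the quoted range gives roughly 26 to 260 new duplicates per million years, and the ~31 figure corresponds to about 0.0024 per gene per million years, near the low end of that range.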
Fate of gene duplicates
1. Lost as a pseudogene
2. Neofunctionalization
3. Subfunctionalization
4. Retained & conserved
* Can be maintained due to advantage of increased dosage
* Can promote regulatory innovation
De novo creation of new genes
1. Retrotransposition (+/- cooption of other sequences)
Often see short flanking repeats due to mechanism of TE integration
[Figure: pre-mRNA -> splicing to remove intron -> reverse transcription by TE polymerases (in CYTOSOL) -> integration into the genome (in NUCLEUS)]
2. Gene duplication into other sequences = chimeric structure/regulation
3. Cooption of non-coding DNA (from introns, intergenic sequence)
4. Horizontal gene transfer (very prevalent in bacteria: Lecture 5)
- also observed from bacterial parasites to insect hosts
Challenge in distinguishing Novel Gene vs. missed orthology due to rapid evolution