Anatomy of a Genome Project
A. Sequencing
1. De novo vs. ‘resequencing’
2. Sanger WGS versus ‘next generation’ sequencing
3. High versus low sequence coverage
B. Assembly
1. Draft assembly
2. Gap closure
C. Annotation
1. Gene, intron, RNA prediction
2. De novo vs. homology-based prediction
3. Assessing confidence
D. Comparison
1. Comparing gene content, lineage specific gene loss, gain, emergence
2. Comparing genome structure (chromosomes, breakpoints, etc)
3. Comparing evolutionary rates of change (rates of amino-acid, nucleotide substitution)
Anatomy of a Genome Project: non-Model challenges
A. Sequencing
1. De novo vs. ‘resequencing’ … resequencing not possible without a close, syntenic relative
2. Sanger WGS versus ‘next generation’ sequencing
3. High versus low sequence coverage … need high coverage and long reads (or mate-pair reads to assemble)
B. Assembly
1. Draft assembly
2. Gap closure … time consuming no matter what
C. Annotation
1. Gene, intron, RNA prediction
2. De novo vs. homology-based prediction
3. Assessing confidence
De novo predictions challenging if gene models are different in your species … can rely less on homology for identifications and assessing confidence
D. Comparison
1. Comparing gene content, lineage specific gene loss, gain, emergence
2. Comparing genome structure (chromosomes, breakpoints, etc)
3. Comparing evolutionary rates of change (rates of amino-acid, nucleotide substitution)
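The "high versus low sequence coverage" item in the sequencing step can be made concrete with a back-of-the-envelope calculation. The sketch below assumes the idealized Lander-Waterman (Poisson) model, in which reads land uniformly at random; real data with repeats and sequencing bias will deviate from it. The function names and the example numbers are illustrative, not taken from the slides.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Mean depth of coverage c = N * L / G (reads x read length / genome size)."""
    return num_reads * read_length / genome_size


def expected_uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases with zero reads under a Poisson model: e^(-c)."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical project: 10 million 100-bp reads on a 100-Mb genome.
    c = mean_coverage(num_reads=10_000_000, read_length=100, genome_size=100_000_000)
    print(f"mean coverage: {c:.1f}x")
    print(f"expected uncovered fraction: {expected_uncovered_fraction(c):.2e}")
```

Under this simplified model the uncovered fraction falls off exponentially with depth, which is one reason de novo projects aim for high coverage before assembly.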
The power of comparison
For many non-model organisms, most of the predicted genes will be uncharacterized & may not have homology to known genes.
But comparison within and between species can still reveal interesting features:
1. Comparing gene content, lineage specific gene loss, gain, emergence
2. Comparing genome structure (chromosomes, breakpoints, etc)
3. Comparing evolutionary rates of change (rates of amino-acid, nucleotide substitution)
4. Comparing population data (SNPs, expression response, phenotypic variation … mapping studies)
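As one simple illustration of "comparing evolutionary rates of change", the sketch below estimates nucleotide divergence between two aligned sequences. It uses the Jukes-Cantor (JC69) correction, which assumes equal base frequencies and equal substitution rates; the slides do not name a specific model, so this is a stand-in for whatever model an actual analysis would use. The function names are illustrative.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Alignment columns containing a gap ('-') or an ambiguous base ('N') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jukes_cantor_distance(p: float) -> float:
    """JC69 corrected distance: d = -(3/4) * ln(1 - 4p/3), in substitutions per site."""
    if p >= 0.75:
        raise ValueError("p-distance too large: sequences are saturated under JC69")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Toy aligned sequences, made up for illustration.
    p = p_distance("ACGTACGTAC", "ACGTACGTAT")
    print(p, jukes_cantor_distance(p))
```

The corrected distance is always at least as large as the raw proportion of differences, because it accounts for multiple substitutions at the same site.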
Science April 25, 2014
Tsetse fly: blood-feeding insect that gives birth to live larvae & ‘lactates’
- 366 Mb genome = double the size of Drosophila melanogaster
- Identified orthologs across 5 insects … comparison of ortholog presence/absence suggests unique evolutionary trajectories
- Blood feeding evolved independently 12 times in Diptera … identified shared proteins unique to several blood-suckers
- Some gene families have been expanded, others contracted in numbers … functional annotations (“GO” = gene ontology predictions) suggest selection
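The ortholog presence/absence comparison described above reduces, in its simplest form, to set operations on ortholog-group membership. The sketch below is a toy version: the species labels, the ortholog group IDs ("OG1" and so on), and the presence/absence table are all invented for illustration and are not data from the tsetse paper.

```python
from typing import Dict, Set


def shared_and_unique(presence: Dict[str, Set[str]], ingroup: Set[str]) -> Set[str]:
    """Ortholog groups present in every ingroup species and absent from all others."""
    in_all = set.intersection(*(presence[sp] for sp in ingroup))
    outgroup = [groups for sp, groups in presence.items() if sp not in ingroup]
    in_any_other = set.union(*outgroup) if outgroup else set()
    return in_all - in_any_other


def lost_in(presence: Dict[str, Set[str]], species: str) -> Set[str]:
    """Ortholog groups present in every other species but missing from `species`."""
    others = [groups for sp, groups in presence.items() if sp != species]
    return set.intersection(*others) - presence[species]


# Hypothetical presence/absence table: species -> ortholog groups detected.
PRESENCE = {
    "tsetse": {"OG1", "OG2", "OG4", "OG5"},
    "mosquito": {"OG1", "OG2", "OG3", "OG5"},
    "fruit_fly": {"OG1", "OG3", "OG4"},
    "honey_bee": {"OG1", "OG3", "OG4"},
}

if __name__ == "__main__":
    blood_feeders = {"tsetse", "mosquito"}
    print(sorted(shared_and_unique(PRESENCE, blood_feeders)))  # ['OG2', 'OG5']
    print(sorted(lost_in(PRESENCE, "tsetse")))                 # ['OG3']
```

In practice, an apparent absence can reflect an incomplete assembly or annotation rather than true gene loss, so calls like these are candidates to check, not conclusions.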
- Sequenced 4 bat genomes & compared orthologs across 22 mammals
- Used phylogenetic analysis and protein trees to identify cases of lineage-specific evolution
To detect convergent evolution, look for proteins with unusual sequence relationships.
Found ~2,300 genes with signatures of convergent evolution.
* enriched for genes linked to hearing, ear development, and … vision
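An "enriched for" statement like the one above is commonly backed by an over-representation test. The slides do not say which test was used; the sketch below assumes a one-sided hypergeometric test, which is one standard choice. The counts in the usage example are made-up toy numbers, not values from the study.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    population: total genes considered
    annotated:  genes in the population carrying the annotation (e.g. 'hearing')
    sample:     genes in the candidate set (e.g. genes flagged as convergent)
    overlap:    candidate genes that carry the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 1000 genes, 50 annotated, 100 candidates, 12 of them annotated.
    print(hypergeom_enrichment_p(population=1000, annotated=50, sample=100, overlap=12))
```

When many annotation categories are tested at once, the resulting p-values need a multiple-testing correction before any category is called enriched.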
Evolutionary Genetics Recap
Week 1: Phylogeny Primer, Anatomy of a genome project
Week 2: Orthology, paralogy, and gene history
Week 3: Evolution of new gene functions
Week 4: Horizontal gene transfer
Week 5: Whole genome duplication
Week 6: Molecular evolution - Part I
Week 7: Molecular evolution - Part II: genome-wide scans
Week 8: QTL mapping
Week 9: Evolution of gene expression
Week 10: eQTL mapping & the cis-trans debate
Week 11: Evolution of cis-regulatory motifs
Week 12: Transcriptional Network Evolution
Week 13: Evolutionary systems biology & Proteomics
Week 14: Evolutionary biology in non-model organisms
Evolutionary Genetics: Recurring Themes
* Duplication facilitates change
- Duplications can be tandem, segmental, or whole genome
- Most duplications are lost quickly through neutral (or selective) processes
- Facilitates subfunctionalization and neofunctionalization
- Baker et al. 2013 paper: paralog interference could drive evolution
- Benefits of duplication operate at all levels
  - Gene duplication for novel functions
  - Gene duplication for novel regulation
  - Gene duplication for novel network rewiring
  - Regulatory element duplication for novel gene regulation
  - Regulatory protein duplication for novel module regulation
  - Regulatory system duplication for novel network rewiring
* Biological systems are more plastic than we might think
- Much of the genome is under evolutionary constraint
  → purifying selection removes variation
- Many features of cellular systems appear to evolve, even if the cellular function or output is conserved
  → stabilizing selection can explain poor conservation of important features, if the cell finds a ‘quick fix’ to maintain the phenotype
  Examples: pervasive evidence of positive selection in fly and rodent coding genes … transcription factor binding-site turnover … phospho-site turnover … genetic/protein rewiring??
  → strongest constraints may promote wholesale rewiring as stabilizing evolution (e.g. rewiring of ribosomal protein regulon)
- De novo genes also appear to emerge frequently from the genomic ether
* Evolutionary pressures vary over time and space
- Neutral variation can suddenly become advantageous … therefore accumulation of neutral variation can be a conduit for future adaptation
- Deleterious polymorphisms can be stabilized in the presence of other polymorphisms
- Splitting up alleles by recombination can unmask deleterious alleles
* Use a model for null/neutral expectation for your tests
- Likelihood ratio: comparing how likely one model is versus another
  - QTL analysis
  - motif model vs background model
  - selection model vs neutral model
  - etc.
- Random sampling or simulations to assess what you expect by chance
- More complicated simulations (e.g. coalescence)
This is especially true for whole-genome scans … many things look striking until you do the statistics
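Two of the ideas above, a likelihood ratio between a motif model and a background model, and random sampling to see what is expected by chance, can be combined in a small sketch. The 4-position motif matrix, the uniform background, and all function names are hypothetical. Shuffling the sequence is only one possible null: it preserves base composition but ignores higher-order structure.

```python
import math
import random

# Hypothetical background model: uniform base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# Hypothetical motif model (consensus ATGC): per-position base probabilities.
MOTIF = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]


def log_likelihood_ratio(window, motif=MOTIF, background=BACKGROUND):
    """log [ P(window | motif model) / P(window | background model) ]."""
    return sum(math.log(column[base] / background[base]) for column, base in zip(motif, window))


def best_window_score(seq, motif=MOTIF, background=BACKGROUND):
    """Highest log-likelihood ratio over all windows of motif width in seq."""
    width = len(motif)
    return max(
        log_likelihood_ratio(seq[i:i + width], motif, background)
        for i in range(len(seq) - width + 1)
    )


def empirical_p_value(seq, n_shuffles=1000, seed=0):
    """Fraction of shuffled sequences scoring at least as high as the observed one."""
    rng = random.Random(seed)
    observed = best_window_score(seq)
    bases = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        if best_window_score("".join(bases)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    sequence = "TTATGCTT"  # toy sequence containing the consensus
    print(best_window_score(sequence), empirical_p_value(sequence))
```

The same pattern, score under two models and then compare against a distribution generated under the null, carries over to the QTL and selection-scan cases listed above, with different models plugged in.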
* Value of a phylogenetic perspective
- use the tree if you have one
  * may not be the same tree across the entire genome
- inferring the state of the common ancestor can aid in analysis
Can be very useful for inferring evolutionary trajectory, timing, order of events
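One simple way to infer the state of a common ancestor is Fitch parsimony. The slides do not name a method, so treat this as one option among several (likelihood-based reconstruction is another). The sketch below implements only the bottom-up pass for a single character on a rooted binary tree; the tree shape, species labels, and character states are invented.

```python
def fitch(tree, leaf_states):
    """Bottom-up Fitch pass on a rooted binary tree.

    tree:        a leaf name (str) or a 2-tuple of subtrees
    leaf_states: mapping from leaf name to observed character state
    Returns (set of most-parsimonious candidate states at this node,
             minimum number of state changes in the subtree).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


if __name__ == "__main__":
    # Hypothetical tree (((sp1, sp2), sp3), sp4) and one aligned nucleotide site.
    toy_tree = ((("sp1", "sp2"), "sp3"), "sp4")
    toy_states = {"sp1": "A", "sp2": "A", "sp3": "G", "sp4": "G"}
    print(fitch(toy_tree, toy_states))  # ({'G'}, 1)
```

Here the root is inferred to be G with a single change on the branch leading to sp1 and sp2, which also gives the order of events: the A state is derived.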
* Control for covariates
Example: controlling for expression levels when analyzing the rate of protein evolution
Often hard to know what to even look/control for
* Best evidence if >1 test is significant
* Know your dataset
Know how the data were collected and what types of noise are associated with them,
e.g. genome sequences by short-read deep sequencing, protein-protein interaction data
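Controlling for a covariate such as expression level can be sketched as a partial correlation: regress both variables on the covariate and correlate the residuals. This assumes linear relationships and a single covariate, which may not hold for real data (rank-based or multivariate approaches are common alternatives). The function names and the toy numbers are illustrative only.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, covariate):
    """Residuals of y after ordinary least-squares regression on one covariate."""
    mc, my = _mean(covariate), _mean(y)
    slope = sum((c - mc) * (v - my) for c, v in zip(covariate, y)) / sum(
        (c - mc) ** 2 for c in covariate
    )
    intercept = my - slope * mc
    return [v - (intercept + slope * c) for c, v in zip(covariate, y)]


def partial_correlation(x, y, covariate):
    """Correlation between x and y after removing the linear effect of the covariate."""
    return pearson(residuals(x, covariate), residuals(y, covariate))


if __name__ == "__main__":
    # Made-up values: a gene feature, an evolutionary rate, and expression level.
    feature = [1.0, 2.5, 2.9, 4.2, 5.1, 5.8]
    rate = [9.8, 8.1, 7.2, 5.9, 5.2, 3.7]
    expression = [1, 2, 3, 4, 5, 6]
    print("raw correlation:    ", pearson(feature, rate))
    print("partial correlation:", partial_correlation(feature, rate, expression))
```

If the raw correlation is strong but the partial correlation is near zero, the apparent relationship may be explained by the covariate rather than by the feature itself.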
Evolutionary Genetics: Remaining Questions & Challenges
Epistasis & Environmental interactions
- how much does epistasis contribute in nature?
- challenges associated with gene-gene/gene-environment signals
Detecting signatures of selection, esp. recent/transient
- human evolution
- how will tests, statistics, caveats change with 10,000 genomes?
What is the relative contribution of adaptive vs. neutral evolution?
What is the relative contribution of regulatory vs. coding evolution?
What features contribute to the evolution of new forms and functions?