Gene Expression and Evolution
Why Are Evolutionists Interested in Gene Expression?
• Divergence in gene expression can underlie
differences between taxa
• Gene expression data enable critical tests of
several long-standing evolutionary concepts
(e.g., tradeoffs)
• Gene expression levels are heritable and can
be treated as bona fide quantitative traits
Techniques for Studying Gene Expression
• Traditional methods
- Western blot (protein level)
- Northern blot (mRNA level)
- RNase protection assay (mRNA level)
• PCR-based
- Semi-quantitative RT-PCR (mRNA level)
- Quantitative real-time RT-PCR (mRNA level)
• Genomic approaches
- Proteomics (protein level)
- Sequence counting techniques (mRNA level)
- Microarrays (mRNA level)
What Is Microarray Technology?
A high-throughput method for simultaneously
measuring mRNA abundances for
thousands of genes.
Thousands of probes (features) are
attached to a solid substrate at
known x,y coordinates.
How Do Microarrays Work?
Hybridization Technique
- RNA is isolated from a
cell line or tissue of
interest, processed,
labeled, and hybridized
to probes.
- Label intensity at a
given location on the
substrate correlates with
the amount of a
particular transcript
expressed in the cell line
or tissue
Array Fabrication
• Many methods… a detailed discussion is beyond
the scope of this lecture
• Array fabrication typically involves using robotic
workstations to attach the appropriate nucleotide
sequences to a substrate. The length of the
sequences, spatial arrangement of sequences on
the grid, and nature of the substrate all vary
• Well-designed arrays give multiple estimates for a
given gene and spread these estimates across the
substrate
Array Processing
• Hybridize processed and labeled RNA
samples to the array
- Denature
- Put in conditions that promote hybridization
- Wash
• Scan arrays with laser (Excite/Detect label)
• Image processing and spot quantification
Background
• The basic problem is that, even after
washing, the amount of non-specific
label remains uneven across the
substrate
• Background correction seeks to make
intensities from any two parts of the
array comparable by estimating and
accounting for this unevenness
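The idea of background correction can be sketched in a few lines. This is only a minimal illustration, assuming each spot comes with a foreground intensity and a local background estimate (for example, from pixels surrounding the spot). The function name, the `floor` parameter, and the intensities are hypothetical, and real pipelines use more elaborate models than plain subtraction.

```python
def background_correct(foreground, background, floor=1.0):
    """Subtract each spot's local background estimate from its foreground.

    `floor` is an assumption of this sketch: it keeps corrected values
    positive so they can be log-transformed later.
    """
    if len(foreground) != len(background):
        raise ValueError("foreground and background must have the same length")
    return [max(fg - bg, floor) for fg, bg in zip(foreground, background)]


# Hypothetical intensities: the first two spots carry the same true signal
# but sit in a clean and a smeared region of the array; the third spot is
# dimmer than its surroundings.
foreground = [1200.0, 1500.0, 90.0]
background = [100.0, 400.0, 120.0]
corrected = background_correct(foreground, background)
print(corrected)
```

After correction the first two spots agree, which is the sense in which intensities from different parts of the array become comparable.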
Normalization
• Even after background correction,
comparisons must still be made
between arrays
• Normalization seeks to remove variation
between arrays that is due to technical
sources (e.g., scanning, batch effects,
etc.)
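One widely used way to remove array-to-array technical variation is quantile normalization, which forces every array to share the same intensity distribution. The sketch below is an illustration of that one technique, not necessarily the method intended in this lecture. The function name and data are hypothetical, and ties are broken by position for simplicity.

```python
def quantile_normalize(arrays):
    """Give every array the same intensity distribution.

    `arrays` is a list of equal-length lists, one list of expression
    values per array, with genes in the same order on every array.
    """
    n_genes = len(arrays[0])
    if any(len(a) != n_genes for a in arrays):
        raise ValueError("all arrays must measure the same genes")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # Gene indices ordered from lowest to highest intensity on this array.
        order = sorted(range(n_genes), key=lambda i: a[i])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical data: array B was scanned "brighter" than array A.
array_a = [2.0, 4.0, 6.0]
array_b = [9.0, 5.0, 7.0]
norm_a, norm_b = quantile_normalize([array_a, array_b])
print(norm_a, norm_b)
```

Each gene keeps its rank within its own array, but the overall brightness difference between the two arrays disappears.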
Creating an “Expression Measure”
• Well-designed arrays have multiple features
interrogating a given transcript
• This dilutes the contribution of aberrant spots and is
likely to result in more accurate estimates of gene
expression
• These values must be summarized into a single
“expression measure” per transcript
• Some strategies down-weight values that are farther
from the mean
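As an illustration of down-weighting, the sketch below uses a one-step Tukey biweight, a robust estimator of the kind used by some summarization algorithms. Note that it centers on the median rather than the mean. The constants `c` and `epsilon`, the function name, and the probe values are assumptions of this sketch.

```python
import statistics


def tukey_biweight(values, c=5.0, epsilon=1e-4):
    """Robust average: values far from the median get little or no weight."""
    center = statistics.median(values)
    # Median absolute deviation, a robust measure of spread.
    mad = statistics.median([abs(v - center) for v in values])
    weights = []
    for v in values:
        u = (v - center) / (c * mad + epsilon)
        # Weight falls smoothly to zero as |u| approaches 1.
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical log2 intensities from five features targeting one transcript;
# the last feature is an aberrant spot.
probes = [8.1, 8.3, 8.2, 8.0, 12.5]
summary = tukey_biweight(probes)
plain_mean = statistics.mean(probes)
print(summary, plain_mean)
```

The plain mean is pulled upward by the aberrant spot, while the robust summary stays close to the four consistent features.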
Sources of Variation in Microarray Experiments
• Technical (Bad)
- RNA quality
- Dye biases
- Stochasticity during scanning, image processing
- Nonspecific hybridization (e.g., paralogs of gene
families)
- Errors during probe synthesis or deposition
- Stochasticity in labeling targets
• Biological
- Experimental treatments
- Individual variation... may or may not be good
Designing Experiments
• The goal of most array experiments is to compare
RNA abundances between groups of interest (e.g.,
across populations, environmental conditions, or
developmental stages)
• Like all exercises in experimental biology… this
involves careful consideration of:
- How to remove extraneous sources of variation
- How to collect and analyze the data
Identifying Interesting Genes
• How can one objectively state that transcript levels for a
given gene differ among the groups of interest?
• Statistics!
- Allows one to attach a numerical value (a p-value) to the
probability of seeing a difference at least this large if gene
expression among groups were the same
- Ultimately, one describes differential expression in
terms of probabilities
• Examples of Statistical Tests (t-test, ANOVA, linear
regression)
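A per-gene test can be sketched as follows. To stay within the Python standard library, this example computes a Welch t statistic but obtains the p-value from an exact permutation test rather than from the t distribution. That substitution, the function names, and the expression values are all assumptions of the sketch.

```python
import itertools
import math
import statistics


def welch_t(group_a, group_b):
    """Welch t statistic (assumes each group has nonzero variance)."""
    var_a = statistics.variance(group_a) / len(group_a)
    var_b = statistics.variance(group_b) / len(group_b)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    return diff / math.sqrt(var_a + var_b)


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    total = sum(pooled)
    hits = count = 0
    # Enumerate every way of relabeling the samples into two groups.
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / count


# Hypothetical log2 expression of one gene in two groups of four samples.
group_a = [7.9, 8.1, 8.0, 8.2]
group_b = [9.0, 9.3, 9.1, 9.2]
t_stat = welch_t(group_a, group_b)
p_value = permutation_p_value(group_a, group_b)
print(t_stat, p_value)
```

With complete separation of the two groups, only 2 of the 70 possible relabelings are as extreme as the observed one, so the p-value is 2/70.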
The Burden of Multiple Testing
A given microarray may have over 40,000 probes!
This means that one may run > 40,000 statistical tests.
If α = 0.05, then 1 out of every 20 tests in which the
null hypothesis is true is expected to be rejected by
chance alone.
If one runs 40,000 tests and the null hypothesis is true
for all of them, then by chance alone one
will reject ~ 40,000 x 0.05 = 2000 true null
hypotheses (i.e., one will have ~ 2000 false
positives)
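The arithmetic above, together with two common remedies, can be sketched as follows. The Bonferroni and Benjamini-Hochberg procedures are standard corrections but are not named in this lecture. They are included only as an illustration, and the function names and p-values are hypothetical.

```python
def expected_false_positives(n_true_nulls, alpha):
    """Expected number of true null hypotheses rejected by chance."""
    return n_true_nulls * alpha


def bonferroni_threshold(n_tests, alpha):
    """Per-test threshold that caps the chance of any false positive at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of tests declared significant while controlling the FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its own threshold.
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


print(round(expected_false_positives(40_000, 0.05)))
print(bonferroni_threshold(40_000, 0.05))

# Hypothetical p-values for five genes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
significant = benjamini_hochberg(p_values, fdr=0.05)
print(significant)
```

Bonferroni is very strict with tens of thousands of tests, which is one reason false discovery rate procedures are popular for array data.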
Gene Ontology & Biological Categorization
• Microarray datasets can be intimidating because they contain
A LOT of information
• Even experts on a system can be overwhelmed by the number of
genes that are differentially regulated in some experiments
• Having a standardized nomenclature that places a gene into one
or more biological contexts can be invaluable when one is trying
to make sense out of data on thousands of genes
Gene Ontology is a standardized
hierarchical nomenclature that
classifies genes under three
broad categories: biological process,
molecular function, and cellular component
Visualization, Categorization, & Multivariate Statistics
• Clustering
• Principal Component Analysis
• Classification
• Discriminant Analysis
[Figure from PNAS 102(21)]
• Machine Learning
Transcriptional Networks & Graph Theory
From Nature Genetics 41(5)
Comparisons Across Taxa
• Comparisons are often made between closely
related taxa using array technology
• Such comparisons can yield fascinating
insights into gene expression differences
between species
• However, sequence divergence between
species in the gene regions targeted by
microarray probes can be a major hurdle to
data interpretation
Heterologous Hybridization
• Hybridizing RNA isolated from one species to an
array whose probes were designed from another
species
• Major concerns are cross-hybridization (i.e.,
non-specific hybridization) and poor hybridization
due to sequence mismatch, both of which reduce the
correlation between signal and transcript abundance
• Care must be taken to identify conserved
features on the array
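One simple way to flag conserved features is to compare each probe with the corresponding sequence from the species being hybridized and keep only probes above an identity cutoff. The sketch below assumes the sequences are already aligned and of equal length. The 95% cutoff, the function names, and the sequences are hypothetical.

```python
def percent_identity(probe, target):
    """Percent of matching positions between two pre-aligned sequences."""
    if len(probe) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for p, t in zip(probe.upper(), target.upper()) if p == t)
    return 100.0 * matches / len(probe)


def conserved_probes(probes, orthologs, min_identity=95.0):
    """Names of probes whose ortholog sequence meets the identity cutoff."""
    return [
        name
        for name, seq in probes.items()
        if name in orthologs
        and percent_identity(seq, orthologs[name]) >= min_identity
    ]


# Hypothetical 20-mer probes (array species) and the matching region
# in the species whose RNA is being hybridized.
probes = {
    "geneA": "ACGTACGTACGTACGTACGT",
    "geneB": "TTGACCGTAGGCTAACGTCA",
}
orthologs = {
    "geneA": "ACGTACGTACGTACGTACGT",
    "geneB": "TTGACCTTAGGATAACGACA",
}
kept = conserved_probes(probes, orthologs)
print(kept)
```

Here the geneB probe has three mismatches in 20 bases (85% identity) and is dropped, so its signal would not be used in cross-species comparisons.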
eQTLs & Genetical Genomics
From Skelly et al. 2009
Conclusions of eQTL Studies
• Transcriptional variation is often
heritable
• Heritable transcriptional variation is
frequently polygenic and often has a
complex genetic architecture
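The core calculation behind an eQTL scan can be sketched as a regression of a transcript's expression level on genotype at one marker. This is a minimal single-marker illustration, assuming a backcross-style design with genotypes coded 0 or 1. The function name and data are hypothetical, and real studies repeat this across many markers and transcripts.

```python
import statistics


def marker_association(genotypes, expression):
    """Least-squares fit of expression on genotype; returns (slope, r_squared)."""
    mean_g = statistics.mean(genotypes)
    mean_e = statistics.mean(expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((e - mean_e) ** 2 for e in expression)
    slope = sxy / sxx                       # expression change per allele
    r_squared = (sxy * sxy) / (sxx * syy)   # variance explained by the marker
    return slope, r_squared


# Hypothetical data: genotype at one marker (0 or 1) and the log2
# expression of one transcript in six individuals.
genotypes = [0, 0, 0, 1, 1, 1]
expression = [5.0, 5.2, 4.8, 6.9, 7.1, 7.0]
slope, r_squared = marker_association(genotypes, expression)
print(slope, r_squared)
```

A large fraction of variance explained by a single marker would suggest a major-effect locus. Polygenic architectures, as described above, show many markers with small individual effects.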
Examples From the Voss Lab
[Figure: salamander life cycles. Egg, then aquatic larva; the eastern tiger salamander undergoes metamorphosis into a terrestrial adult, whereas the Mexican axolotl becomes an aquatic adult (paedomorphosis).]
Parental Species: Growth
[Figure: growth plot with series labeled Axolotl and Tiger; R2 = 0.957]
Mexican Axolotl vs. Eastern Tiger Salamander
• Whole brain from axolotl and eastern tiger salamander
• Sampled at 2 week intervals (42, 56, 70, 84 DPH)
• Three replicate chips per species per time point
• Three animals (brains) per chip
Parental Species: Gene Expression in the Brain
[Figure: panels labeled Axolotl, Tiger, NGFRm/m, and NGFRm/t. From Voss and Smith 2005]
Backcross: Growth
[Figure: growth plot with series labeled NGFRm/m and NGFRm/t; R2 = 0.972]
Comparative Genomics
Backcross: Gene Expression in the Brain
[Figure: series labeled NGFRm/m and NGFRm/t]
Backcross: Gene Expression in the Brain
Finer Scale Local Map