Review of “Transposable elements have rewired the core regulatory network of human embryonic stem cells”
Nature Genetics, 42(7), 631-635 (2010).
Bradly Alicea
http://www.msu.edu/~aliceabr
[Figure labels: RNA structure; Mutagenic screens using transposons]
Introduction
H: transposable elements provide a means for cells to maintain
“stemness” in the face of genomic chaos (e.g. evolution).
“Stemness”: genes that maintain the stem
cell state (or pluripotency).
[Figure: taxa A, B and C with example sequences AAGAGTGCT, AACAGTGCT and AATAGTGCA]
* pluripotency is ubiquitous among eukaryotes (stemness is a meta-trait).
Three hallmarks/outcomes of evolution:
1) diversity (mutational change, recombination).
2) homology (shared derived characteristics).
3) “junk” DNA, genomic “dark matter”. Vestigial retro-elements, genes, etc.
Large-scale changes are less common –
usually require large amounts of
evolutionary time (distant taxa).
Transposon definition
Two classes:
1) Retrotransposons (transposons that move via RNA intermediates)
* viral (HIV, so-called “junk” DNA).
http://www.microbiologybytes.com/blog/tag/retrovirus/
* LINEs, SINEs (long and short interspersed nuclear elements)
2) DNA transposons (specific, non-specific binding)
* cut and paste
* copy and paste
“Mobile Genetic Elements of Malaria Vectors and Other Mosquitoes” In M. Curie Bioscience Database, 2000.
http://www.microbiologybytes.com/virology/Retroviruses.html
Hypotheses w.r.t. transposons:
LINEs, SINEs are symbionts of eukaryotic genomes (Cytogenetics and Genome Research, 110(1-4), 475-490 - 2005).
Human genome: 42% retrotransposons, 2-3% DNA transposons (Nature, 409, 860-921 - 2001).
Transposon definition (con’t)
RIGHT: reciprocal translocations and large inversions, Genes & Development, 23, 755-765 (2009).
Maize adh1 gene:
* Mu3-induced mutation (Mu3 transposon, affects 430 bases in promoter region).
* adh1 expression vs. WT: upregulated in organ A, downregulated in organ B,
unchanged in organ C (mosaic-like).
* Fragmentation model for allelic diversity generation: promoter “scrambled”
during insertion and excision of transposon, expression pattern different from
insertion mutant or WT.
Transposon definition (con’t)
Q: why do transposons have a function in the genome? Not clear.
* example of functional role: Allen et al., Nature Structural and Molecular Biology, 11(9), 816 (2004).
Response to heat shock (transient exposure of cells to high temperatures):
* generalized response (rapid change in gene expression, chaperone activity).
* SINEs can encode B2 RNA in mouse (a non-coding RNA transcribed by Pol III), which orchestrates global downregulation of genes during the generalized response.
Other potential functions:
1) Soper et al., “Mouse Maelstrom, a component of nuage, is essential for spermatogenesis and transposon repression in meiosis” (Developmental Cell, 2008).
2) Aravin et al., “The PIWI piRNA pathway provides an adaptive defense in the transposon arms race” (Science, 2007).
* transposons are “silenced” (default state), not representative of normal
functional variation. Can be further excised, silenced via recombination, splicing.
Results
In silico (bioinformatic) approach: Authors generated matching datasets: human
(ESCs) vs. mouse (ESCs).
* define a cross-species case (homologous, paralogous, conserved genes).
* Oct4 and NANOG (regulate “stemness”, experimental condition).
* CTCF (organizes regulatory blocks - chromosomal regions spanned by highly conserved
non-coding elements). CTCF serves as a control.
In vivo occupancy profiles for Oct4 and NANOG are notably different in ESCs of different species.
Frame b) percentile-wise partition of binding site conservation between species (top 10% to bottom 10%).
* conservation decreases massively across percentiles for CTCF; for Oct4 and NANOG it decreases from an already tiny amount of homology.
Results (con’t)
ASSUMPTION: Low evolutionary conservation + similar DNA binding motifs
= small sequences that move around in genome.
* k-mers that are not in same genomic location when comparing species.
1) ChIP-Seq libraries generated for three factors, determined genome-wide
occupancy profile (full set of binding regions) in hESCs.
* enabled analysis of loci across range of enrichment levels.
* de novo motif-finding method used as confirmatory method.
* high similarity of DNA-binding specificity in human and mouse.
2) What proportion of regions occupied in one species is also occupied in the
other -- or, what is p(A|B)?
* for 1kb windows (OCT4 = 2%, NANOG = 1.9%, CTCF = 16.7%).
* for top 10% most enriched regions, changes in conservation (OCT4 = 3.8%,
NANOG = 5.3%, CTCF = 49.6%).
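The conditional occupancy in point 2 can be read as the share of regions bound in one species that overlap a bound region in the other, once both sets are in a common coordinate system. The sketch below is only an illustration with made-up intervals: the function name `fraction_shared`, the toy coordinates, and the assumption that cross-species coordinate mapping has already been done are all introduced here and are not taken from the paper.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open interval


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_shared(query: List[Region], reference: List[Region]) -> float:
    """Fraction of query regions that overlap at least one reference region."""
    if not query:
        return 0.0
    shared = sum(any(overlaps(q, r) for r in reference) for q in query)
    return shared / len(query)


# Toy data (hypothetical): 1 kb windows bound in human, and mouse-bound
# windows assumed to be already mapped onto human coordinates.
human = [("chr1", 100, 1100), ("chr1", 5000, 6000),
         ("chr2", 300, 1300), ("chr3", 0, 1000)]
mouse_mapped = [("chr1", 900, 1900), ("chr2", 5000, 6000)]

print(fraction_shared(human, mouse_mapped))   # share of human regions also bound in mouse
print(fraction_shared(mouse_mapped, human))   # share of mouse regions also bound in human
```

The two calls give different values on the same data, which is why the direction of the conditional, p(A|B) versus p(B|A), has to be stated alongside any conservation percentage.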
Results (con’t)
K-mer: a sequence of length k. Example: AACATTGGT (k = 9). In this paper, k-mers are repeats that are mobile w.r.t. promoters of stemness genes.
* shorter k-mers, greater number of matches, more false positives.
* longer k-mers, smaller number of matches, fewer false positives.
In this paper (using word search):
* “and” (k = 3) appears 157 times. Many different contexts.
* “transposable” (k = 12) appears 15 times. Always matched to “element”.
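A minimal sketch of k-mer counting, showing the trade-off described above: a short k-mer matches at several positions, a long one at few. The function name `count_kmers` and the toy sequence are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy sequence (hypothetical): the slide's 9-mer followed by two near-copies.
seq = "AACATTGGT" + "AACATTGGA" + "AACATT"

short = count_kmers(seq, 3)
long_ = count_kmers(seq, 9)

print("AAC occurrences:", short["AAC"])
print("AACATTGGT occurrences:", long_["AACATTGGT"])
```

A `Counter` is used so that k-mers absent from the sequence simply report a count of zero.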
Conserved Binding Regions:
* homologous regions determined by various window sizes.
* generally, conservation increases as window size increases (but not as much as effect of specific genes).
Results (con’t)
Transposable elements = rich source for new binding sites? IDed specific transcription
factor-repeat associations that were more common than chance.
* 767 LTR9B repeats from endogenous retrovirus 1 (ERV1), 255 of these bound by OCT4 (move
around genome, Oct4 binding activity follows them).
* 82-fold enrichment (Observed = 33.2%, Expected = 0.4%), an example of repeat-associated binding sites (RABS).
Endogenous Retrovirus (ERV) 1
ERV1 = one of the few "active" transposons in human genome (Sela et al. Genome Biology 2010, 11:R59).
Endogenous = direct infection of germline cells. ERVs are thus heritable.
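The enrichment reported on this slide can be checked from the counts given (255 of 767 LTR9B repeats bound by OCT4). In the sketch below, `fold_enrichment` is a name introduced here, and the expected fraction is the rounded 0.4% shown on the slide, so the result should land near the reported 82-fold rather than exactly on it.

```python
def fold_enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Ratio of observed to expected fraction of bound repeats."""
    if expected_fraction <= 0:
        raise ValueError("expected_fraction must be positive")
    return observed_fraction / expected_fraction


bound, total = 255, 767      # counts from the slide
observed = bound / total     # fraction of LTR9B repeats bound by OCT4
expected = 0.004             # rounded value from the slide (assumption: unrounded value not available)

fold = fold_enrichment(observed, expected)
print(f"observed = {observed:.1%}, fold enrichment = {fold:.1f}")
```

The small gap between this figure and 82-fold is consistent with the expected value having been rounded to one decimal place on the slide.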
Frame b: Fold-enrichment of Oct4 binding regions (IDed sequence). Categorized by
overlap with NANOG binding region, conserved in vivo, or RABS (4 repeat types).
* comparisons between proximity to up- and down-regulated regions (big effect for downregulated genes that overlap with ERV1).
Results (con’t)
RABS in binding region of selected genes:
Gene     RABS (%)
Oct4     20.9
NANOG    14.6
CTCF     11.1
NOTE: not exclusively in regions that regulate “stemness” genes.
* in fact, true only for a minority of RABS.
* RABS represented among both strongly- and weakly-bound regions of CTCF.
* RABS overrepresented among strongly-bound sites for OCT4, NANOG.
* ERV1 repeat family is largest contributor of RABS for OCT4 and NANOG.
In general, exaptation (see slide #14) among families of transposable elements
should be ubiquitous across evolution but species-specific.
Results (con’t)
Q: how can we be sure that similar binding elements are responsible for the uniform
regulation of “stemness” across species?
Pou5f1 (the gene encoding Oct4) RNAi treatment used to look at conservation issue further:
* following RNAi treatment, main stemness genes downregulated in mouse and human.
* SCGB3A2 downregulated, contains two binding regions in promoter bound by
OCT4 and NANOG which also overlap ERV1 repeats.
Highly expressed gene in human ESCs, but unregulated in mouse ESCs.
* may be due to species-specific transposable elements.
Results (con’t)
Enrichment on either side (+/- 20kb) of transcriptional start site (TSS):
Supplementary Figure 1: as one moves away from the center of bound region, sequence identity approaches 0 asymptotically. Sequence identity at center of bound region much higher in CTCF (green).
* 53% of conserved targets had an OCT4-NANOG binding region.
* 15% were homologously bound in mouse (other genes show evidence of binding site turnover).
AEBP2 (encodes protein in PRC2 complex, important for self-renewal)
* exhibits binding site turnover, proximal promoter site in human overlaps with a repeat
site absent in mouse.
SOX2 (one of Yamanaka factors)
* has very well-conserved binding profile in both humans and mice for all three factors
(exception to this rule).
Results (con’t)
Of the 584 genes that only show downregulation in hESCs, 27% had an OCT4-NANOG binding region.
* fraction of binding regions corresponding to RABS (22.5%) higher than for
conserved targets (12.4%).
* using luciferase assay for activity of two ERV1 RABS, enhancer activity is
ablated if OCT4 motif is mutated.
* many genes rewired into core regulatory network of hESCs with insertion of
transposable elements.
50 genes added to pluripotency network.
21/44 genes and 3/6 TFs (many upregulated in human, downregulated in mouse) with ERV1 RABS added into network (directly regulated by Oct4, NANOG).
Broader Implications
Synteny: the order of genes on a chromosome.
Bioinformatics Blog: http://www.zer00ne.com/tag/synteny/
1) Meiotic recombination:
Father + Mother (shuffled chromosome), synteny preserved.
Consequence: allelic diversity (evolutionary change).
Figure source: Archives of Gen. Psychiatry, 57(12), 1105-1114 (2000).
2) Translocation:
Stretch of chromosome moved, synteny not preserved.
Examples: Bithorax, many Lymphomas.
Consequence: pathology, compromised function.
Figure sources: http://www.uams.edu/radiology/info/clinical/pet/images.asp
http://en.wikipedia.org/wiki/File:Translocation-4-20.png
Action of transposons: more localized (part of regulatory, coding region), synteny
preserved, but still a large-scale change (in that it affects gene expression).
Consequence: adds transcriptional noise, fine-tunes the response of downstream
genes (for good and bad).
Broader Implications (con’t)
Exaptation: co-option of existing structures in evolution, may or may not be
driven by natural selection.
[Diagram labels: Retrotransposon (viral element): information-carrying element with mobility; RESEMBLE BINDING MOTIF; INSERT AT PROMOTER; function-preserving element]
Evolvability: how does an organism evolve certain traits? What provides the
capacity for evolving certain traits?
* why do bats have wings, but not primates (same common ancestor)?
A number of potential mechanisms:
* redundancy: make sure a promoter exists for key “stemness” genes as genome changes
around it. Stochastic process - copy or move binding motif around genome with prob(x).
* “hopeful monster”: (large-scale changes, short evolutionary time – PNAS, 81, 5482).
* neutral processes: moved via genetic drift, hitchhiking (across evolutionary time).
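The redundancy mechanism above (a binding motif copied or moved around the genome with some probability) can be pictured as a simple stochastic process. The sketch below is a toy model only: the names `step` and `simulate`, the per-generation probabilities `p_copy` and `p_loss`, and the genome size are all assumptions introduced here, not quantities from the paper.

```python
import random
from typing import List, Set


def step(sites: Set[int], n_loci: int, p_copy: float, p_loss: float,
         rng: random.Random) -> Set[int]:
    """One generation: each motif copy may be lost and may seed a new copy elsewhere."""
    ordered = sorted(sites)  # fixed iteration order keeps runs reproducible
    survivors = {s for s in ordered if rng.random() >= p_loss}
    new_copies = {rng.randrange(n_loci) for _ in ordered if rng.random() < p_copy}
    return survivors | new_copies


def simulate(generations: int, n_loci: int = 1000, p_copy: float = 0.1,
             p_loss: float = 0.05, seed: int = 0) -> List[int]:
    """Return the number of loci carrying the motif after each generation."""
    rng = random.Random(seed)
    sites = {rng.randrange(n_loci)}  # start from a single motif copy
    history = [len(sites)]
    for _ in range(generations):
        sites = step(sites, n_loci, p_copy, p_loss, rng)
        history.append(len(sites))
    return history


history = simulate(generations=50)
print("motif copies per generation:", history)
```

When `p_copy` exceeds `p_loss` the motif tends to persist somewhere in the genome even though individual copies are lost, which is the sense of "redundancy" in the list above.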
Elitism and Stochasticity, Revisited
Competing models for reprogramming
(stochastic vs. deterministic):
1) stochastic: transformation occurs according to
a variable latency.
* time from trigger to transformation is variable
(cell cycle c = m transformations).
2) deterministic: transformation occurs according to a uniform latency.
* time from trigger to transformation is uniform.
Elite models argue that only a subset (1/n)
of cells will reprogram (innate ability).
This paper: elite, stochastic scenario (iv).
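The contrast between the reprogramming models can be sketched as a small simulation. Everything specific here is an assumption for illustration: the function name `latencies`, the exponential distribution used for the stochastic case, the mean latency, and the `elite_fraction` parameter standing in for the 1/n subset.

```python
import random
from typing import List, Optional


def latencies(n_cells: int, model: str, mean_latency: float = 10.0,
              elite_fraction: float = 1.0, seed: int = 0) -> List[Optional[float]]:
    """Time from trigger to transformation for each cell; None means the cell never reprograms."""
    rng = random.Random(seed)
    out: List[Optional[float]] = []
    for _ in range(n_cells):
        if rng.random() >= elite_fraction:
            out.append(None)              # non-elite cell: lacks the innate ability
        elif model == "deterministic":
            out.append(mean_latency)      # uniform latency for every cell
        elif model == "stochastic":
            # variable latency; the exponential form is an assumption
            out.append(rng.expovariate(1.0 / mean_latency))
        else:
            raise ValueError(f"unknown model: {model}")
    return out


deterministic = latencies(100, "deterministic")
stochastic = latencies(100, "stochastic")
elite_stochastic = latencies(1000, "stochastic", elite_fraction=0.1)

reprogrammed = [t for t in elite_stochastic if t is not None]
print("distinct deterministic latencies:", len(set(deterministic)))
print("distinct stochastic latencies:", len(set(stochastic)))
print("elite cells that reprogram:", len(reprogrammed), "of", len(elite_stochastic))
```

The elite, stochastic case combines both ideas: only a subset of cells transforms at all, and those that do transform after variable delays.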