CS173
Lecture 8: Transcriptional regulation II
MW 11:00-12:15 in Beckman B302
Prof: Gill Bejerano
TAs: Jim Notwell & Harendra Guturu
http://cs173.stanford.edu [BejeranoWinter12/13]
Announcements
• HW1 due today. Thoughts and comments?
• HW2 will be out by midnight
• Halfway feedback today
ATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATA
TATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTC
TAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTC
TGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACT
CTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATG
AATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAA
GCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAAT
TTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAA
CTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGG
TTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGAT
TGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAAT
TTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATG
CGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATC
ATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAA
GAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCA
ATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAA
TTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGA
ATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTT
ATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTTGCGAAGTT
TGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGT
TCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATAC
ATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCT
GCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTA
CGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGA
ATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACA
TCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAAC
GGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAA
CTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTG
GCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTC
TTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAAT
TGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCT
GCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTT
AATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCT
TCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTT
AATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGA
TTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTA
CTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTT
TACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTT
ACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAA
AATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGT
Gene Regulation
Some proteins and non-coding RNAs go “back”
to bind DNA near genes, turning these
genes on and off.
[Figure: gene activation. Labels: Gene, DNA, Proteins.]
Transcription Activation contd.
Transcription Activation
Terminology:
• RNA polymerase
• Transcription Factor
• Transcription Factor Binding Site
• Promoter
• Enhancer
• Gene Regulatory Domain
[Figure labels: TF, DNA]
Transcription activation “loop”
Transcription factors bind DNA and turn different promoters and
enhancers on or off. These in turn switch different genes on or off, some of
which may themselves encode transcription factors. That again changes which
TFs are present in the cell, the state of active promoters/enhancers, etc.
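A toy sketch of this feedback loop as a Boolean network may help make it concrete. The gene names (tfA, tfB, geneX) and the wiring are invented for illustration and are not taken from the lecture; real regulatory networks are neither synchronous nor strictly on/off.

```python
# Toy Boolean network: each gene is ON (True) or OFF (False).
# RULES is a hypothetical wiring: gene -> function of the current cell state.
RULES = {
    "tfA": lambda s: s["tfA"],                 # tfA sustains itself (assumed input)
    "tfB": lambda s: s["tfA"],                 # tfA activates tfB
    "geneX": lambda s: s["tfA"] and s["tfB"],  # geneX needs both TFs present
}

def step(state):
    """Apply all rules synchronously to get the next cell state."""
    return {gene: bool(rule(state)) for gene, rule in RULES.items()}

def run(state, n_steps):
    """Return the list of states visited, starting with the initial one."""
    history = [dict(state)]
    for _ in range(n_steps):
        state = step(state)
        history.append(state)
    return history

trajectory = run({"tfA": True, "tfB": False, "geneX": False}, 3)
for t, s in enumerate(trajectory):
    print(t, s)
```

In this sketch geneX switches on only one step after tfB does, because tfB must first be produced before it can act.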
[Figure labels: Proteins, DNA, transcription factor binding site, Gene]
IFN beta enhancer
Transcription Measurements
Some measurement techniques:
• Chromatin Immunoprecipitation
• Transcription output:
– Transfection
– Transgenics
– Genome Engineering
• Chromosome Conformation Capture
Transcription Activation Properties
Observed Properties:
• Most TF binding site basepair preferences
are independent of each other.
• TFs can synergize to turn gene activity on.
• Behavior can change in different conditions.
• TFs bind to hundreds and thousands of
different targets in a single condition.
• Enhancers complement each other in different tissues.
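The first property (independent basepair preferences) is what justifies scoring a candidate site by summing per-position terms. A minimal sketch, assuming a made-up 4-bp frequency matrix and a uniform background; neither comes from the lecture:

```python
import math

# Hypothetical position frequency matrix for a 4-bp site: one dict per position.
# The numbers are made up for illustration.
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency

def log_odds(site):
    """Sum per-position log2(p_motif / p_background).

    Adding the terms is only valid if positions are independent.
    """
    if len(site) != len(PFM):
        raise ValueError("site length must match the matrix width")
    return sum(math.log2(PFM[i][base] / BACKGROUND) for i, base in enumerate(site))

print(round(log_odds("AGCT"), 2))  # best-matching site scores highest
print(round(log_odds("TTTT"), 2))  # mostly mismatching site scores below zero
```

If positions were not independent, the score would need joint terms (for example over dinucleotides) instead of a plain sum.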
Gene Regulation is HOT
Gene regulation is currently one of the hottest topics in the
study of the human genome.
Large projects are investing heavily to generate large
descriptive datasets.
The challenge now is to glean logic from these piles.
Measured >100 TFs
in >70 cellular conditions.
How does TF binding determine its output: gene expression?
System output measurements
Measure coding and non-coding gene expression!
1. First generation mRNA (cDNA) and EST sequencing:
In UCSC Browser:
2. Gene Expression Microarrays (“chips”)
3. RNA-seq
“Next” (2nd) generation sequencing.
Gene Finding II: technology dependence
Challenge:
“Find the genes, the whole genes, and nothing but the genes”
We started out trying to predict genes directly from the genome.
When you measure gene expression, the challenge changes:
Now you want to build gene models from your observations.
These are both technology-dependent challenges.
The hybrid: what we measure is a tiny fraction of the space-time state
space for cells in our body. We want to generalize from measured
states and improve our predictions for the full compendium of states.
4. Spatio-temporal map generation
Gene Expression Causality
Measuring gene expression over time provides sets of
genes that change their expression in synchrony.
• But who regulates whom?
• Some of the necessary regulators may not change their
expression level when measured, and yet be essential.
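Finding genes that change "in synchrony" can be sketched as a pairwise correlation of expression time courses. The gene names and values below are hypothetical. Note that the correlation is symmetric, so it cannot say who regulates whom.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical expression levels at four time points.
EXPRESSION = {
    "geneA": [1, 2, 4, 8],
    "geneB": [2, 4, 8, 16],  # rises together with geneA
    "geneC": [8, 4, 2, 1],   # falls while geneA rises
}

for g1, g2 in [("geneA", "geneB"), ("geneA", "geneC")]:
    print(g1, g2, round(pearson(EXPRESSION[g1], EXPRESSION[g2]), 2))
```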
“Reading” enhancers can provide gene regulatory logic:
• If present(TF A, TF B, TF C) then turn on nearby gene X
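The rule above can be written down directly as a small sketch. The names TF_A, TF_B, TF_C and geneX stand in for the slide's placeholders, and a pure AND rule is an assumption; real enhancers may use other logic.

```python
# Hypothetical rule table: gene -> set of TFs its enhancer requires (AND logic).
ENHANCER_RULES = {
    "geneX": {"TF_A", "TF_B", "TF_C"},
}

def enhancer_active(present_tfs, required_tfs):
    """True if every required TF is among the TFs present in the cell."""
    return set(required_tfs) <= set(present_tfs)

def genes_turned_on(present_tfs, rules=ENHANCER_RULES):
    """List the genes whose enhancer rule is satisfied, in sorted order."""
    return sorted(gene for gene, required in rules.items()
                  if enhancer_active(present_tfs, required))

print(genes_turned_on({"TF_A", "TF_B", "TF_C", "TF_D"}))  # all three required TFs present
print(genes_turned_on({"TF_A", "TF_B"}))                  # TF_C missing
```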
Some Computational Challenges in Gene Regulation
Transcription factor binding site discovery
• Technology-dependent challenge in
constructing the correct binding site model
(e.g. motif) from the measurements.
• E.g., ChIP produces sequences of 100-200 bp.
Your motif of length 4-20 is in there somewhere.
• Find the most enriched model in the set of
sequences you obtained.
• Methods range between full enumeration,
heuristic/probabilistic searches, and hybrids.
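As a minimal sketch of the "full enumeration" end of that range: count, for every k-mer, how many bound sequences contain it, and compare against a background set. The sequences, the planted 7-mer, and the pseudocount-based score are illustrative assumptions, not the method of any particular tool.

```python
from collections import Counter

def kmer_support(sequences, k):
    """Count, for each k-mer, the number of sequences that contain it."""
    support = Counter()
    for seq in sequences:
        support.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return support

def most_enriched(foreground, background, k):
    """Return the k-mer with the highest foreground/background support ratio."""
    fg = kmer_support(foreground, k)
    bg = kmer_support(background, k)

    def score(kmer):
        # +1 pseudocounts avoid division by zero for k-mers absent from background
        fg_rate = (fg[kmer] + 1) / (len(foreground) + 1)
        bg_rate = (bg[kmer] + 1) / (len(background) + 1)
        return fg_rate / bg_rate

    # ties are broken alphabetically so the result is deterministic
    return max(fg, key=lambda kmer: (score(kmer), kmer))

# Hypothetical "ChIP" sequences, each with the 7-mer TGACTCA planted in it.
BOUND = ["ACGATGACTCAGGC", "CTGACTCACCGTAA", "GGCAGTGACTCATT", "AATTGACTCAACGG"]
# Hypothetical background sequences without the planted 7-mer.
UNBOUND = ["ACGTACGTACGTAC", "GGGCCCAAATTTGG", "CATCATCATCATCA", "TTAACCGGTTAACC"]

print(most_enriched(BOUND, UNBOUND, 7))
```

Full enumeration is exact but its cost grows with 4^k, and it misses degenerate motifs, which is where the heuristic and probabilistic searches come in.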
Transcription factor motif discovery: different technologies
SELEX = Systematic Evolution of Ligands by Exponential Enrichment
PBM = Protein Binding Microarrays
Transcription factor binding site prediction
Given the genome, and possibly some cell measurements,
predict all (and nothing but) the binding sites of a given
transcription factor, in one context or in all contexts.
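The simplest baseline for this task is a sequence-only scan of both strands for a known motif. The sketch below uses exact matching to a single consensus string, which is a deliberate simplification; the genome string and the motif are hypothetical, and real predictors use weighted models plus context such as chromatin accessibility.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(genome, motif):
    """Return (start, strand) for every exact motif match on either strand."""
    rc = reverse_complement(motif)
    k = len(motif)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            # a palindromic motif equals its reverse complement and is
            # reported once, as a "+" hit, by the branch above
            hits.append((i, "-"))
    return hits

GENOME = "AATGACTCATTTGAGTCAGG"  # hypothetical 20-bp "genome"
print(find_sites(GENOME, "TGACTCA"))
```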
Enhancer Prediction
How do TFs “sum” together to
provide the activity of an enhancer?
A network of genes?
Enhancer Prediction
Given a sequence of DNA predict:
• Is it an enhancer? I.e., can it drive gene expression?
• If so, in which cells? At which times?
• Driven by which transcription factor binding sites?
Given a set of different enhancers driving expression in the
same population of cells:
• Do they share any logic? If so what is it?
• Can you generalize this logic to find new enhancers?
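One very simple reading of "shared logic" is the set of TFs that have a binding site in every enhancer of the group. The sketch below assumes each enhancer has already been annotated with its TFs; the enhancer and TF names are hypothetical, and real shared logic may also involve site order, spacing, and orientation.

```python
def shared_tfs(enhancer_sites):
    """Return the TFs that have at least one site in every enhancer."""
    tf_sets = [set(tfs) for tfs in enhancer_sites.values()]
    return set.intersection(*tf_sets) if tf_sets else set()

def matches_logic(candidate_tfs, logic):
    """True if a candidate region contains sites for all TFs in the shared logic."""
    return set(logic) <= set(candidate_tfs)

# Hypothetical annotation: enhancer -> TFs with a predicted site in it.
ENHANCERS = {
    "enh1": ["TF_A", "TF_B", "TF_C"],
    "enh2": ["TF_A", "TF_B", "TF_D"],
    "enh3": ["TF_B", "TF_A"],
}

logic = shared_tfs(ENHANCERS)
print(sorted(logic))
print(matches_logic(["TF_A", "TF_B", "TF_E"], logic))
print(matches_logic(["TF_A", "TF_C"], logic))
```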
Biology is empirical: you predict, and you measure!
Measuring is great. It allows you to check your
assumptions and improve your models until you get it.
Some difficulties associated with gene regulation:
• Single cell measurements are rare. You most often
measure some “average” over a population of cells.
• The population of cells is seldom in sync (same state).
• The closer a population of cells is to its in vivo state the
less homogeneous it is.
• The closer a population of cells is to its in vivo state the
harder (time, effort, money) it is to measure it.
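A small numeric sketch of why a population "average" can mislead: two hypothetical populations with the same mean expression but very different single-cell states. The numbers are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical per-cell expression of one gene in two populations.
MIXED = [10, 0, 10, 0]   # half the cells fully ON, half OFF
UNIFORM = [5, 5, 5, 5]   # every cell at an intermediate level

for name, cells in [("mixed", MIXED), ("uniform", UNIFORM)]:
    # a bulk assay reports only the mean; the spread is invisible to it
    print(name, "mean =", mean(cells), "spread =", pstdev(cells))
```

Both populations give the same bulk reading, so only a single-cell measurement can tell them apart.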
Biology is empirical: you predict, and you measure!
Some more difficulties associated with gene regulation:
• A family of TFs often has very similar binding motifs
• Expression pattern may be different (but unknown to you).
• Family members may have different protein-protein
interaction (PPI) domains which are also important.
• The genome is pleiotropic (= good for all contexts).
• If an enhancer you are studying in fact serves multiple
contexts, their signals will be overlaid on each other in the
sequence, making prediction (and disentanglement) harder.
Transcription factors: large “fan-outs” revisited
TFs reproducibly bind to thousands of genomic locations
almost anywhere we’ve looked.
Gene regulation forms a dense network.
However, when such a TF is perturbed (overexpressed or
silenced) only a fraction of the genes it binds next to
change their expression levels.
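The comparison described here amounts to intersecting two gene sets: genes the TF binds next to, and genes whose expression changes after the perturbation. A minimal sketch with hypothetical gene identifiers:

```python
def responsive_fraction(bound_genes, changed_genes):
    """Fraction of bound genes whose expression changed after perturbing the TF."""
    bound = set(bound_genes)
    if not bound:
        return 0.0
    return len(bound & set(changed_genes)) / len(bound)

# Hypothetical data: genes next to binding sites, and genes that changed
# expression when the TF was overexpressed or silenced.
BOUND = ["g1", "g2", "g3", "g4", "g5"]
CHANGED = ["g2", "g5", "g9"]  # g9 changed without a nearby site (indirect effect)

print(responsive_fraction(BOUND, CHANGED))
```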
Genomics vs. Genetics
Last but not least – genomics is descriptive.
It can show you “everything”.
E.g., all the locations where a given transcription factor is bound to
the genome (reproducibly) in a given cell state.
Which of these bindings actually matters?
[Figure labels: frequency; effect on cell; no or near no effect; adverse effect on cell; adverse effect observable in experiments]

Function                                                                  Assay
Binds reproducibly                                                        Relatively easy
Changes expression of nearby genes                                        Hard
Affects cell/organism function/fitness                                    Very hard
Affects cell/organism, but not where/when I looked for it (pleiotropy)    Harder still
Transcription factors “rule”
Cellular reprogramming
is done by adding to
the cell large quantities
of a small number of
the “right” TFs.
These somehow “reset”
cell state.
We have learned (in a dish) to:
1. control differentiation
2. reverse differentiation
3. hop between different states
Transcription Regulation
is not just about activation
Transcriptional Repression
An equally important but less visible part of
transcription (tx) regulation is transcriptional
repression (that lowers/ablates tx output).
• Transcription factors can bind key genomic
sites, preventing/repelling the binding of
– The RNA polymerase machinery
– Activating transcription factors
(including via competitive binding)
• Some transcription factors have stereotypical
roles as activators or repressors. Likely many
can do both (in different contexts).
• DNA can be bent into 3D shape preventing
enhancer – promoter interactions.
• Activator and co-activator proteins can be
modified into inactive states.
Note: “repressor” can thus refer to specific
DNA sequences or to proteins.
Transcriptional Output Prediction
All these can increase or decrease tx output:
• Adding/repressing different proteins
• Modifying DNA bases
• Adding genomic context
• Changing cellular context
Repression logic is harder to tease out.
(need positive controls)
Transcription can only happen in open Chromatin
Chromatin / Proteins
Genome packaging
in fact provides a
critical layer of gene
regulation.
DNA / Proteins
Gene Activation / Repression via Chromatin Remodeling
A dedicated machinery opens and closes chromatin.
Interactions with this machinery turn genes and/or gene
regulatory regions like enhancers and repressors on or off
(by making the genomic DNA accessible or inaccessible).
Insulators
Insulators are DNA sequences that, when
placed between a target gene and an enhancer,
prevent the enhancer from acting on the gene.
• Known insulators contain binding sites for a
specific DNA-binding protein (CTCF) that is
involved in DNA 3D conformation.
• However, CTCF fulfills additional roles
besides insulation. I.e., the presence of a
CTCF site does not ensure that a genomic
region acts as an insulator.
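The positional definition of an insulator can be sketched on a one-dimensional genome coordinate. All coordinates below are hypothetical, and the linear model ignores the 3D conformation effects mentioned above.

```python
def enhancer_can_act(enhancer_pos, gene_pos, insulator_positions):
    """True if no insulator lies strictly between the enhancer and the gene."""
    lo, hi = sorted((enhancer_pos, gene_pos))
    return not any(lo < pos < hi for pos in insulator_positions)

# Hypothetical layout: TSS1 ... enhancer ... insulator ... TSS2
TSS1, ENHANCER, INSULATOR, TSS2 = 1000, 5000, 7000, 9000

print(enhancer_can_act(ENHANCER, TSS1, [INSULATOR]))  # nothing in between
print(enhancer_can_act(ENHANCER, TSS2, [INSULATOR]))  # insulator in between
```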
[Figure labels: TSS1, TSS2, Insulator]
Cis-Regulatory Components
Low level (“atoms”):
• Promoter motifs (TATA box, etc.)
• Transcription factor binding sites (TFBS)
Mid Level:
• Promoter
• Enhancers
• Repressors/silencers
• Insulators/boundary elements
• Cis-regulatory modules (CRM)
• Locus control regions (LCR)
High Level:
• Epigenetic domains / signatures
• Gene expression domains
• Gene regulatory networks (GRN)
http://cs173.stanford.edu [BejeranoWinter12/13]
35
Signal Transduction
Everything we discussed so far happens within the cell.
But cells talk to each other, copiously.
Gene Regulation II
Chromatin / Proteins
To be continued…
Extracellular signals
DNA / Proteins
(On Mondays) ask students to stack
the chairs without wheels at the back
of the room at the end of class.