Computational Biology I
LSM5191
Aylwin Ng, D.Phil
Lecture 2 Notes:
Molecular Biology of Gene Expression.
Flow of information: DNA to polypeptide
[Figure: a gene (Start, Exon 1, Intron, Exon 2, Termination) is transcribed; the transcript receives a 5’ cap (m7Gppp), is cleaved and given a poly(A) tail of about 200 residues at its 3’ end, is spliced, and is transported to the cytoplasm, where it is translated into a polypeptide.]
TRANSCRIPTION
TRANSCRIPTION – An Overview
BACTERIAL GENE EXPRESSION
• In prokaryotes, genes encoding proteins involved in related functions often are
located next to each other in bacterial chromosomes.
• This cluster of genes comprises a single transcription unit called an OPERON.
• i.e., a single mRNA molecule contains the full set of genes of the operon.
• Hence, prokaryotic mRNA encodes several polypeptides and is therefore polycistronic (a cistron is defined as a genetic unit that encodes a single
polypeptide).
• Polycistronic mRNA contains multiple ribosome-binding sites, one near the start site of
each protein-coding region in the mRNA.
BACTERIAL GENE EXPRESSION (cont.d)
• By convention, the transcription-initiation site in the DNA sequence
is designated +1. Base pairs extending in the direction of
transcription (downstream) are assigned positive numbers, while
those extending in the opposite direction (upstream) are assigned
negative numbers.
• Various proteins (RNA polymerase, activators, repressors) interact
with DNA at or near the promoter to regulate transcription initiation.
E. coli PROMOTER SITES
• 2 regions (-10 and –35 regions) in most E. coli promoters are critical
for binding RNA polymerase holoenzyme (β’,β,α,α,σ70) via its σ70
subunit (or initiation factor).
• After holoenzyme transcribes approx 10 bp, σ70 subunit is released.
• The core RNA polymerase (β’,β,α,α) continues transcribing (chain
elongation).
• ‘Strong’ promoters = promoters at which RNA pol initiates
transcription at high frequency (dependent on enzyme’s affinity for
promoter).
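The −10 and −35 regions can be searched for computationally. The sketch below is illustrative only: it assumes the textbook σ70 consensus hexamers TTGACA (−35) and TATAAT (−10) and a spacer of 16 to 18 bp between them, none of which are stated in these notes. The function names are hypothetical.

```python
# Assumed textbook consensus hexamers for sigma70 promoters (not given in these notes).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(dna, max_mismatch=1, spacers=(16, 17, 18)):
    """Return (index of -35 box, index of -10 box, spacer length) for each candidate.

    Indices are 0-based positions on the strand given 5'->3'.
    A box is accepted if it differs from the consensus at no more than
    max_mismatch positions.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 5):
        if mismatches(dna[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            box = dna[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j, spacer))
    return hits


if __name__ == "__main__":
    # Synthetic test sequence, invented for illustration.
    seq = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGGGG"
    print(find_promoters(seq))  # [(2, 25, 17)]
```

Allowing more mismatches finds weaker candidate promoters, in line with the idea that promoter strength depends on how well the polymerase binds.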
Identification of PROMOTER SITES
• DNase I footprinting assays identify protein-DNA interactions.
• DNase I randomly hydrolyses phosphodiester bonds.
• Low concentration of DNase I used → on average each DNA molecule is cleaved
just once.
EUKARYOTIC TRANSCRIPTION
• The basic principles that control transcription in bacteria also apply to
eukaryotic organisms.
• Transcription is initiated at a specific base pair and is controlled by the
binding of trans-acting proteins (transcription factors) to cis-acting
regulatory DNA sequences.
• However, eukaryotic cis-acting elements are often much further from the
promoter they regulate, and transcription from a single promoter may be
regulated by binding of multiple transcription factors to alternative control
elements.
• Transcription control sequences can be identified by analysis of a 5′ deletion series.
• Unlike prokaryotes, eukaryotes have 3 (instead of just one) RNA
polymerases (all large multi-subunit enzymes).
• Eukaryotic mRNAs are generally monocistronic. Prokaryotic mRNAs
are generally polycistronic, i.e. several polypeptides are translated from
the same mRNA.
EUKARYOTIC RNA POLYMERASES
RNA Polymerase I:
• Transcribes gene encoding ribosomal RNA (45S precursor yielding
28S, 18S, 5.8S rRNAs)
RNA Polymerase II:
• Transcribes all protein-coding genes,
• Transcribes genes encoding small nuclear RNAs U1, U2, U3 etc.
RNA Polymerase III:
• Transcribes genes encoding transfer RNA,
• Transcribes gene encoding 5S rRNA,
• Transcribes gene encoding snRNA U6.
RNA POLYMERASE I
• Essential protein factors (rather than the polymerase) recognise DNA sequences
around transcription start site.
• Key sequences recognised by these factors are located within 50 bases upstream
of start site.
• SL1 factor recruits RNA polymerase I.
• UBF = Upstream Binding Factor
[Figure: stepwise assembly on the promoter (−50 to +50): UBF binds first, then SL1, then RNA pol I; transcription starts at +1.]
RNA POLYMERASE II
Brief Notes:
• TFIIB recruits RNA pol II.
• TFIIH phosphorylates C-term domain
of RNA pol II.
• Phosphorylated form is able to initiate
transcription.
[Figure: stepwise assembly at the TATA box (−50 to +1): TFIIA and TFIID bind first, then TFIIB, then RNA pol II with TFIIF, then TFIIE and TFIIH; RNA pol II then leaves the promoter and transcribes RNA.]
POST-TRANSCRIPTIONAL EVENTS in EUKARYOTES
• Capping
• Polyadenylation
• RNA splicing
• RNA transport
• Translation
CAPPING
• Capping only occurs in Eukaryotes!
• 5’ end of nascent mRNA is modified.
• Addition of a methylated guanylate residue (NOT
encoded by DNA).
• Rxn catalysed by guanylyl transferase.
• 3 phosphate groups separate the G residue
from the first nucleotide in the chain (whereas only
1 phosphate separates the other nucleotides).
• Guanylate is joined via a 5’-5’ linkage rather than
the std. 3’-5’ linkage which links nucleotides in a
growing chain.
• Cap protects RNA from degradation by 5’→3’
exonuclease activity.
POLYADENYLATION
• Cleavage at 3’ end of mRNA
• Addition of poly(A) tail at 3’end of cleaved mRNA
• CPSF = Cleavage & Polyadenylation Specificity Factor
• CstF = Cleavage stimulation Factor
[Figure: CPSF binds the AAUAAA signal upstream of the poly(A) site and CstF binds the G/U-rich element downstream of it. Endonucleolytic cleavage at the poly(A) site is followed by addition of the (A)200 tail by poly(A) polymerase; the downstream cleavage fragment is degraded.]
Role of polyadenylation
• To protect mRNA from degradation by exonucleases.
• Exonucleases ‘attack’ a free 3’ end and rapidly degrade the mRNA.
• Appears to increase the efficiency with which an mRNA is
translated.
• Not all mRNAs (encoding proteins) are polyadenylated,
e.g. mRNAs encoding histones.
• Instead, the 3’ end of such an mRNA folds into a double-stranded stem-loop structure which
protects it from degradation.
EXONS & INTRONS
• Protein-coding regions of a gene are known as EXONS.
• Intervening regions that do not encode parts of protein are known as
INTRONS.
• Introns are transcribed into the primary RNA transcript, but remain in the nucleus.
• Hence, primary RNA transcript must have its introns removed before
being transported into the cytoplasm and translated.
• RNA SPLICING is the process whereby introns are removed.
RNA SPLICING – what’s the mechanism?
• Clue: Short, conserved sequences at splice junctions.
RNA SPLICING
[Figure: the intron begins with GU at the 5’ splice site, ends with AG at the 3’ splice site, and contains a branch-point A. Step 1: cleavage at the 5’ splice site and lariat formation, in which the 5’ G of the intron is joined to the 2’ position of the branch-point A. Step 2: cleavage at the 3’ splice site, joining Exon 1 to Exon 2 and releasing the intron as a lariat.]
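The GU...AG rule at splice junctions can be checked in a few lines. This is a minimal sketch, assuming the intron coordinates are already known; it only verifies the conserved dinucleotides and joins the exons, and does not model the branch point or the spliceosome. The function name `splice` and the example sequences are invented for illustration.

```python
def splice(pre_mrna, introns):
    """Remove introns from a primary transcript.

    introns is a list of (start, end) pairs, 0-based and half-open.
    Each intron must begin with GU and end with AG, otherwise a
    ValueError is raised.
    """
    rna = pre_mrna.upper().replace("T", "U")
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = rna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} does not follow the GU...AG rule")
        exons.append(rna[position:start])  # exon sequence before this intron
        position = end
    exons.append(rna[position:])  # final exon
    return "".join(exons)


if __name__ == "__main__":
    exon1 = "AUGGCC"
    intron = "GUAAGUCCUUACUAACAG"  # invented intron: GU ... branch-point A ... AG
    exon2 = "UUUUAA"
    pre = exon1 + intron + exon2
    print(splice(pre, [(len(exon1), len(exon1) + len(intron))]))  # AUGGCCUUUUAA
```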
In vitro analysis (Ruskin et al., 1984)
Nuclear extract (from cells) incubated with radio-labelled RNA.
[Figure labels: starting RNA, final spliced product, excised intron.]
Small nuclear RNAs (snRNAs)
• These are splicing ‘factors’, i.e. assist in the splicing process.
• NOTE: they are NOT proteins, but RNA molecules !!!
• snRNAs associate with proteins to form small nuclear ribonucleoprotein particles (snRNPs),
which assemble into a large ribonucleoprotein complex called a Spliceosome.
SPLICEOSOME assembly
ALTERNATIVE SPLICING
• A mechanism for tissue-specific expression
• E.g. Hepatocytes generate fibronectin proteins that are different
from those produced by Fibroblasts.
RNA TRANSPORT
• Spliced mRNA must be transported out from the nucleus (across the
nuclear membrane) into the cytoplasm for translation into protein.
• Heterogeneous nuclear RNPs (hnRNPs) are likely to mediate this
transport by associating with mRNA in the nucleus.
• In yeast, the Gle1 protein mediates this transport.
• The Gle1 protein contains a short nuclear export signal (NES)
sequence.
• NES sequence is also present in HIV’s Rev protein, which is
involved in regulating the nuclear-cytoplasmic transport of different
HIV mRNAs.
TRANSLATION
RIBOSOMES
• Translation takes place on defined cytoplasmic organelles called
RIBOSOMES.
ROLES OF RNA IN TRANSLATION
Three types of RNA molecules perform different but complementary roles in
protein synthesis (translation):
• Messenger RNA (mRNA) carries information copied from DNA in the form
of a series of three-base “words” termed codons.
• Transfer RNA (tRNA) deciphers the code and delivers the specified amino
acid.
• Ribosomal RNA (rRNA) associates with a set of proteins to form
ribosomes, structures that function as protein-synthesizing machines.
TRANSFER RNA (tRNA)
• tRNA forms the vital link between mRNA & the growing polypeptide chain.
• 50 different tRNAs in eukaryotes.
• But only 20 amino acids are designated by the genetic code.
→ different tRNAs (isoacceptors) are specific for the same amino acid; ‘wobble’ base-pairing lets one tRNA read more than one codon.
• Nomenclature: e.g. tRNAGly1 and tRNAGly2 are both specific for glycine.
• Amino acid is attached at 3’-end of tRNA.
• All mature tRNAs end with –CCA (CCA added by tRNA nucleotidyl-transferase).
• Anti-codon base-pairs with CODON on mRNA during translation.
What is a Codon?
• A unit of 3 nucleotides.
• An amino acid is encoded by a Codon (except for stop codons).
TRANSFER RNA (tRNA): structure
[Figure: cloverleaf structure, showing the D loop (dihydrouridine) and the TΨC loop (thymidine-pseudouridine-cytidine), alongside the 3-D structure.]
Aminoacylation (‘charging’) of tRNA
• Attachment of amino acid (a.a.) to tRNA.
• Enzymes req.d: aminoacyl-tRNA synthetases.
• Each tRNA is recognised by a specific aminoacyl-tRNA synthetase.
• Aminoacylation occurs in 2 steps.
Step 1:
• Formation of activated a.a. intermediate;
• a.a. linked to AMP via high-energy bond (aminoacyl-AMP).
Step 2:
• a.a. transferred to 3’-end of tRNA.
Overall rxn (enzyme-catalysed):
a.a. + ATP + tRNA → aminoacyl-tRNA + AMP + 2Pi
Codon & tRNA anticodon recognition
• Specificity of aminoacylation → ensures tRNA carries the right a.a. denoted by
the codon the tRNA pairs with.
‘Wobble’ base-pairing occurs at the third position of the codon.
‘Wobble’ results in non-standard base pairing:
• G-U pairing acceptable.
• Inosine (I), a deaminated form of adenosine, can pair with A, C and U.
‘Wobble’ base-pairing
• G-U base-pairing: enables the 4 codons for alanine to be decoded by just
2 tRNAs.
• Inosine base-pairs with A, C and U: enables the 3 codons for isoleucine to be decoded by
just one tRNA.
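The wobble rules can be written out as a small lookup. The sketch below assumes the standard wobble pairings for the 5’ base of the anticodon (G reads C or U, U reads A or G, I reads A, C or U, while A and C pair only in the standard way); the notes list only the G-U and inosine cases, so the others are added from the usual textbook rules. The function name is hypothetical.

```python
# Codon third-position bases that each anticodon 5' (wobble) base can read.
WOBBLE = {"A": "U", "C": "G", "G": "CU", "U": "AG", "I": "ACU"}
# Standard pairing used at the other two positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_for_anticodon(anticodon):
    """Return the codons (5'->3') that an anticodon (5'->3') can read.

    Pairing is antiparallel: anticodon position 3 pairs with codon
    position 1, and anticodon position 1 (the wobble base) pairs with
    codon position 3.
    """
    wobble_base, middle, last = anticodon.upper()
    first = WATSON_CRICK[last]
    second = WATSON_CRICK[middle]
    return [first + second + third for third in WOBBLE[wobble_base]]


if __name__ == "__main__":
    print(codons_for_anticodon("GGC"))  # ['GCC', 'GCU']  alanine
    print(codons_for_anticodon("UGC"))  # ['GCA', 'GCG']  alanine
    print(codons_for_anticodon("IAU"))  # ['AUA', 'AUC', 'AUU']  isoleucine
```

The first two calls show how two tRNAs can cover the four alanine codons, and the third shows how one inosine-containing tRNA can cover the three isoleucine codons.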
THE GENETIC CODE
TRANSLATION INITIATION
In prokaryotes including bacteria,
• Translation is initiated when the small ribosome subunit + initiation
factor (IF3) bind to the Shine-Dalgarno seq. (5’-AGGAGGU-3’).
• This seq. is 3-10 nucleotides upstream of the initiation codon (start
site).
• Initiator tRNA is ‘charged’ with N-formylmethionine or methionine.
In eukaryotes,
• Ribosome binds to the 5’ end of mRNA by recognizing the
methylated cap.
• Ribosome moves along mRNA until it encounters AUG within Kozak
seq (5’-ACCAUGG-3’) → initiation of translation.
• Initiator tRNA is ‘charged’ with methionine.
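The two initiation signals above can be searched for directly. This is a minimal sketch that assumes exact matches to the sequences quoted in the notes (real sites tolerate mismatches) and interprets "3-10 nucleotides upstream" as the gap between the end of the Shine-Dalgarno sequence and the AUG. Function names and test sequences are invented for illustration.

```python
SHINE_DALGARNO = "AGGAGGU"
KOZAK = "ACCAUGG"


def find_sd_starts(mrna, min_gap=3, max_gap=10):
    """Return (SD index, AUG index) pairs for AUGs preceded by a Shine-Dalgarno site.

    The gap is the number of nucleotides between the last base of the
    Shine-Dalgarno sequence and the A of the AUG.
    """
    rna = mrna.upper().replace("T", "U")
    starts = []
    for i in range(len(rna) - 2):
        if rna[i:i + 3] != "AUG":
            continue
        for gap in range(min_gap, max_gap + 1):
            s = i - gap - len(SHINE_DALGARNO)
            if s >= 0 and rna[s:s + len(SHINE_DALGARNO)] == SHINE_DALGARNO:
                starts.append((s, i))
                break
    return starts


def first_kozak_aug(mrna):
    """Return the index of the first AUG found in an exact Kozak context, or -1."""
    rna = mrna.upper().replace("T", "U")
    hit = rna.find(KOZAK)
    return hit + 3 if hit != -1 else -1


if __name__ == "__main__":
    bacterial = "CC" + "AGGAGGU" + "AACAA" + "AUGGCU"
    print(find_sd_starts(bacterial))        # [(2, 14)]
    print(first_kozak_aug("GGACCAUGGCC"))   # 5
```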
TRANSLATION INITIATION (Eukaryotes)
[Figure: the eIF4F complex (eIF4E, eIF4G, eIF4A) binds the 5’ cap of the mRNA; the 40S subunit, carrying eIF2 and initiator tRNA, joins and moves along the mRNA to the AUG; the 60S subunit then joins and translation begins.]
ELONGATION OF TRANSLATION
• Mechanism very similar in bacteria and eukaryotes.
• Peptide bond formation catalysed by peptidyl transferase.
[Figure label: eEF-1]
EXERCISES
Exercise 1a:
5’- GTAGCCTACCCATAGG -3’
If mRNA is transcribed from this DNA using the complementary
strand as a template, what will be the seq. of the mRNA?
5’ – GUAGCCUACCCAUAGG - 3’
What peptide will be made if translation started exactly at the 5’ end of
this mRNA? (assume no start codon is req.d).
Valine(V) – Alanine(A) – Tyrosine(Y) – Proline(P)
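Exercise 1a can be checked with a short script. This is a sketch under the exercise's own assumptions (the given strand is the coding strand, and translation starts at the first base with no start codon required). The codon table is the standard genetic code, built from the conventional UCAG ordering; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, listed in UCAG order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """mRNA has the same sequence as the coding (non-template) strand, with U for T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna, frame=0):
    """Translate from the given frame offset until a stop codon or the end."""
    peptide = []
    for i in range(frame, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    mrna = transcribe("GTAGCCTACCCATAGG")
    print(mrna)             # GUAGCCUACCCAUAGG
    print(translate(mrna))  # VAYP
```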
Exercise 1b:
5’ – GUAGCCUACCCAUAGG - 3’
Potentially, how many different peptides are encoded in this mRNA?
3 different peptides, since there are 3 different reading frames.
5’ – GUAGCCUACCCAUAGG - 3’
Frame 1: V A Y P *
Frame 2: * P T H R
Frame 3: S L P I
(* = stop codon)
Six peptides … if the stretch of DNA (in Exercise 1a) is also
transcribed in the opposite direction, i.e. both strands serving as
templates for transcription.
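The three forward frames, and the three extra frames obtained when the opposite strand is also transcribed, can be enumerated as follows. This is a sketch using the standard genetic code; stop codons are shown as `*` rather than ending translation, so each frame is read to the end. Function names are illustrative.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def translate_frame(mrna, frame):
    """Read every complete codon in one frame; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE[mrna[i:i + 3]] for i in range(frame, len(mrna) - 2, 3)
    )


def reverse_complement(mrna):
    """RNA that would be made if the opposite DNA strand were the coding strand."""
    return "".join(PAIRS[base] for base in reversed(mrna))


def six_frames(mrna):
    """Return a dict mapping (strand, frame number) to the translated frame."""
    frames = {}
    for strand, rna in (("+", mrna), ("-", reverse_complement(mrna))):
        for frame in range(3):
            frames[(strand, frame + 1)] = translate_frame(rna, frame)
    return frames


if __name__ == "__main__":
    for key, peptide in six_frames("GUAGCCUACCCAUAGG").items():
        print(key, peptide)
    # ('+', 1) VAYP*
    # ('+', 2) *PTHR
    # ('+', 3) SLPI
    # ('-', 1) PMGRL
    # ('-', 2) LWVGY
    # ('-', 3) YG*A
```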
Exercise 2:
If the anti-codon of a tRNA has this sequence:
5’- G C U –3’
Which could be the likely corresponding codon sequence on the mRNA?
(1) 5’- C G A –3’
(2) 5’- A G C –3’
(3) 5’- C G T –3’
(4) 5’- A G U –3’
Answer: (2) and (4)
Which amino acid is the tRNA likely to be specific for?
Serine
Locating genes by scanning Open Reading Frame (ORF)
Human Interleukin-2 (IL-2) gene
- promoter, exon 1 and partial cds (Accession Number: AF031845)
  1 tatgacaaag aaaattttct gagttacttt tgtatcccca cccccttaaa gaaaggagga
 61 aaaactgttt catacagaag gcgttaattg catgaattag agctatcacc taagtgtggg
121 ctaatgtaac aaagagggat ttcacctaca tccattcagt cagtctttgg gggtttaaag
181 aaattccaaa gagtcatcag aagaggaaaa atgaaggtaa tgttttttca gactggtaaa
241 gtctttgaaa atatgtgtaa tatgtaaaac attttgacac ccccataata tttttccaga
301 attaacagta taaattgcat ctcttgttca agagttccct atcactcttt aatcactact
361 cacagtaacc tcaactcctg ccacaatgta caggatgcaa ctcctgtctt gcattgcact
421 aagtcttgca cttgtcacaa acagtgcacc tacttcaagt tctacaaaga aaacacagct
481 acaactggag catttactgc
What is the sequence of amino acids encoded by this piece of DNA?
But first, we need to know where the translational start site is.
There are eight possible initiation codons – which is the one?
Important info: also need to know where the transcription start site is.
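Scanning for candidate start codons can be sketched as below. The example uses only the last 140 nucleotides of the listing (positions 361 to 500, as the rows appear in these notes), lists every ATG in that window and translates from each one until a stop codon or the end of the sequence. It does not decide which ATG is the real start site; that needs the extra information mentioned above. Function names are illustrative.

```python
from itertools import product

# Standard genetic code in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Positions 361-500 of the IL-2 listing, as printed in these notes.
REGION_OFFSET = 361
REGION = (
    "cacagtaacc" "tcaactcctg" "ccacaatgta" "caggatgcaa" "ctcctgtctt" "gcattgcact"
    "aagtcttgca" "cttgtcacaa" "acagtgcacc" "tacttcaagt" "tctacaaaga" "aaacacagct"
    "acaactggag" "catttactgc"
).upper()


def find_start_codons(dna):
    """Return the 0-based index of every ATG on the given strand."""
    return [i for i in range(len(dna) - 2) if dna[i:i + 3] == "ATG"]


def translate_from(dna, start):
    """Translate from a start index until a stop codon or the end of the sequence."""
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    for start in find_start_codons(REGION):
        print(REGION_OFFSET + start, translate_from(REGION, start))
```

In this window the two ATGs are in the same reading frame, and the open reading frame from the first one runs uninterrupted to the end of the partial coding sequence.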
Human Interleukin-2 (IL-2) gene, exon 1
In Eukaryotes, scanning Open Reading Frame (ORF) is complicated by Introns
Effect of Point mutations
Effect of Deletion mutations
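As a rough illustration of these two headings, the sketch below applies single-base substitutions and short deletions to the mRNA from Exercise 1a and translates the result from its first base. The choice of sequence and of mutated positions is purely illustrative and is not taken from the original slides.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first base until a stop codon or the end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def substitute(mrna, position, base):
    """Point mutation: replace the base at a 0-based position."""
    return mrna[:position] + base + mrna[position + 1:]


def delete(mrna, position, length=1):
    """Deletion: remove `length` bases starting at a 0-based position."""
    return mrna[:position] + mrna[position + length:]


if __name__ == "__main__":
    mrna = "GUAGCCUACCCAUAGG"
    print(translate(mrna))                      # VAYP   (original)
    print(translate(substitute(mrna, 5, "U")))  # VAYP   (silent: GCC -> GCU)
    print(translate(substitute(mrna, 4, "A")))  # VDYP   (missense: GCC -> GAC)
    print(translate(substitute(mrna, 8, "G")))  # VA     (nonsense: UAC -> UAG)
    print(translate(delete(mrna, 3)))           # VPTHR  (1-base deletion: frameshift)
    print(translate(delete(mrna, 3, 3)))        # VYP    (3-base deletion: frame kept)
```

A point mutation changes at most one codon, whereas deleting a number of bases that is not a multiple of three shifts the reading frame for everything downstream.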