Human Genetics
Translation of RNA into Protein
Central Dogma
DNA (replication) -> transcription -> RNA -> translation -> protein
Transcription takes place in the nucleus; translation takes place in the cytoplasm.
Human Genome
3.2 billion DNA base pairs
1.5% encodes proteins; 98.5% does not encode proteins
~ 31,000 genes encoding 100,000 - 200,000 proteins
How are 100,000 to 200,000 proteins produced from
31,000 genes?
What is the 98.5% of the human genome that does not
encode proteins?
Noncoding portion of the human genome
Type of sequence: Function or characteristic
Noncoding RNAs: Translation (tRNA, rRNA); RNA processing
Pseudogenes
Introns: Removed with RNA processing
Promoters and other regulatory regions: Determine when and where transcription occurs
Repeats:
- Transposons: DNA that moves around genome
- Telomeres: Chromosome tips
- Centromeres: Important for attachment to spindle
- Duplications: Unknown
- Simple short repeats: Unknown
Two types of nucleic acids
RNA compared with DNA:
- # of strands: RNA is usually single-stranded; DNA is usually double-stranded
- bases used: RNA has uracil as a base; DNA has thymine as a base
- kind of sugar: RNA has ribose as the sugar; DNA has deoxyribose as the sugar
- information: RNA carries protein-encoding information; DNA carries RNA-encoding information
- catalysis: RNA can be catalytic; DNA is not catalytic
RNA Structure Depends on
Sequence
A can pair with U, and C with G, via
hydrogen bonding, much as A pairs with T and C with G in DNA.
Secondary RNA structure is critical in
how it performs its function.
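The pairing rule can be sketched in a few lines of code. This is a minimal illustration, not part of the lecture: the function name `can_pair` and the example sequences are made up, and only the A-U and C-G pairs named above are allowed (G-U wobble pairs are ignored).

```python
# Allowed base pairs, limited to the A-U and C-G pairs named in the slide.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}

def can_pair(segment_a, segment_b):
    """Return True if two RNA segments (both written 5'->3') can pair.

    Paired strands run antiparallel, so the first base of segment_a
    is checked against the last base of segment_b, and so on.
    """
    if len(segment_a) != len(segment_b):
        return False
    return all((x, y) in PAIRS for x, y in zip(segment_a, reversed(segment_b)))

print(can_pair("GGAC", "GUCC"))  # the two halves of a hypothetical hairpin stem
print(can_pair("GGAC", "GGAC"))  # identical segments do not pair with each other
```

A single RNA strand folds back on itself wherever two of its segments satisfy this test, which is one way sequence shapes secondary structure.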
RNA Structure and RNA Sequence
enable an RNA to interact specifically
with proteins.
RNA Processing
mRNA transcripts are modified before use as a
template for translation:
- Addition of a capping nucleotide at the 5’ end
- Addition of a poly(A) tail to the 3’ end
These are important for moving the transcript out of the nucleus
and for regulating when translation occurs.
- Splicing: the removal of internal sequences
- introns are sequences removed
- exons are sequences remaining
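Splicing can be pictured as dropping the intron segments and joining the exons in order. The sketch below assumes a made-up pre-mRNA whose exon and intron sequences are invented for illustration; the cap and poly(A) tail are left out.

```python
# Hypothetical pre-mRNA, listed as labelled segments in 5'->3' order.
SEGMENTS = [
    ("exon", "AUGGGA"),
    ("intron", "GUAAGUUUCAG"),
    ("exon", "UGUAAG"),
    ("intron", "GUACUUAG"),
    ("exon", "CGAUAA"),
]

def splice(segments):
    """Remove introns and join the remaining exons in their original order."""
    return "".join(seq for kind, seq in segments if kind == "exon")

pre_mrna = "".join(seq for _, seq in SEGMENTS)
mature_mrna = splice(SEGMENTS)

print("pre-mRNA:   ", pre_mrna)
print("mature mRNA:", mature_mrna)
```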
Protein Structure was solved
before DNA was known to be
genetic material
Linus Pauling's work on the alpha helix led
to model building by Watson and
Crick
Proteins
most abundant type of molecules in cells
responsible for most biological functions
muscle contraction - myosin and actin
oxygen transport - hemoglobin
immune system - antibodies
connective tissue - cartilage
hair/skin - keratin
metabolism - enzymes
Gene Expression changes in
Proteins during Development
Protein Basics
Proteins are polymers assembled from amino
acids
20 different amino acids are used
Bond between amino acids is called the "Peptide
Bond".
Peptide Bond is formed between the carboxyl
group of one amino acid and the Alpha amino
group of another amino acid.
mRNAs have a 5' end and a 3' end - they have
Polarity.
Proteins also have polarity.
Protein Folding is Critical
How is protein folding directed within
cells?
This is still an active area of research, but
to a large degree, protein sequence
determines protein folding.
Misfolding of Protein Impairs
Function
Protein Polarity
The Amino acid at one end of a protein chain
has a free Alpha amino group.
Called "Amino-Terminus" or "N-terminus" of the
protein.
Amino acid at other end has a free Alpha
carboxyl group.
Called "Carboxy-Terminus" or "C-terminus" of
the protein.
Direction of Protein Synthesis is from N-terminus to C-terminus.
The Genetic Code
There is a 3 to 1 correspondence
between RNA nucleotides and amino
acids.
The three nucleotides used to encode
one amino acid are called a codon.
The genetic code refers to which codons
encode which amino acids.
How do we know it is a 3 letter code?
How Do the mRNA Nucleotides Direct
Formation of the Amino Acids in a Protein?
Proteins are formed from 20 amino acids in humans.
Codons of one nucleotide:
A G C U
Can only encode 4 amino acids
Codons of two nucleotides:
AA GA CA UA
AG GG CG UG
AC GC CC UC
AU GU CU UU
Can only encode 16 amino acids
Codons of three nucleotides:
AAA AGA ACA AUA AAG AGG ACG AUG
AAC AGC ACC AUC AAU AGU ACU AUU
GAA GGA GCA GUA GAG GGG GCG GUG
GAC GGC GCC GUC GAU GGU GCU GUU
CAA CGA CCA CUA CAG CGG CCG CUG
CAC CGC CCC CUC CAU CGU CCU CUU
UAA UGA UCA UUA UAG UGG UCG UUG
UAC UGC UCC UUC UAU UGU UCU UUU
Allows for 64 potential codons => sufficient!
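The counting argument can be checked directly. This is a small sketch; the function name `codon_count` is chosen here for illustration.

```python
from itertools import product

BASES = "AGCU"      # the four RNA nucleotides
AMINO_ACIDS = 20    # number of amino acids that must be encoded

def codon_count(length):
    """Count the distinct codons of a given length (equals 4 ** length)."""
    return len(list(product(BASES, repeat=length)))

for length in (1, 2, 3):
    n = codon_count(length)
    verdict = "sufficient" if n >= AMINO_ACIDS else "too few"
    print(f"codon length {length}: {n} possible codons -> {verdict}")
```

Three is the smallest codon length for which the number of codons reaches 20.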
Theoretical Codes
The Genetic Code
Three Conceivable Kinds of Genetic Codes
Translation
 The process of reading the RNA sequence of an
mRNA and creating the amino acid sequence of a
protein is called translation.
Figure: transcription and translation
DNA template strand: T T C A G T C A G
Transcription
Messenger RNA (mRNA): A A G U C A G U C
Codons: AAG, UCA, GUC
Translation
Polypeptide (amino acid sequence): Lysine - Serine - Valine
How do we know a 3 nucleotide
codon determines amino acid choice?
Prediction of Amino Acid Sequence
from Synthetic RNA molecules
The genetic code is nonoverlapping
Universal Code?
In some organisms, a few of the 64 possible
"words" of the genetic code are different.
Do a few different words mean that the code is
not universal?
Perhaps: if you're willing to say that the US and
Britain don't share a common language
because elevators in the UK are called "lifts"
and they spell the word "color" with a "u."
The Genetic Code Is
Linear: uses mRNA which is
complementary to DNA sequence.
Triplet: the unit of information is the
codon, a series of three ribonucleotides.
Unambiguous: each codon specifies
only one amino acid (AA).
Degenerate: more than one codon exists
for most amino acids.
The Genetic Code Is:
Punctuated: there are codons that indicate
“start” and “stop.”
Commaless: there is no punctuation within a
mRNA sequence.
Nonoverlapping: any one ribonucleotide is
part of only one codon (some exceptions exist).
Universal: the same code is used by viruses,
bacteria, archaea, and eukaryotes.
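Several of these properties can be read off a lookup table of the standard genetic code. The sketch below builds that table from the usual compact one-letter listing (bases in the order U, C, A, G; `*` marks a stop codon). The variable names are chosen for illustration, and the variant codes used by some organisms are not covered.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Unambiguous: a dictionary maps each codon to exactly one amino acid.
print(len(CODON_TABLE), "codons in the table")

# Degenerate: most amino acids are specified by more than one codon.
codons_per_aa = Counter(CODON_TABLE.values())
print("codons for leucine (L):", codons_per_aa["L"])
print("codons for methionine (M):", codons_per_aa["M"])

# Punctuated: AUG is the start codon, and three codons mean stop.
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
print("start codon AUG encodes:", CODON_TABLE["AUG"])
print("stop codons:", stop_codons)
```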
Point Mutations
Single Base Change can alter protein
product.
Missense: results in one amino acid
change.
Nonsense: results in stop codon.
Frame-shift: change "reading-frame" of
genetic message.
Silent mutations: point mutations that
DON’T alter the protein product because of
the degenerate nature of the genetic code.
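Single-base substitutions can be sorted into these categories by comparing the amino acid before and after the change. This sketch uses the standard genetic code; the function name `classify_substitution` and the example codon are illustrative, and the original codon is assumed not to be a stop codon. Frame-shifts are not covered because they involve an insertion or deletion rather than a substitution.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_substitution(codon, position, new_base):
    """Classify a substitution at position 0, 1 or 2 of a sense codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "silent"      # same amino acid, thanks to degeneracy
    if after == "*":
        return "nonsense"    # the codon became a stop codon
    return "missense"        # a different amino acid is inserted

print(classify_substitution("UCG", 2, "A"))  # UCG (Ser) -> UCA (Ser)
print(classify_substitution("UCG", 1, "A"))  # UCG (Ser) -> UAG (stop)
print(classify_substitution("UCG", 1, "U"))  # UCG (Ser) -> UUG (Leu)
```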
Frame Shift
Within a gene, small deletions or insertions of a
number of bases not divisible by 3 will result in
a frame shift. For example, given the coding
sequence:
AGA UCG ACG UUA AGC
corresponding to the protein
arginine - serine - threonine - leucine - serine
Frame Shift
The insertion of a C-G base pair between
bases 6 and 7 would result in the following
new code, which would result in a nonfunctional protein. Every amino acid after the
insertion will be wrong.
AGA UCG CAC GUU AAG C
Corresponding to the protein:
arginine - serine - histidine - valine - lysine
The frame shift could generate a stop codon
which would prematurely end the protein.
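The frame shift example can be reproduced in code. This sketch uses the standard genetic code with one-letter amino acid symbols (R = arginine, S = serine, T = threonine, L = leucine, H = histidine, V = valine, K = lysine); the function name `translate` is chosen for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate complete codons from the first base; halt at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

original = "AGAUCGACGUUAAGC"
# Insert a C between bases 6 and 7 (string index 6), as in the slide.
shifted = original[:6] + "C" + original[6:]

print(translate(original))  # RSTLS
print(translate(shifted))   # RSHVK
```

The trailing single C in the shifted sequence does not form a complete codon, so it is ignored.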
How to Recognize Protein
Information in DNA
Don't assume that a dsDNA molecule will be
read from left to right on the top strand.
 Every dsDNA sequence has six possible
translations:
top / bottom strand each with a 1st / 2nd / 3rd
reading frame
Not every AUG or "stop" sequence is a start or
stop codon.
An ORF is an open reading frame: it has an
ATG in frame with a stop codon. It could
encode a protein.
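A simple search for ORFs checks all six reading frames. The sketch below is one straightforward way to do it, under stated assumptions: the function names are invented, an ORF is taken to be an ATG followed in frame by the first stop codon, and the example DNA is the DNA version of the mRNA used in the translation figures of this lecture.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the bottom strand, written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def orfs_in_frame(seq, offset):
    """List ORFs (ATG through the next in-frame stop) in one reading frame."""
    orfs, start = [], None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            orfs.append(seq[start:i + 3])
            start = None
    return orfs

def six_frame_orfs(dna):
    """Map (strand, frame number) to the ORFs found in that frame."""
    strands = (("top", dna), ("bottom", reverse_complement(dna)))
    return {(name, offset + 1): orfs_in_frame(seq, offset)
            for name, seq in strands for offset in range(3)}

result = six_frame_orfs("TTCGTCATGGGATGTAAGCGATAA")
for frame, orfs in result.items():
    print(frame, orfs)
```

In this example a short ATG...TAA stretch also appears in the third frame of the top strand, which illustrates why not every ATG is a real start codon.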
Comma-free and nonoverlapping are correct.
The living cell decodes messenger
RNAs by a kind of dead reckoning.
Ribosomes march along the messenger RNA
in strides of three bases, translating as they go.
Except for signals that mark where the
ribosome is supposed to start, there is nothing
in the code itself to enforce the correct reading
frame.
Three codons serve as stop signs: UAA, UAG
or UGA
What reading frame should be
used?
In any mRNA sequence, there are three
ways triplet codons can be read.
Each way to read the codons is called a
"Reading Frame".
It is very important for the ribosome to find the
correct reading frame.
If the wrong reading frame is used,
translation generates a protein with the
wrong amino acid sequence which is not
functional.
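Reading the same mRNA in each of its three frames shows how different the products are. This sketch uses the standard genetic code with one-letter symbols (`*` marks a stop codon); the function name `reading_frames` is illustrative, and the mRNA is the one shown in the translation figures of this lecture.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reading_frames(mrna):
    """Translate every complete codon in each of the three reading frames."""
    frames = []
    for offset in range(3):
        codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
        frames.append("".join(CODON_TABLE[c] for c in codons))
    return frames

for number, protein in enumerate(reading_frames("AUGGGAUGUAAGCGAUAA"), start=1):
    print(f"frame {number}: {protein}")
```

Only the first frame gives Met - Gly - Cys - Lys - Arg followed by a stop; the other two frames give unrelated sequences.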
At what codon in the mRNA does
the ribosome begin translation?
Recall there is a 5’ untranslated region of
the messenger RNA.
The solution is that the ribosome begins
translation at a specific AUG codon within
the mRNA template termed the "Start
Codon".
This is a methionine codon, so the first
amino acid in proteins is almost always
methionine.
Translation has Three Steps
Initiation - translation begins at start codon
(AUG=methionine)
Elongation - the ribosome uses the tRNA
anticodon to match codons to amino acids and
adds those amino acids to the growing peptide
chain
Termination - translation ends at the stop codon
UAA, UAG or UGA
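The three steps map onto a short loop. This is a simplified sketch, not a model of the real machinery: it assumes translation starts at the first AUG in the sequence, it uses the standard genetic code with one-letter symbols, and the function name `translate_mrna` is chosen for illustration. The example mRNA is the one from the figures in this lecture, including its leader sequence.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_mrna(mrna):
    """Return the protein encoded from the first AUG to the first stop codon."""
    # Initiation: skip the leader and locate the start codon.
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    # Elongation: read one codon at a time and extend the chain.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        # Termination: a stop codon ends translation.
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate_mrna("UUCGUCAUGGGAUGUAAGCGAUAA"))  # MGCKR
```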
Translation Initiation
mRNA: 5’ U U C G U C A U G G G A U G U A A G C G A A 3’
Assembling to begin translation: the small ribosomal subunit binds the
mRNA at the leader sequence, and the initiator tRNA (anticodon U A C),
carrying Met, pairs with the start codon A U G.
The large ribosomal subunit then joins to form the ribosome. A second tRNA
(anticodon C C U), carrying the amino acid Gly, pairs with the next codon, G G A.
Translation Elongation
mRNA: 5’ A U G G G A U G U A A G C G A 3’
tRNAs with anticodons U A C (Met) and C C U (Gly) are paired with codons
A U G and G G A; Met is joined to Gly.
A tRNA with anticodon A C A brings Cys to codon U G U; Cys is added after Gly.
A tRNA with anticodon U U C brings Lys to codon A A G; Lys is added after Cys.
The result is a lengthening polypeptide (amino acid chain).
Translation Termination
mRNA: 5’ A U G G G A U G U A A G C G A U A A
The ribosome reaches the stop codon (U A A) and a release factor binds.
Once the stop codon is reached, the elements disassemble.
Translation In the Cell
Multiple copies of a protein are
made simultaneously
5'- G T A A T C C T C -3' DNA sense (partner) strand
3'- C A T T A G G A G -5' DNA template (antisense) strand
5'- G U A A U C C U C -3' mRNA
N - val - ile - leu - C protein
By convention, amino acid sequences are
written and numbered left-to-right from N-terminus to C-terminus.
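The relationships among the four lines above can be reproduced in code. This sketch uses the standard genetic code with one-letter amino acid symbols (V = val, I = ile, L = leu); the function names are chosen for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

DNA_TO_DNA = str.maketrans("ACGT", "TGCA")
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")

def template_strand(sense):
    """Template strand written 3'->5', so it lines up under the sense strand."""
    return sense.translate(DNA_TO_DNA)

def transcribe(template_3to5):
    """mRNA (5'->3') made by pairing RNA bases with a 3'->5' template."""
    return template_3to5.translate(DNA_TO_RNA)

def translate(mrna):
    """Translate complete codons from the first base, N-terminus first."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))

sense = "GTAATCCTC"
template = template_strand(sense)
mrna = transcribe(template)

print(template)         # CATTAGGAG
print(mrna)             # GUAAUCCUC
print(translate(mrna))  # VIL
```

The mRNA has the same sequence as the sense strand with U in place of T, which is why the sense strand is also called the partner strand.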
tRNA is a connection between anticodon
and amino acid
5'-AUG-3' codon in mRNA
   |||
3'-UAC-5' anticodon in tRNA
5'-CAU-3' if anticodon is written 5'->3'
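The two ways of writing an anticodon differ only in direction. This minimal sketch assumes strict A-U and C-G pairing (wobble pairing at the third codon position is ignored), and the function names are chosen for illustration.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def anticodon_3to5(codon):
    """Anticodon written 3'->5', aligned base-for-base under the codon."""
    return codon.translate(RNA_COMPLEMENT)

def anticodon_5to3(codon):
    """The same anticodon written in the conventional 5'->3' direction."""
    return anticodon_3to5(codon)[::-1]

print(anticodon_3to5("AUG"))  # UAC
print(anticodon_5to3("AUG"))  # CAU
```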
RNA Splicing Depends on
Sequence and Structure
http://bcs.whfreeman.com/thelifewire/content/chp14/1402001.html
Alternative splicing of exons
forms distinct proteins:
one gene, many proteins
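A rough count shows how one gene can give several products. The sketch below rests on a deliberately simple assumption that is not from the lecture: the first and last exons are always kept, and any subset of the internal exons may be included, in their original order. Real splicing is regulated, so only some of these combinations occur in cells. The exon names are hypothetical.

```python
from itertools import combinations

def possible_isoforms(exons):
    """List exon combinations that keep the first and last exon, in order."""
    first, *internal, last = exons
    isoforms = []
    for kept_count in range(len(internal) + 1):
        for kept in combinations(internal, kept_count):
            isoforms.append([first, *kept, last])
    return isoforms

isoforms = possible_isoforms(["E1", "E2", "E3", "E4"])
for isoform in isoforms:
    print("-".join(isoform))
print(len(isoforms), "possible mRNAs from one gene")
```

With n internal exons this simple model allows up to 2 ** n different mRNAs.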
Exon shuffling forms distinct
proteins: