In the name of God, the Most Gracious, the Most Merciful
Prepared by: Ali Ghanbari
M.Sc. student in Agricultural Biotechnology
Supervisor: Dr. Babaeian
Ghanbari
March 2006
Genetic Code
DNA --(transcription)--> mRNA --(translation)--> protein
genetic code: means for converting DNA sequence into protein sequence
the original question has always been how to convert 4 nucleotide bases
into 20 types of amino acids
in the 1940's Beadle and Tatum began studying the bread mold Neurospora
and isolated mutants (i.e. strains of the mold with damaged genes) that
could not grow when provided with minimal nutrients but survived OK
when complete, or rich, nutrients were provided.
Beadle and Tatum identified many mutants for various products-- amino
acids, vitamins, etc.
They already knew that there were multiple steps in synthesizing a
particular product; ie. perhaps 5 genes for synthesizing isoleucine
By substituting different intermediates into minimal growth conditions,
they could infer which steps (enzymes) were defective
A --W--> B --X--> C --Y--> D --Z--> E
(A to E are the intermediates; W to Z are the enzymes for each step)
Beadle and Tatum isolated various strains that required E for growth
By adding A, B, C, or D to minimal media they could guess which step
was defective.
for example, if one strain would grow when C, D, or E were added to
minimal media but NOT A or B, that means that enzyme Y can convert
C to D and enzyme Z can convert D to E. However, B cannot be made
into C, meaning that the defect was in enzyme X.
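The supplementation logic can be sketched in a few lines of Python. This is only an illustration: it assumes a strictly linear pathway A to E with a single defective enzyme, and the function name `defective_enzyme` is made up for this example.

```python
# Linear pathway from the slide: A -W-> B -X-> C -Y-> D -Z-> E
INTERMEDIATES = ["A", "B", "C", "D", "E"]
ENZYMES = ["W", "X", "Y", "Z"]  # ENZYMES[i] converts INTERMEDIATES[i] into INTERMEDIATES[i + 1]


def defective_enzyme(grows_on):
    """Return the blocked enzyme implied by the supplements that rescue growth.

    Assumes exactly one defective enzyme and a strictly linear pathway.
    """
    # The earliest intermediate that rescues growth sits just after the blocked step.
    first_rescue = min(INTERMEDIATES.index(s) for s in grows_on)
    if first_rescue == 0:
        return None  # growth on A: no block detected within this pathway
    return ENZYMES[first_rescue - 1]


print(defective_enzyme({"C", "D", "E"}))  # X
```

A strain rescued only by E would point to enzyme Z by the same reasoning.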
One gene, one enzyme hypothesis: each gene that they mutated coded
for exactly one single enzyme
so now there is a connection between mutations and enzyme function
We now know this is slightly off-- one gene codes for 1 protein
some enzymes have 2 or more proteins in them (ie. F-type ATPases, etc)
We also know it goes even further-- one gene codes for 1 protein and they
do not have to have an enzymatic function-- ie. actin, hemoglobin, etc.
None of these experiments, though, addressed the question of HOW a gene
codes for a protein (note that at this point, in the 1940's, DNA had only just been
determined to be the genetic material)
4 nucleotides in DNA have to somehow code for 20 amino acids
1 nucleotide clearly not sufficient-- that gives only 4 amino acids
2 nucleotides is better, but not enough-- 4^2 gives 16 amino acids
3 nucleotides is the minimum-- 4^3 gives 64 possible codes, enough for 20 amino acids
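The counting argument is simple enough to check directly. The sketch below only restates the arithmetic; the function name `min_codon_length` is an assumption made for illustration.

```python
def min_codon_length(n_bases=4, n_amino_acids=20):
    """Smallest word length whose number of possible words covers all amino acids."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


for length in (1, 2, 3):
    print(length, "bases per codon ->", 4 ** length, "possible codons")

print("minimum codon length:", min_codon_length())
```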
early 1960's, Crick, Brenner and students used acridine dyes to generate
mutants defective for various enzymes
acridine dyes are mutagens (chemicals that cause mutations) that cause
additions or deletions of single base pairs of DNA
additional acridine dye treatments could sometimes return enzyme
function-- changes were additive-- multiple changes gave active enzyme
frameshift mutation: insertion or deletion in the DNA sequence which shifts how the nucleotide
'letters' are grouped into the amino acid 'words' of a protein
Crick and Brenner showed that '+' mutants were cancelled by '-' mutants
Two '+' or two '-' mutants did not cancel
Three '+' or three '-' mutants WERE able to cancel out each other, just
like a '+' and a '-'
this suggested a 'triplet' code-- 3 nucleotides per amino acid
'+' frameshift and '-' frameshift nearby gives mostly normal enzymes
two '+' or two '-' mutations could not give a readable message
three '+' or '-' mutations near each other would add (+) or remove (-)
one amino acid, change a few others, and leave the rest of the protein unchanged
Crick and Brenner saw these reversions (returns to normal) frequently and
they knew there were 64 possible 3 nucleotide codes to make 20 amino
acids
AUG GUC AAU AAA CCG...       normal protein sequence
met val asn lys pro
AUG UGU CAA UAA ACC G...     one (+) mutation
met cys gln stop (UAA, the 'ochre' stop codon)
AUG UUG UCA AUA AAC CG...    two (++) mutations
met leu ser ile asn
AUG UUU GUC AAU AAA CCG...   three (+++) mutations
met phe val asn lys pro
note the sequence similarity between the normal and the (+++) sequence
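The effect of the '+' mutations can be reproduced with a small translation sketch. It assumes the standard genetic code, written as one string in U, C, A, G order, and inserts extra U's right after the AUG codon; the helper name `translate` and the one-letter amino acid output are choices made for this example only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Read triplets from the first base; stop at a stop codon or an incomplete codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


normal = "AUGGUCAAUAAACCG"
for n_inserted in range(4):
    # Insert n extra U's directly after the AUG start codon.
    mutant = normal[:3] + "U" * n_inserted + normal[3:]
    print(n_inserted, translate(mutant))
```

With zero or three inserted bases the downstream residues (V, N, K, P) are read in the original frame; with one or two they are not.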
degenerate code: one amino acid can be coded for by more than one
triplet code
ie: synonyms: two 'words' meaning same thing
Note that these arguments mean that the code is non-overlapping
an overlapping code would have nucleotides 1-3 coding for the first
amino acid, nucleotides 2-4 coding for the second amino acid, etc.
in an overlapping code, the '+' or '-' mutants could only change a few
amino acids-- all the others would be unaffected
there are a few cases (usually viruses) that have overlapping genes; ie.
genes that share different reading frames using the same nucleotides
almost always use opposite strands of DNA
Nirenberg and Matthaei: developed a biochemical system outside of cells
to study protein synthesis
in their system, if they added RNA they would see more protein made
used an enzyme called polynucleotide phosphorylase to make RNA
sequence composed of only 1 type of base, either G, C, A, or U (not T!)
UDP --(pnp)--> poly(U), ADP --(pnp)--> poly(A), etc.
(polynucleotide phosphorylase builds RNA from nucleoside diphosphates)
with poly(U) added to their cell free system, they saw more phenylalanine
incorporated into proteins
Reasoned that UUU coded for phenylalanine
showed AAA coded for lysine, CCC for proline, and GGG for glycine
Note that this brings up an issue-- DNA is double stranded
ie.
GACGTCTAG
CTGCAGATC
one strand will serve as the template-- strand that is used to direct the
synthesis of the RNA
i.e. if GACGTCTAG is the template DNA, it would direct the synthesis
of CUGCAGAUC (written base by base opposite the template) using the complementary base pairs A:U, T:A and G:C,
the same rules as with base pairings within DNA, except that U replaces T
coding strand: DNA strand that is most similar to the synthesized RNA
codon: 3 letter mRNA triplet 'read' by the protein synthesis machinery
bases are always read starting at the 5' phosphate toward the 3' end
(same order that nucleotide chains are made in)
5'-AUGUUUCGCAGA-3' mRNA (like the coding strand)
3'-TACAAAGCGTCT-5' DNA template strand
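The template-to-mRNA relationship shown above can be written as a simple base-by-base mapping. This is a minimal sketch: the name `transcribe` is an assumption, and the template is taken as written 3' to 5' so that the returned mRNA reads 5' to 3'.

```python
# DNA template base -> RNA base it pairs with
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') made from a DNA template strand written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)


print(transcribe("TACAAAGCGTCT"))  # AUGUUUCGCAGA
```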
H. Gobind Khorana, instead of using polynucleotide phosphorylase,
synthesized RNAs with precise sequences
arranging various possible orders together, they could identify all codons
several special codons in the genetic code
AUG (represented by ATG in DNA), coding for methionine, is the
initiator codon: it starts the process of protein synthesis
3 termination codons (UAA, UAG, UGA) or stop codons: 3 base codes
that end protein synthesis
the genetic code is unambiguous-- a 3 base codon always codes for the same
amino acid
wobble base: the third base in a codon can often be changed
without changing the protein sequence
Almost all organisms use the exact same code
a few exceptions make a non-standard amino acid or use a special
transfer RNA (tRNA) that changes the meaning of a codon
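The special codons and the degeneracy of the code can be counted directly from a codon table. The sketch below assumes the standard genetic code, written as a single string in U, C, A, G order; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
sense_codons = len(CODON_TABLE) - len(stop_codons)
codons_per_residue = Counter(CODON_TABLE.values())

print(stop_codons)         # ['UAA', 'UAG', 'UGA']
print(sense_codons)        # 61
print(CODON_TABLE["AUG"])  # M (methionine, the initiator codon)
print(codons_per_residue["L"], codons_per_residue["W"])  # 6 1
```

Leucine has six synonymous codons while tryptophan and methionine have one each, which is the degeneracy described earlier.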
Transcription in Prokaryotes
RNA polymerase: enzyme which synthesizes mRNA from the DNA
template strand using G, C, A, and U (uracil) as the bases
core enzyme of RNA polymerase is a tetramer with 2 α and 2 β subunits
holoenzyme: core RNA polymerase plus the sigma factor (σ)
sigma factor recognizes sequences of DNA that precede coding DNA
promoter: regulatory sequence of DNA before the coding region of a gene
extremely important for regulating what genes are turned on
relatively simple in prokaryotes (discussed more in Chapter 23)
different sigma factors recognize different promoters
allows bacteria to turn on particular genes only when they're needed!
RNA polymerase (T7 virus)
[figure: T7 RNA polymerase structure, with single-stranded and double-stranded DNA labeled]
Note the nice little hole for the single-stranded DNA to slide through
4 steps of transcription: Binding, Initiation, Elongation, and Termination
transcription unit: segment of DNA that gives rise to a RNA molecule
1) RNA polymerase core enzyme (recognizing the σ factor bound to a promoter)
binds to the DNA at that site
binding initiates unwinding of the DNA double helix
upstream: DNA 5' of the start of RNA transcription (ie. does NOT get
included in the RNA chain-- usually contains the promoter region)
downstream: DNA 3' of start of RNA transcription (included in the RNA)
promoter binding unwinds 15-18 bp of the DNA near where transcription
begins
DNA footprinting
promoters were originally identified by DNA footprinting
DNA footprinting: general technique for identifying sites on the DNA
that are bound by proteins
if a protein is bound to the DNA, a chemical or enzyme that breaks
phosphodiester bonds cannot reach the portion of DNA bound to
the protein-- that region is protected
you then randomly fragment the DNA region (having isolated it earlier
along with the protein of interest) and separate the pieces by
electrophoresis (remember-- phosphate is negatively charged so will
move in an electric field!)
regions of DNA with less fragmentation have proteins bound to them
2) After the σ factor binds the DNA, RNA polymerase initiates transcription
recognizes σ factor + the unwound DNA
NTPs (i.e. ATP, CTP, GTP, or UTP) hydrogen bond to the template
strand of the DNA in the first 2 positions
RNA polymerase catalyzes formation of a phosphodiester bond between
the first 2 nucleotides, joining the 3' hydroxyl of the first nucleotide to the 5'
phosphate of the second nucleotide
generates a phosphodiester bond and
inorganic pyrophosphate (PPi)
RNA polymerase always builds the RNA starting at its 5' end and moving toward its 3' end
i.e. new nucleotides are added to the free 3' hydroxyl group of ribose
pyrophosphate (PPi) is lost from the newly added NTP
polymerase moves along, forming phosphodiester bonds as NTPs bind
after about 9 bp, the σ factor detaches from the RNA polymerase-- initiation
is complete
3) Elongation: RNA polymerase moves happily along the DNA
moves 5' to 3' -- NTPs bind to the 3' OH, giving off pyrophosphate (PPi)
DNA is unwound as the polymerase moves forward; winds back up after
it passes-- RNA doesn't form double helices with the DNA as well (only about 12 bp stay paired)
RNA strand grows and exists on its own
4) Termination: RNA polymerase stops adding bases
termination signal: sequence of DNA that makes RNA polymerase halt
2 types of termination signals
GC rich region followed by several U's
GC rich region is complementary to itself-- forms a hairpin
hairpin: nucleic acid structure that can base pair to itself
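Whether a region is "complementary to itself" can be checked by comparing one arm with the reverse complement of the other. The sequence below is a made-up, terminator-like example (GC rich stem, short loop, run of U's), not one taken from a real gene, and the function names are illustrative.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Complement each base and reverse the order (strands pair antiparallel)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def can_form_hairpin(left_arm, right_arm):
    """True if the right arm pairs perfectly with the left arm when the strand folds back."""
    return right_arm == reverse_complement(left_arm)


# Hypothetical terminator: GC rich stem, 4-base loop, then several U's.
left_arm, loop, right_arm, tail = "GCCGCCAG", "UUCG", "CUGGCGGC", "UUUUUU"
print(left_arm + loop + right_arm + tail)
print(can_form_hairpin(left_arm, right_arm))  # True
```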
rho (ρ) factor: protein that binds to a specific 50-90 nucleotide sequence of RNA
rho binding unwinds RNA from the DNA template, essentially pulling
it away from the DNA and causes the RNA and polymerase to 'fall off'
the DNA
once the RNA polymerase core enzyme falls off DNA, can bind to a new
sigma factor and start the process again (and again, and again!) at the
same or different promoters
note that RNA polymerase is an ENZYME-- it isn't changed by making
the phosphodiester bonds
Transcription in Eukaryotes
follows the same 4 stages: binding, initiation, elongation, termination
because the organisms are more complex, so is transcription
instead of 1 RNA polymerase, there are now 3, each with different
characteristics
RNA polymerase I (RNApol I) makes ribosomal RNAs (rRNA)
RNApol II: synthesizes messenger RNA for protein coding
also makes small nuclear RNAs for mRNA processing
synthesizes broadest variety of RNAs
RNApol III: makes transfer RNA (tRNA) and other short RNAs
all 3 are large multisubunit enzymes (8-10 subunits) homologous to the
prokaryotic ones
3 different classes of polymerase-- therefore 3 classes of promoters
RNA pol I promoter has 2 parts
core promoter: minimal set of DNA bases to start rRNA synthesis
works, but is not very efficient
upstream control element or upstream activator, is upstream of the
core promoter, binds different proteins, and increases transcription
[figure: RNA pol I bound to the DNA at its promoter, producing rRNA]
RNA pol II promoter is the most complicated (because of the diversity
of RNA it needs to make)
1) short Initiator region (Inr) at the transcription start point
2) TATA box (A-T rich region) about 25 bp upstream from the Inr
3) TFIIB recognition element (BRE) immediately upstream from TATA
4) downstream promoter element (DPE): 30 bp downstream from Inr
Not every promoter has to have all 4 elements
must have either TATA or DPE, but can have both
like RNApol I promoters, it has upstream control elements as well
[figure: RNA pol II core promoter on the DNA, showing the TFIIB recognition element (BRE), TATA box, Initiator (Inr) and DPE, with mRNA as the product]
core promoter (diagrammed above) gives low levels of transcription
upstream elements regulate the level even further
nearby upstream elements are called proximal promoter elements
more distant upstream elements are called enhancers or silencers
RNApol III promoter is entirely downstream of the transcription start
contains two 10 bp sequences, box A and either box B (for tRNA)
or box C (for rRNA)
[figure: RNA pol III promoter on the DNA, with box A and box B or C downstream of the start site; product shown is tRNA]
transcription factor: protein that regulates the transcription of genes
general (basal) transcription factor: protein REQUIRED for transcription
often start with TF, like TFIIB
just like with the σ factor in prokaryotes, proteins must bind promoters
next, other TF proteins recognize proteins bound to promoters
RNA polymerase recognizes the cluster of TF and DNA binding proteins
notice the building up of a machine by protein-protein interactions!
this is called the pre-initiation complex: RNApol is bound, but not
making RNA
once the pre-initiation complex is formed, 2 more TF factors are needed
TFIIE binds and causes RNA polymerase to be phosphorylated
TFIIH binds the polymerase and acts as a helicase-- unwinds the DNA
so that the phosphorylated RNA polymerase can make RNA
Elongation is very similar-- RNA polymerase uses A:U, T:A, G:C, C:G
base pairing to make the RNA chain from NTPs giving off PP
uses the 3' OH from the message to the 5' phosphate of the NTP
One additional complication: RNA polymerase has to have proteins that
unwind nucleosomes (bacteria don't have nucleosomes)
eukaryotic polymerases have 8-10 proteins, bacteria only 4
some of these subunits recruit proteins to unwind nucleosomes
Termination is usually caused by recognition of one of several
sequences in the DNA-- different polymerases recognize different
termination sequences
ie. RNApol I stops when a protein binds a particular 18 bp sequence
in the RNA
RNApol III stops when it encounters 6-8 uracils, etc
unlike prokaryotes, eukaryotic polymerases don't seem to stop at hairpins
RNA Processing
newly made RNA molecule is called a primary transcript-- copied
directly from the DNA
before it can serve its eventual function, RNA must be processed
RNA processing includes being cleaved at specific locations, chemical
modification of some nucleotides, nucleotides being added, etc.
modifications are usually dependent upon their eventual function
ie. transfer RNAs will have different modifications than mRNAs
just like in transcription, eukaryotes have more complex RNA processing
Ribosomal RNA Processing
70-80% of the total RNA in a cell is ribosomal RNA (rRNA)
4 different rRNAs distinguished by their sedimentation coefficients
(only 3 in prokaryotes)
3 of the rRNAs are made by RNApol I as a single primary transcript
RNApol I is active in the nucleolus, the large dense spot in the nucleus
transcribed spacers are the parts of the primary transcript which separate the rRNAs
genome contains multiple copies of the rRNA primary transcription unit-- needs to make a lot of rRNA!
transcribed spacers are cut out and then degraded
methyl groups are also added to ribose hydroxyls and some bases
snoRNAs (small nucleolar RNAs): RNAs that bind to particular
complementary regions of rRNAs and which also bind to proteins
that methylate the rRNAs (note the use of complementary base
pairing to direct these modifications!)
methylation of the rRNA reduces its degradation-- enzyme active
sites don't recognize it because it doesn't have the hydroxyl groups
just like ATP provides the phosphate group for phosphorylation
reactions, S-adenosyl methionine provides the methyl group
Note the fusion of adenosine (nucleoside) and methionine (amino acid)
nearly HALF of the rRNA primary transcript is transcribed spacer
that gets deleted
as the rRNA gets processed, it associates with various proteins and
eventually becomes the large and small ribosomal subunits
Ribosomes are therefore made in the nucleolus
ribosomes also include one rRNA transcribed by RNApol III
this RNA, like the RNApol I transcript, has multiple copies arrayed in
tandem-- many copies in the same direction, one right after the other
the RNApol III transcript has few if any modifications
in genetics, we talk about the mechanisms by which these tandem arrays formed
Transfer RNA Processing
like the rRNA, tRNA requires extensive removal, addition and
modification of the nucleotides
tRNA: RNA molecules that bind to particular amino acids on one end
and recognize one of the 61 coding codons on the other
these are the ESSENTIAL bridge between nucleic acids and proteins
tRNAs are only about 70-90 nucleotides long and have several hairpin
loops (with complementary base pairings holding them together) to
form a cloverleaf structure-- in 3D it is really more L shaped
like the rRNAs, tRNA is synthesized as a precursor or pre-tRNA and
processed extensively
all tRNAs have the sequence CCA at the 3' end-- some naturally, in
others it is added later
Messenger RNA Processing
prokaryotic mRNA needs little or no processing-- it's ready to go
Ribosomes can associate with prokaryotic mRNA even as it is being
transcribed-- no barrier between mRNA synthesis and translation
Eukaryotes require extensive processing of their mRNAs
at the 5' end (i.e. first synthesized part-- start of the transcript) have a
5' cap, 7-methylguanosine
is made 'backwards'-- 5' to 5' linkage to the initial triphosphate nucleotide
added early after transcription
aids in stability and positioning of the transcript for translation
NOT added by RNA polymerase!
At the 3' end, most mRNAs contain a 'polyA tail'-- 50-250 adenosines
added by a specific enzyme, polyA polymerase
a signal sequence in the mRNA directs first the cleavage and then the
addition of the polyA tail 10-35 nucleotides downstream
polyA tail helps to protect the mRNA from exonucleases and
therefore increases its useful lifespan
polyA is recognized by transport proteins to send it out of the nucleus
can also be used by researchers in the lab to purify specifically mRNA
using a polyT oligonucleotide
introns: sequences in the primary mRNA transcript that do not appear
in the mature mRNA
introns get cut out of the pre-mRNA and the mRNA gets ligated back
together
exons: regions of the pre-mRNA or DNA sequence that are kept in the mature mRNA and can appear in proteins
introns are present in most protein coding genes
usually found nowadays by comparing mRNA sequence to genomic DNA
RNA splicing: enzymatic process of removing the introns
spliceosome: RNA protein complex that carries out RNA splicing
most introns start with a 5' GU sequence and end with a 3' AG
introns must also contain an internal sequence called the branch point
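The GU...AG rule can be illustrated with a toy splicing function. This is a sketch under stated assumptions: intron positions are supplied by hand as index pairs, the sequences are invented for the example, and the branch point requirement is not checked.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) half-open index pairs.

    Raises ValueError if an intron does not begin with GU and end with AG.
    """
    mature, position = [], 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {intron!r} does not follow the GU...AG rule")
        mature.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end
    mature.append(pre_mrna[position:])  # keep the final exon
    return "".join(mature)


# Hypothetical sequences, chosen only to show the boundaries.
exon1, intron, exon2 = "AUGGCU", "GUAAGUCUAACAG", "GGAUAA"
pre_mrna = exon1 + intron + exon2
print(splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))]))  # AUGGCUGGAUAA
```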
snRNPs: small protein+ RNA complexes that make up the spliceosome
snRNPs bind in 3 parts: U1 binds to the 5' splice site
U2 binds to the branch point
U4/U6/U5 brings the ends of the intron together
a spliceosome contains 5 RNAs and 50+ proteins-- as big as a ribosome!
[figure: the U1, U2, U4, U5 and U6 snRNPs assembling on an intron to form the spliceosome]
Once snRNPs form a spliceosome, the 5' end of the intron is covalently joined to
an adenine nucleotide in the branch point in a structure called a lariat
once that intermediate is formed, the 3' end of the intron is cleaved, the 2 ends of the
exons are joined together, and the lariat RNA is sent for degradation
splicing occurs during transcription-- doesn't require pre-processing
we had mentioned ribozymes as RNA catalysts
first ribozymes were self-splicing intron sequences
Some introns are NOT degraded after excision
some are involved in rRNA methylation
others can regulate mRNA translation by complementary binding to
similar sequences (mRNAs)
in other genes, introns may be left in or taken out
alternative splicing: decision to leave in or take out an intron
gives one gene the ability to make a number of related proteins using
different combinations of potential introns
starting with a pre-mRNA, alternative splicing could yield several different mature mRNAs
[figure: one pre-mRNA and a series of alternatively spliced products]
each alternatively spliced transcript would code for proteins that
have overlapping regions but may have different functions, locations
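The number of possible products grows quickly with the number of optional segments. The sketch below assumes a simplified model in which the first and last exons are always kept and each internal exon is either included or skipped; the exon labels and the function name `splice_variants` are hypothetical.

```python
from itertools import combinations


def splice_variants(exons):
    """List every transcript that keeps the first and last exon and any subset of the rest."""
    first, *internal, last = exons
    variants = []
    for n_kept in range(len(internal) + 1):
        # combinations() preserves the original exon order.
        for kept in combinations(internal, n_kept):
            variants.append([first, *kept, last])
    return variants


for variant in splice_variants(["E1", "E2", "E3", "E4"]):
    print("-".join(variant))
```

With two optional exons this gives 4 transcripts; each additional optional exon doubles the count under this model.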
mRNA Metabolism
most mRNAs aren't around for long-- they are degraded very rapidly
hated by molecular biologists-- mRNA is hard to work with
half-life: average length of time it takes for 1/2 of the mRNAs to be
degraded
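As a rough illustration of what a half-life implies, the fraction of an mRNA population remaining after a given time can be computed as below. This assumes simple exponential (first-order) decay, which the slides do not state explicitly, and the numbers are arbitrary.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of the starting mRNA left after `elapsed` time (same units as half_life).

    Assumes first-order decay: the population halves every half-life.
    """
    return 0.5 ** (elapsed / half_life)


# Hypothetical mRNA with a 10 minute half-life.
for minutes in (0, 10, 20, 30):
    print(minutes, "min:", fraction_remaining(minutes, 10))
```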
mRNA instability allows the cell to regulate gene expression
mRNA also amplifies a DNA sequence
one gene can make many mRNAs, each making many proteins
allows the cell to control protein levels by controlling how many mRNAs
are made from that gene
some promoters are strong, others weak, others only active sometimes
RNA Viruses
RNA viruses use RNA as their primary genetic material-- e.g. HIV
some of these viruses (the retroviruses, such as HIV) have a very special enzyme called reverse transcriptase
which can make a DNA copy from the RNA
The DNA copy can then integrate into the host genome where it then
makes RNA that codes for its proteins and its own genetic material
for new virus particles
Molecular biologists use reverse transcriptase to make 'copy' DNA or
cDNA (DNA made from messenger RNA)
allows scientists to study exactly what mRNA transcripts get made
without having to understand what's happening with all the splicing,
regulatory DNA, repetitive DNA, etc in the genome
Reverse transcriptase is different from cellular polymerases-- it uses RNA
as the template to make DNA
because it has a different active site, some nucleotide analogs can be
used to inhibit the reverse transcriptase active site (competitive
transition state analogs) and make up the most effective anti-AIDS drugs