Download Dr Sonia MM-702 course lectures_15th Jan 14_For Online

Basic Processes of Molecular
Biology
Core Course # MM 702
Dr. Sonia Siddiqui
Dr. Panjwani Centre For Molecular
Medicine and Drug Research (PCMD)
Meselson-Stahl experiment
DNA replication is Semiconservative
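The semiconservative result can be checked with simple counting. The sketch below is illustrative only: the function name is hypothetical, and it assumes a culture that starts with fully heavy (15N/15N) duplexes and is then grown in light (14N) medium.

```python
from fractions import Fraction

def density_fractions(generations):
    """Fractions of (heavy, hybrid, light) duplexes after n generations in light medium.

    Assumes semiconservative replication: each parental strand is kept and
    paired with one newly made light strand.
    """
    if generations == 0:
        return (Fraction(1), Fraction(0), Fraction(0))
    # One starting duplex yields 2**n duplexes; only 2 still hold an old heavy strand.
    hybrid = Fraction(2, 2 ** generations)
    return (Fraction(0), hybrid, 1 - hybrid)

for n in range(4):
    print(n, density_fractions(n))
```

After one generation all DNA is hybrid; after two, half is hybrid and half is light, which is the banding pattern a semiconservative mechanism predicts.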
Bidirectional DNA replication begins at an Origin
DNA Replication and
Recombination
Synthesis of a DNA molecule occurs in 3 steps:
1- Initiation
2- Elongation
3- Termination
These processes require many different types of enzymes:
1- DNA replicase system or replisome
2- Helicases
3- Topoisomerases
4- Primase
5- DNA ligases
Initiation
The E. coli replication origin, called oriC, contains 245 bp.
Specific sequences needed for replication are present in oriC and are recognized by the proteins involved in initiation:
1- 9 bp sequences (R sites) to which the DnaA protein binds
• Additional DnaA-binding sites (I sites), plus binding sites for IHF (integration host factor) and FIS (factor for inversion stimulation)
2- 13 bp A=T-rich sequences that form the DNA unwinding element (DUE)
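As a rough illustration of how an A=T-rich element such as the DUE could be located in a sequence, the sketch below slides a 13-nt window along a strand and reports windows with a high A+T count. The function name, the threshold and the toy sequence are assumptions made for this example; they are not taken from the lecture or from the real oriC sequence.

```python
def at_rich_windows(seq, width=13, min_at=11):
    """Return (start, window, at_count) for every window with at least min_at A/T bases."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        at_count = window.count("A") + window.count("T")
        if at_count >= min_at:
            hits.append((i, window, at_count))
    return hits

# Toy sequence (hypothetical): a 13-nt A/T run flanked by G and C.
toy = "GGGG" + "ATATATATATATA" + "CCCC"
print(at_rich_windows(toy, width=13, min_at=13))
```

Lowering `min_at` makes the scan more permissive, which is usually needed for real sequences where the repeats are A=T-rich rather than pure A/T.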
Initiation
• DnaA protein is a member of the AAA+ ATPase protein family
Function: these ATPases form oligomers and hydrolyze ATP relatively slowly
• 8 DnaA protein molecules in the ATP-bound form assemble into a helical complex encompassing the R and I sites in oriC.
• DnaA has a higher affinity for R sites than for I sites
• It binds to R sites in either the ATP- or ADP-bound form, whereas it binds to I sites only in the ATP-bound form
Initiation
How does DnaC load the DnaB protein?
• A hexamer of DnaC, each subunit bound to ATP, binds the hexameric, ring-shaped DnaB helicase.
• This DnaC-DnaB interaction opens the DnaB ring; further loading requires an interaction between DnaB and DnaA.
• Two DnaB hexamers are loaded in the DUE, one onto each DNA strand.
• The ATP bound to DnaC is hydrolyzed, releasing DnaC and leaving DnaB bound to the DNA.
Key step in replication: DnaB helicase loading onto the DNA
DnaB migrates along single-stranded DNA in the 5’→3’ direction, unwinding the DNA as it travels; the two DnaB helicases move in opposite directions.
• The DNA with the two DnaB helicases therefore has two replication forks
• The DNA polymerase III holoenzyme is linked to DnaB through its τ (tau) subunits
• Many molecules of single-stranded DNA-binding protein (SSB) bind to each DNA strand at the fork
• At the same time, DNA gyrase (DNA topoisomerase II) relieves the topological stress generated ahead of the fork
The oriC DNA is methylated by Dam methylase at the N6 position of adenine within the palindromic sequence 5’ GATC
• Immediately after replication, the parental strand at oriC is methylated but the newly synthesized strand is not
• The hemimethylated oriC sequences are sequestered by interaction with the plasma membrane, with the help of a protein called SeqA
• After a time, oriC is released from the plasma membrane, SeqA dissociates, and the DNA is fully methylated by Dam methylase
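The two ideas above, that GATC is palindromic and that a site can be unmethylated, hemimethylated or fully methylated, can be sketched in a few lines. All function names here are hypothetical helpers written for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def gatc_sites(seq):
    """Return the start index of every GATC occurrence in seq."""
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]

def methylation_state(parental_methylated, new_methylated):
    """Classify a GATC site from the methylation status of its two strands."""
    if parental_methylated and new_methylated:
        return "fully methylated"
    if parental_methylated or new_methylated:
        return "hemimethylated"
    return "unmethylated"

# GATC reads the same on both strands, so Dam methylase can mark each strand.
print(reverse_complement("GATC"))
print(gatc_sites("AAGATCTTGATC"))
print(methylation_state(parental_methylated=True, new_methylated=False))
```

Right after replication every GATC site is in the hemimethylated state, which is the state described above for sequestered oriC.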
Elongation of the DNA: Leading and Lagging strand synthesis
Leading strand synthesis:
• It begins with the synthesis by primase (DnaG) of a short RNA primer (10-60 nucleotides) at the replication origin
• DnaG interacts with DnaB helicase; the primer is synthesized in the direction opposite to helicase movement
• In effect, DnaB helicase moves along the strand that becomes the lagging strand
• Deoxyribonucleotides are added to the primer by a DNA polymerase III complex linked to the DnaB helicase tethered to the opposite strand
Lagging strand: Okazaki fragments
• Lagging strand synthesis, in contrast, proceeds through short Okazaki fragments, because DNA synthesis always runs 5’→3’
• Primase synthesizes an RNA primer, and DNA polymerase III then adds deoxyribonucleotides to it, as in leading strand synthesis
Clamp loading complex of DNA polymerase III
• It contains two subunits along with the ….. subunits, and is an AAA+ ATPase
• The whole complex binds ATP and the new β sliding clamp
• Binding puts strain on the dimeric clamp, opening the ring at one subunit interface
• The newly primed lagging strand is slipped into the ring through the resulting break
• The clamp loader then hydrolyzes ATP, releasing the β sliding clamp so that it closes around the DNA
Okazaki fragments:
Complexity of Okazaki fragment synthesis:
• DNA polymerase III forms a dimer that acts on both strands, bringing them close together
• DnaB and DnaG form a functional unit at the replication fork called the primosome
• DNA polymerase III has two sets of core subunits: one synthesizes the leading strand while the other synthesizes the Okazaki fragments on the lagging strand
• Near the primosome, a new β sliding clamp is positioned at each primer by the clamp-loading complex of DNA polymerase III
DNA ligation by Ligases
DNA Ligases
• DNA ligase catalyzes the formation of a phosphodiester bond between a 3’ hydroxyl at the end of one DNA strand and a 5’ phosphate at the end of another strand
• The phosphate must first be activated by adenylylation
Properties of DNA ligase
• DNA ligases isolated from viruses and eukaryotes use ATP for this purpose; DNA ligases from bacteria differ
a) Many bacterial DNA ligases use NAD+, a cofactor that normally functions in hydride transfer reactions, as the source of the AMP activating group
b) DNA ligase is also very useful in recombinant DNA experiments
Replication in Eukaryotes
Cell-Cycle Control System and Activated Protein Kinases
• Cyclin-dependent kinases (Cdks) are protein kinases that regulate the major events of the cell cycle, such as DNA replication, mitosis and cytokinesis
• Increased Cdk activity at the beginning of mitosis leads to increased phosphorylation of proteins that control chromosome condensation, nuclear envelope breakdown and spindle assembly
• Cdk activity is in turn controlled by many proteins and complexes, such as cyclins, Cdk-activating kinase (CAK), Cdk inhibitor proteins (CKIs), SCF, the anaphase-promoting complex (APC), Cdc25 and Wee1
Cyclins-cdks complex
• Cdks require cyclins for their activation
• Cyclins are synthesized and degraded in each cell cycle
• Cdk levels remain constant throughout the cell cycle; changes in cyclin levels drive the assembly of cyclin-Cdk complexes, which leads to activation and triggers cell-cycle events
• There are four classes of cyclins: G1/S-cyclins, S-cyclins, M-cyclins and G1-cyclins
• Mode of action: each complex phosphorylates its target substrate proteins; its effect can also change as the availability of those substrates changes during the cell cycle
• CAK fully activates the cyclin-Cdk complex by phosphorylating an amino acid near the Cdk active site, so that the complex can phosphorylate its target proteins and induce specific cell-cycle events
Regulation of cyclin-cdk complex
• The activity of the complex can be inhibited by phosphorylation by the protein kinase Wee1, and restored when the phosphatase Cdc25 removes these phosphates
• The activity of the complex can also be regulated by Cdk inhibitor proteins (CKIs), which act mainly in G1 and S phases. CKI binding causes a conformational change that makes the complex inactive
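The regulation described above can be summarized as a small Boolean model. This is only a sketch: the class and function names are hypothetical, and real Cdk regulation is quantitative rather than all-or-none.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class CyclinCdk:
    cyclin_bound: bool = False          # a cyclin must be bound
    cak_phosphorylated: bool = False    # activating phosphate added by CAK
    wee1_phosphorylated: bool = False   # inhibitory phosphate added by Wee1
    cki_bound: bool = False             # Cdk inhibitor protein bound

def is_active(c):
    """Active only with cyclin and the CAK phosphate, and without Wee1 phosphate or CKI."""
    return (c.cyclin_bound and c.cak_phosphorylated
            and not c.wee1_phosphorylated and not c.cki_bound)

def cdc25(c):
    """Cdc25 phosphatase removes the inhibitory phosphate added by Wee1."""
    return replace(c, wee1_phosphorylated=False)

complex_ = CyclinCdk(cyclin_bound=True, cak_phosphorylated=True)
inhibited = replace(complex_, wee1_phosphorylated=True)
print(is_active(complex_), is_active(inhibited), is_active(cdc25(inhibited)))
```

Each inhibitory input acts as a veto, which is why removing a single one (here the Wee1 phosphate) is enough to switch the complex back on.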
Cyclical proteolysis and cell-cycle control system
• The rate-limiting step in cyclin destruction is the final ubiquitin-transfer reaction, performed by two ubiquitin ligases: SCF and the APC complex
• In G1 and S phase, SCF ubiquitinates G1/S-cyclins and certain CKIs that are involved in S-phase initiation
• M phase is controlled by the APC complex, which ubiquitinates M-cyclins and other M-phase regulators and thereby targets them for proteolysis
• SCF activity is constant throughout the cell cycle, whereas APC activity changes with the cell-cycle stage
Cell-Cycle Control and Transcriptional Regulation
• In more complex cell cycles, cyclin levels are controlled not only by degradation but also by changes in cyclin gene transcription and synthesis.
Intracellular control of cell cycle events
• Orderly progression through the phases of the cell cycle (G1, then S, then G2, then M, then G1 again) requires high accuracy and a continuous supply of activating substrates so that each phase hands over smoothly to the next
• For example, Cdc6 is a regulatory protein whose level increases only in G1, where it is required for the binding of a group of closely related proteins, the minichromosome maintenance (Mcm) proteins, resulting in the formation of a large pre-replicative complex (pre-RC)
Intracellular control of cell cycle events
• Activation of S-Cdk in late G1 initiates DNA replication; a second kinase also phosphorylates the pre-RC
• S-Cdk causes Cdc6 to dissociate from ORC after an origin has fired. This leads to disassembly of the pre-RC, which prevents replication from occurring again at the same origin
• S-Cdk also prevents Cdc6 and Mcm proteins from reassembling at any origin
• It phosphorylates Cdc6, which triggers its ubiquitylation by SCF
• S-Cdk also phosphorylates certain Mcm proteins, which triggers their export from the nucleus, further ensuring that the Mcm complex cannot bind to a replication origin
• At the end of mitosis all Cdk activity falls to zero; Cdc6 and Mcm proteins are then dephosphorylated, which allows pre-RC assembly to occur once again
Replication in Eukaryotes Cells
Control of DNA replication by cyclin-dependent kinases (CDKs)
• Cyclins are destroyed by ubiquitin-dependent proteolysis at the end of M phase
• In the absence of CDK activity, pre-replicative complexes (pre-RCs) can form on replication initiation sites
• In fast-growing cells, pre-RCs form at the end of M phase. Formation of the pre-RC is called licensing
• In eukaryotes, initiation requires loading of the minichromosome maintenance (MCM) proteins
• There are several MCM proteins; the MCM2-MCM7 helicase resembles the DnaB helicase and is loaded onto ORC with the help of CDC6 (cell division cycle) and CDT1 (Cdc10-dependent transcript 1)
• Commitment to replication requires the S-phase cyclin-CDK complexes and CDC7-DBF4
• Both types of complexes must act together, binding to and phosphorylating proteins in the pre-RC
Once replication has begun, CDK2 and other cyclin-CDK complexes inhibit the formation of more pre-RCs
Termination
• Ter sequences trap the replication forks
• Ter sequences are binding sites for the protein Tus (terminus utilization substance)
• Only one Tus-Ter complex functions per replication cycle: the one first encountered by either fork
• Ter sequences prevent over-replication by one fork if the other is delayed; a fork halts when it meets the other, arrested fork
• The DNA remaining between the arrested forks is then replicated, leaving two catenated circular chromosomes
Mechanism of DNA Repair: Mismatch Repair
Early steps of methyl-directed mismatch repair
• The MutL-MutS complex binds to mismatched base pairs
• MutH binds to MutL and to GATC sequences encountered by the complex
• DNA on both sides of the mismatch is threaded through the MutL-MutS complex, creating a DNA loop
• MutH has a site-specific endonuclease activity that cleaves only the unmethylated strand of a hemimethylated GATC sequence
• MutH cleaves on the 5’ side of the G in the GATC sequence
Completing methyl-directed mismatch repair
When the mismatch is on the 5’ side of the cleavage site
• The unmethylated strand is unwound and degraded in the 3’→5’ direction
• This requires many enzymes
• 1- DNA helicase II
• 2- SSB
• 3- Exonuclease I or X
• 4- DNA polymerase III
• 5- DNA ligase
When the mismatch is on the 3’ side of the cleavage site
• The exonuclease is either exonuclease VII (which degrades ssDNA in the 5’→3’ or 3’→5’ direction) or RecJ nuclease (which degrades ssDNA in the 5’→3’ direction)
Mechanism of DNA Repair: Base Excision Repair
• DNA glycosylases recognize common DNA lesions (such as the products of cytosine and adenine deamination) and remove the affected base by cleaving the N-glycosyl bond, which creates an AP (abasic) site
• Uracil DNA glycosylase removes uracil only from DNA
• Because the enzyme can distinguish uracil from thymine, this repair system helps explain why DNA contains thymine rather than uracil
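The logic of uracil removal can be sketched with strings. This is a toy model under stated assumptions: the function names are hypothetical, "_" marks an abasic site, and the template strand is written 3'→5' so that each position lines up with the damaged strand written 5'→3'.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def deaminate_cytosine(strand, position):
    """Model cytosine deamination: the C at `position` becomes U."""
    if strand[position] != "C":
        raise ValueError("no cytosine at this position")
    return strand[:position] + "U" + strand[position + 1:]

def uracil_glycosylase(strand):
    """Remove every uracil base, leaving an abasic (AP) site marked '_'."""
    return strand.replace("U", "_")

def fill_gaps(strand, template):
    """Refill each abasic site using the aligned base of the complementary strand."""
    return "".join(PAIR[t] if b == "_" else b for b, t in zip(strand, template))

original = "ATGCCA"
template = "TACGGT"                    # complementary strand, aligned position by position
damaged = deaminate_cytosine(original, 3)
repaired = fill_gaps(uracil_glycosylase(damaged), template)
print(damaged, repaired)
```

Because uracil never belongs in DNA, every U can be treated as damage; if DNA normally contained uracil, the glycosylase could not tell a deaminated cytosine from a legitimate base.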
Base-excision repair pathway
• Humans have at least 4 types of uracil DNA glycosylase, with different specificities
• Humans also have hSMUG1, which likewise removes U
• TDG and MBD4 remove U or T paired with G
• Other DNA glycosylases recognize and remove formamidopyrimidine and 8-hydroxyguanine (both arising from purine oxidation)
• Others remove hypoxanthine and alkylated bases such as 3-methyladenine and 7-methylguanine
Mechanism of DNA Repair: Nucleotide-Excision Repair
Excinuclease DNA repair in E. coli
• The enzymatic complex is the ABC excinuclease (it makes two cleavages, one on each side of the lesion)
• Subunits:
1- UvrA Mr 104,000
2- UvrB Mr 78,000
3- UvrC Mr 68,000
Eukaryotic excinuclease DNA repair system: DNA damage caused by cigarette smoke can be repaired by this mechanism
• Repair mechanisms such as nucleotide-excision repair and base-excision repair are tied to transcription in eukaryotes
• This pathway repairs lesions caused by various carcinogens, such as benzo[a]pyrene-guanine adducts, as well as cyclobutane pyrimidine dimers and 6-4 photoproducts
Mechanism of DNA Repair: Direct Repair
Direct Repair: Pyrimidine dimers by photolyases
Direct Repair: Damage caused by alkylating agents on nucleotide
• O6-methylguanine forms in the presence of alkylating agents
• It tends to pair with thymine instead of cytosine during replication, which converts G≡C pairs into A=T pairs
• Repair is carried out by O6-methylguanine-DNA methyltransferase
• This protein transfers the methyl group of O6-methylguanine to one of its own Cys residues
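Why an unrepaired O6-methylguanine ends up as an A=T pair can be traced over two rounds of replication. The sketch below is a simplification: the names are hypothetical, "G*" stands for O6-methylguanine, and it is assumed that G* always templates T.

```python
# Base inserted opposite each template base; G* is O6-methylguanine.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "G*": "T"}

def replicate(duplex):
    """Semiconservative replication of one base pair (top, bottom) into two daughters."""
    top, bottom = duplex
    return [(top, PAIRING[top]), (PAIRING[bottom], bottom)]

def generations(duplex, n):
    """Return all daughter base pairs after n rounds of replication."""
    population = [duplex]
    for _ in range(n):
        population = [d for parent in population for d in replicate(parent)]
    return population

print(generations(("G*", "C"), 1))   # first round: G* pairs with T
print(generations(("G*", "C"), 2))   # second round: the T templates A
```

After two rounds, one of the four daughters carries A=T where the original duplex had G≡C, which is the mutation that the methyltransferase prevents by repairing the base directly.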
Direct repair: Alkylated bases by AlkB
• 1-methyladenine and 3-methylcytosine are repaired by a member of the α-ketoglutarate-Fe2+-dependent dioxygenase superfamily
• A and C residues sometimes become methylated in ssDNA, which interferes with correct base pairing
• In E. coli, oxidative demethylation of these bases is mediated by the AlkB protein, a member of this enzyme superfamily
Consequences of Replication fork + DNA damage
• Lesions such as double-strand breaks or lesions in ssDNA arise when a replication fork encounters an unrepaired DNA lesion; in these cases no intact complementary strand is available as a template for accurate repair
Error-prone translesion DNA synthesis:
• DNA repair by this pathway is much less accurate and produces a high mutation rate
• In bacteria this pathway is switched on only under extensive, continuing damage to the cell’s DNA (for example oxidative or other stress), as part of the SOS response
• Production of the normally present proteins UvrA and UvrB increases
• Other proteins, UmuC and UmuD, are induced
• UmuD is cleaved in an SOS-regulated process to a shorter form, UmuD’
• UmuD’ and UmuC form a complex, the specialized DNA polymerase V, which can replicate past lesions
• Correct base pairing is often impossible at such lesions, so this replication is error-prone
Genes Induced as part of the SOS response in E.coli
Consequences of Replication fork + DNA damage
• Inducing UmuC and UmuD through the SOS response, and so activating DNA polymerase V, is a desperate strategy that is deleterious to the cell. Many daughter cells die from the mutations that this type of repair introduces
• Continuing DNA damage also activates RecA protein: RecA filaments bound to ssDNA at one chromosomal location can activate DNA polymerase V complexes bound at distant sites.
Consequences of Replication fork + DNA damage
• DNA polymerase η (eta) is found in all eukaryotes and promotes translesion synthesis (TLS); polymerases β, ι (iota) and λ have specialized roles in base-excision repair
• These enzymes also have 5’-deoxyribose phosphate lyase activity
• After a glycosylase removes the base and AP endonuclease cleaves the backbone, the polymerase removes the abasic site (a 5’-deoxyribose phosphate) and fills in the short gap
• Mutations caused by the activity of these polymerases are kept to a minimum by the very short lengths of DNA they synthesize
YouTube links: DNA Repair
http://www.youtube.com/watch?v=kp0esidDr-c&feature=related
http://www.youtube.com/watch?v=nPS2jBq1k48&feature=related
http://www.youtube.com/watch?v=y16w-CGAa0Y&feature=related
http://www.youtube.com/watch?v=nUzyrBC0tTY
http://www.youtube.com/watch?v=idbGJsDXDFo&NR=1
DNA Recombination
• Homologous genetic recombination: genetic exchange between two DNA molecules that have similar sequences
• Site-specific recombination: exchange occurs only at a particular DNA sequence
• DNA transposition: a short segment of DNA moves from one chromosomal location to another
Homologous Genetic Recombination: Base-pairing between two homologous DNA molecules
• Characteristic of meiosis
• Two homologous DNA molecules from different chromosomes cross over: the DNA helices break and the ends join to their opposite partners to re-form two intact helices
• Each of these helices is composed of parts of both original DNA molecules
• The site of crossing over can be anywhere in the homologous nucleotide sequences of the two DNA molecules
Homologous Genetic Recombination: Base-pairing between two homologous DNA molecules
• This type of recombination occurs only when a long region of nucleotide sequence matches between the two DNA molecules
• The pairing step at which the exchange begins is called DNA synapsis
Question: how do the two DNA molecules recognize where to start crossing over?
Homologous Genetic Recombination: Meiotic Recombination by dsDNA breaks
• The break in the phosphodiester backbone exposes single strands that can base-pair with the other DNA helix, forming a synapsis
• These strands are thought to search for base pairing on another DNA helix with a matching (homologous) sequence
• This leads to the formation of a joint between the maternal and paternal chromosomes
Homologous Genetic Recombination: Meiotic Recombination by dsDNA breaks
Question: how does a single strand find its homologous double-stranded DNA molecule to begin DNA synapsis?
Homologous Genetic Recombination: DNA hybridization reactions model
• A DNA double helix can re-form from separated single strands. This is called DNA renaturation or hybridization
• Once pairing has started, the helix zips up quickly so that base pairing is maximized
• Annealing requires the DNA strands to be in an unfolded form
• Sometimes an ssDNA strand folds back on itself and base-pairs to form a short hairpin
• Such hairpins block pairing with a partner strand, which is why single-strand binding protein is required
Homologous Genetic Recombination: RecA and its Homologs
• RecA has more than one DNA-binding site and catalyzes a multistep synapsis reaction
• First, the homology between the ssDNA and a region of the dsDNA is identified through transient base pairing
• Once synapsis has started, a short heteroduplex region is extended over longer distances by a process called branch migration
• Branch migration can take place at any point where two single DNA strands with the same sequence are attempting to pair with the same complementary strand
• RecA is a DNA-dependent ATPase with an ATP-hydrolyzing site
• RecA binds DNA more tightly with ATP bound than with ADP bound
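The homology search described above can be pictured as sliding the invading single strand along the duplex and counting mismatches at each position. This is a purely illustrative string model: the function name and the mismatch tolerance are assumptions, and RecA does not literally scan one register at a time.

```python
def homologous_sites(invading, duplex_strand, max_mismatches=0):
    """Return start positions where `invading` matches the like strand of the duplex.

    `duplex_strand` is the duplex strand with the same polarity and (near-)identical
    sequence, i.e. the strand that would be displaced during strand exchange.
    """
    n = len(invading)
    hits = []
    for i in range(len(duplex_strand) - n + 1):
        window = duplex_strand[i:i + n]
        mismatches = sum(a != b for a, b in zip(invading, window))
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits

print(homologous_sites("GATTACA", "CCGATTACAGG"))                     # exact homology
print(homologous_sites("GATTGCA", "CCGATTACAGG"))                     # one mismatch: rejected
print(homologous_sites("GATTGCA", "CCGATTACAGG", max_mismatches=1))   # tolerated if allowed
```

A strict mismatch limit plays the same role in this toy model as the mismatch proofreading that blocks recombination between poorly matched sequences.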
Homologous Genetic Recombination: RecA and its Homologs
• New RecA molecules with ATP bound are continuously added to one end of the RecA filament, while ATP is hydrolyzed to ADP within the filament
• RecA filaments on DNA therefore share some of the dynamics of the cytoskeletal filaments formed by actin or tubulin
Homologous Genetic Recombination: Holliday Junction
• In a Holliday junction, two of the four strands of the two DNA helices cross between the helices and form base pairs
• A Holliday junction can adopt an open, symmetrical structure
• Further isomerization can interconvert the crossing and noncrossing strands, producing a structure that is otherwise the same as the starting one
• Sets of proteins bind the Holliday junction and use ATP hydrolysis to move it along the DNA
Homologous Genetic Recombination: Gene Conversion
Gene Conversion:
• DNA sequence information is transferred from one DNA helix, which remains unchanged, to another DNA helix, whose sequence is altered
• It can occur through a homologous recombination process that juxtaposes two homologous DNA double helices
OR
• through a limited amount of DNA synthesis that copies the new allele by base pairing
Homologous Genetic Recombination: Gene Conversion
• In the simplest case, a heteroduplex joint forms in which the two strands have slightly different nucleotide sequences and so contain mismatches
• The mismatched nucleotides are removed by the DNA repair machinery, which creates an extra copy of the DNA sequence on the opposite strand
• The same gene conversion process can occur without crossover
Homologous Genetic Recombination: Outcomes in Meiosis and Mitosis
• In either outcome of general recombination:
1- The DNA synthesis involved converts some of the genetic information at the site of the double-strand break to that of the homologous chromosome
2- If these regions represent different alleles of the same gene,
3- then the nucleotide sequence in the broken helix is converted to that of the unbroken helix, resulting in gene conversion
Homologous Genetic Recombination: Prevention of Promiscuous Recombination between two poorly matched DNA sequences
• The mismatch proofreading system normally recognizes mispaired bases in an initial strand exchange and prevents the exchange from going further
• This type of mechanism protects bacteria and other cells from invading foreign DNA that could otherwise base-pair and recombine with the host DNA
Site-Specific Recombination
• It can alter gene order and also add new information to the genome
• In this type of recombination, genes change position within one chromosome or move to another chromosome
• Specialized nucleotide sequences move between nonhomologous sites within a genome. These movable sequences are called mobile genetic elements
• These elements range in size from a few hundred to tens of thousands of nucleotide pairs
• These elements are present in all cells, from E. coli to humans
Site-Specific Recombination
• This type of recombination is important because it produces many of the genetic variants on which evolution depends
Reason:
Mobile elements can also alter neighboring host DNA sequences by carrying them to another site in the genome
Site-Specific Recombination: Transpositional or Conservative
• Transpositional recombination:
1- Like all site-specific recombination, it requires specialized enzymes and specific DNA sites
2- It does not involve the formation of heteroduplex DNA
3- It involves breakage at the ends of the mobile DNA segment and attachment of those ends at one of many different nonhomologous target DNA sites
• Conservative site-specific recombination:
1- It involves the formation of a short heteroduplex joint
2- It therefore requires a short DNA sequence that is the same in the donor and recipient DNA molecules
DNA Transposons: mobile genetic elements capable of inserting into many DNA sequences
• Transposons are movable genetic elements capable of inserting themselves into many DNA sites
• The transposase enzyme is encoded by the transposon itself
Mechanism of action: the enzyme first cuts the transposon out of the DNA and then inserts it into a new DNA site. Homology between the target site and the ends of the element is not required
• They usually move only very rarely, which makes them difficult to detect
• They can be divided into three classes
1- DNA-only transposons
2- Retroviral-like retrotransposons
3- Nonretroviral retrotransposons
DNA Transposons
Functions and specificities of transposons
1- DNA-only transposons
The mobile element is cut out of the donor DNA and joined to the target DNA by transposase. It therefore exists as DNA throughout its life cycle
2- Retroviral-like retrotransposons
They do not move directly; RNA polymerase first transcribes the element into RNA. Reverse transcriptase then synthesizes DNA using this RNA as a template, and the DNA copy is inserted into a new site in the target DNA by the enzyme integrase.
3- Nonretroviral retrotransposons
They also require reverse transcriptase to convert RNA into DNA, but here the RNA is directly involved in the transposition reaction
DNA-only Transposons
Transpositional site-specific recombination: how viruses integrate into the host DNA
Retroviral-like Retrotransposons Resemble Retroviruses, but Lack a
Protein Coat
Nonretroviral Retrotransposons
• A large fraction of human DNA consists of repeats of the L1 element, a LINE (long interspersed nuclear element)
• Although most copies of the L1 element are immobile, a few retain the ability to move
• Movement of these elements sometimes causes disease; for example, one form of hemophilia results from an L1 insertion into the gene encoding blood clotting factor VIII
• Nonretroviral retrotransposons are also found in yeast mitochondria, mammals and insects
• They move with the help of an endonuclease and reverse transcriptase complex
• Other DNA repeats that lack endonuclease or reverse transcriptase genes of their own use the cell’s enzymes, including those encoded by L1 elements
• For example, Alu elements lack endonuclease and reverse transcriptase genes, yet they have amplified to become a major part of the human genome
• The L1 and Alu-like sequences of humans and mice are closely related, but their positions in the mouse genome differ from those in the human genome
Reversible rearrangement of DNA = Conservative Site-specific recombination
Conservative Site-specific recombination: An example
• Bacteriophage lambda infects a bacterium and synthesizes a virus-encoded enzyme, integrase
• The viral DNA is covalently joined to the bacterial (host) chromosome, becomes part of the host DNA and is replicated with it
• Integrase does this by recognizing a special attachment site on the host chromosome and a related site on the viral DNA, where the two DNA molecules are joined
Conservative Site-specific recombination: An example
• The same type of mechanism can run in reverse to excise the viral DNA
• In response to specific signals from the cell, the lambda DNA leaves its site on the chromosome and multiplies rapidly in the bacterium
• Excision is catalyzed by integrase together with excisionase
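Integration and excision can be modeled as reversible string operations on a shared core sequence. Everything specific here is an assumption for illustration: `CORE` is a made-up toy sequence rather than the real lambda attachment core, the phage genome is represented as a circular string, and the function names are hypothetical.

```python
CORE = "TTTATAC"  # hypothetical toy core shared by the phage and host attachment sites

def integrate(host, phage_circle):
    """Insert a circular phage genome into the host at the shared core sequence."""
    h = host.index(CORE)
    p = phage_circle.index(CORE)
    linear = phage_circle[p:] + phage_circle[:p]   # open the circle at its core
    # The prophage ends up flanked by one copy of the core on each side.
    return host[:h] + linear + CORE + host[h + len(CORE):]

def excise(lysogen):
    """Reverse the reaction: recombine the two core copies and release the phage circle."""
    first = lysogen.index(CORE)
    second = lysogen.index(CORE, first + len(CORE))
    phage_circle = lysogen[first:second]
    host = lysogen[:first] + lysogen[second:]
    return host, phage_circle

host = "AAAA" + CORE + "GGGG"
phage = "CACA" + CORE + "GTGT"
lysogen = integrate(host, phage)
print(lysogen)
print(excise(lysogen))
```

No sequence is gained or lost in either direction, which is what "conservative" means here: excision restores the original host sequence and a phage circle equivalent to the starting one.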
The life cycle of bacteriophage Lambda
Conservative Site-Specific Recombination Can be Used to Turn Genes On or Off
RNA Transcription
• RNA polymerase binds to bacterial DNA at a specific region known as the promoter
• Using its σ factor, the polymerase recognizes this DNA sequence by making specific contacts with the portions of the bases that are exposed on the outside of the helix
• After RNA polymerase has bound tightly to the promoter DNA in this way, it opens up the double helix to expose a short stretch of nucleotides on each strand
• Unlike a DNA helicase reaction, this opening does not require ATP hydrolysis; the polymerase and the DNA both undergo structural changes to a more energetically favorable state.
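Once the helix is open, one of the exposed strands serves as the template for RNA synthesis. A minimal sketch of that copying step follows; the function name is hypothetical, and the template is assumed to be written 3'→5' so that the RNA comes out 5'→3'.

```python
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Return the RNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(RNA_PAIR[base] for base in template_3to5)

print(transcribe("TACGGT"))
```

The RNA has the same sequence as the non-template (coding) DNA strand, with U in place of T.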
RNA polymerases in Eukaryotic cells
• RNA polymerase I transcribes the 5.8S, 18S and 28S rRNA genes
• RNA polymerase II transcribes all protein-coding genes, plus snoRNA genes and some snRNA genes
• RNA polymerase III transcribes tRNA genes, 5S rRNA genes, some snRNA genes and genes for other small RNAs
Transcription In Eukaryotes: RNA polymerase II
Initiation of the RNA transcription by RNA polymerase II
• The general transcription factors assemble on the promoter region, where they help position RNA polymerase II
• To start the transcription initiation complex, TFIID binds the promoter and causes a large distortion in the DNA, which serves as a landmark for the other factors to join and assemble at the promoter
• TFIIH has kinase activity as well as helicase activity
• The transition from initiation to elongation is triggered by TFIIH, which adds phosphate groups to the tail of RNA polymerase, known as the CTD (C-terminal domain)
RNA processing (Post transcriptional modification)
RNA Capping
• The 5’ end of the new RNA molecule is modified by the addition of a cap that contains a modified guanine nucleotide
• The capping reaction is catalyzed by
1- a phosphatase, which removes a phosphate from the 5’ end of the RNA
2- a guanyl transferase, which adds GMP in a reverse linkage (5’ to 5’ instead of 5’ to 3’)
3- a methyl transferase, which adds a methyl group to the guanosine
5’ Capping of the mRNA
RNA factory
Removal of Introns from the newly synthesized Pre-mRNA
• Eukaryotic genes are split into coding sequences, known as expressed sequences or exons, and intervening sequences, or introns
• Both are transcribed into RNA
• Two sequential phosphoryl-transfer reactions (transesterifications) join the exons while removing the intron as a “lariat”
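At the sequence level, the outcome of splicing is simply the exons joined in order. The sketch below shows this with string slicing; the function name and the toy pre-mRNA are hypothetical, and the check that each intron starts with GU and ends with AG is an added assumption reflecting the most common splice-site dinucleotides, not something stated on the slide.

```python
def splice(pre_mrna, introns):
    """Join exons by removing introns given as (start, end) half-open index pairs."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Assumed rule: introns begin with GU and end with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)

pre = "AUGGCC" + "GUAAGUCCUAACAG" + "UUUUAA"   # exon 1 + intron + exon 2 (toy example)
print(splice(pre, [(6, 20)]))
```

This model ignores the lariat intermediate and the branch point; it only captures the bookkeeping of which nucleotides remain in the mature mRNA.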
RNA Splicing Machinery
• It consists of 5 additional RNA molecules and more than 50 proteins, and hydrolyzes many ATP molecules per splicing event
• Splicing must be highly accurate; any mistake could harm or kill the cell
• Importance of introns: they make it possible to produce new types of proteins and facilitate genetic recombination that combines exons of different genes, or of the same gene
Splicing positions on Introns
• Introns range from about 10 nt to more than 100,000 nt in length
• Picking out the precise borders of an intron is not easy
• Splicing signals lie at 3 positions
1- 5’ splice site
2- 3’ splice site
3- Branch point within the intron, which forms the base of the excised lariat
• Each of these positions has a similar consensus nucleotide sequence that marks where splicing occurs
• Even so, identifying every splice site correctly remains difficult
Spliceosome
• The key steps of splicing are performed by RNA molecules rather than proteins
• These are short RNA molecules (fewer than 200 nt) named U1, U2, U4, U5 and U6
• They are known as snRNAs (small nuclear RNAs)
• Each snRNA is complexed with at least 7 protein subunits to form an snRNP (small nuclear ribonucleoprotein)
• The snRNPs form the core of the spliceosome
mRNA splicing mechanism
RNA-RNA rearrangements
Proper splice sites in Pre-mRNA
• Soon after transcription begins and the 5’ cap forms, several components of the spliceosome carried on the phosphorylated tail of RNA polymerase are transferred to the RNA
1- Co-transcriptional splicing helps keep track of introns and exons
• The snRNPs at a 5’ splice site are initially presented with only one 3’ splice site, because sites further downstream have not yet been synthesized
• This helps to prevent inappropriate exon skipping
2- Exon definition hypothesis
• As RNA synthesis proceeds, SR proteins, which act as components of the spliceosome, assemble on exon sequences
• They mark each 3’ and 5’ splice site, starting from the 5’ end of the RNA
• This involves U1 snRNA, which marks one exon boundary, and U2AF, which helps to specify the other
A second set of snRNPs splices a small fraction of intron sequences
• A small set of snRNPs directs a second type of spliceosome that recognizes only a specific set of RNA sequences: the AT-AC spliceosome
• Another variation of the splicing mechanism exists, called trans-splicing
• Trypanosomes produce all their mRNAs in this way, whereas only a few nematode mRNAs are produced by trans-splicing
RNA splicing and Plasticity
• When a mutation occurs in a nucleotide sequence critical for splicing of a particular intron, normal splicing of that intron is disrupted, but splicing is usually not abolished
• In the simplest case an exon is skipped
• Otherwise a new pattern of splicing comes into action: the mutation can cause a cryptic splice junction to be used
• The splicing machinery picks out the best available pattern of splice junctions; if the optimal one is damaged by mutation, it seeks out the next best pattern
RNA-Processing Enzymes Generate the 3’ end of Eucaryotic mRNAs
• As RNA polymerase continues along the gene, the spliceosome assembles on the RNA and cuts at the intron-exon boundaries
• The long C-terminal tail of RNA polymerase helps ensure that all the components needed for splicing are present on the RNA
• CstF (cleavage stimulation factor) and CPSF (cleavage and polyadenylation specificity factor) travel with RNA polymerase II and are transferred to the 3’ end processing sequence on the RNA molecule as it emerges from the enzyme
Polyadenylation
RNA-Processing Enzymes Generate the 3’ end of Eucaryotic mRNAs
• Subunits of CPSF are associated with the transcription factor TFIID
• During transcription initiation these subunits move onto the RNA polymerase tail
• Once the RNA molecule emerges from the polymerase, additional proteins assemble with CstF and CPSF to create the 3’ end of the mRNA. First, the RNA is cleaved
• Next, an enzyme called poly-A polymerase adds approximately 200 A nucleotides, one at a time, to the 3’ end produced by the cleavage
• The nucleotide precursor for these additions is ATP, and the same type of 5’-to-3’ bonds are formed as in conventional RNA synthesis
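The two steps, cleavage followed by addition of about 200 A residues, can be sketched as a string operation. The details are assumptions for illustration: the function name is hypothetical, the AAUAAA hexamer is used as the recognition signal (a well-known consensus that the slides do not mention), and the fixed downstream cleavage offset is a simplification.

```python
def process_3_end(transcript, signal="AAUAAA", offset=20, tail_length=200):
    """Cleave `offset` nt downstream of the signal, then add a poly-A tail.

    Returns the transcript unchanged if the signal is absent.
    """
    pos = transcript.find(signal)
    if pos == -1:
        return transcript
    cut = min(pos + len(signal) + offset, len(transcript))
    return transcript[:cut] + "A" * tail_length

rna = "GGG" + "AAUAAA" + "C" * 30
print(process_3_end(rna, offset=10, tail_length=5))
print(len(process_3_end(rna)))
```

The tail is not encoded in the gene; it is added after cleavage, which is why the function appends it rather than looking for it in the transcript.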
RNA-Processing Enzymes Generate the 3’ end of Eucaryotic mRNAs
• Two theories for the loss of RNA polymerase processivity
1- Transfer of the 3’ end processing factors from RNA polymerase to the RNA causes a conformational change in the polymerase
2- The lack of a 5’ cap on the RNA that emerges from the polymerase after cleavage might signal the enzyme to terminate transcription
Transport of Eukaryote mRNA from the Nucleus
Nuclear Pore Complex
1- They are aqueous channels in the nuclear membrane that directly connect the nucleoplasm and cytosol
2- Small molecules (< 50,000 daltons) can diffuse freely through them
3- Macromolecules need appropriate signals to be imported (e.g. polymerases) or exported (e.g. mRNA)
Export-ready mRNA molecule
4- Only useful RNA is exported through the pores
5- hnRNPs (heterogeneous nuclear ribonucleoproteins, about 30 of them in humans)
are abundant on the pre-mRNA
6- Some of them help remove hairpin helices from the RNA
Transport of Eukaryote mRNA from the Nucleus
7- Besides histones, hnRNP proteins are the most abundant proteins in the cell nucleus
8- These proteins help to distinguish between mature and unprocessed mRNA
Noncoding RNAs synthesis: rRNA
• A few percent of the dry weight of a cell is RNA
• The most abundant RNA in the cell is rRNA, i.e. about
80%
• 3-5% is mRNA
• RNA polymerase I (structurally similar to II)
produces rRNA
• RNA polymerase I has no C-terminal tail, which is why
its transcripts are neither capped nor polyadenylated
• Ribosomal RNAs are final gene products, and a growing
cell must synthesize approx. 10 million copies of
each type of rRNA in each cell generation to
construct its 10 million ribosomes
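To get a feel for the figure of 10 million rRNA copies per cell generation, it can be converted into an average synthesis rate. The sketch below is a rough back-of-the-envelope calculation only; the 24-hour generation time is an assumed value that is not given in the lecture, and the function name is hypothetical.

```python
def required_rrna_rate(copies_per_generation: float,
                       generation_hours: float) -> float:
    """Average number of rRNA copies that must be made per second."""
    seconds_per_generation = generation_hours * 3600
    return copies_per_generation / seconds_per_generation


# 10 million copies comes from the slide; 24 h is an assumption.
rate = required_rrna_rate(10_000_000, generation_hours=24)
print(f"{rate:.1f} copies of each rRNA per second")
```

A rate on this order is far more than one gene transcribed once at a time could supply, which is consistent with the rRNA genes being present in many copies.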
Noncoding RNAs synthesis: rRNA
Eukaryotic rRNAs
• 4 types of rRNA, one copy of each per ribosome
• 3 of the four are 18S, 5.8S and 28S, made from a single large precursor rRNA
• The fourth, 5S rRNA, is synthesized from a separate cluster of
genes by a different polymerase (RNA polymerase III)
• Many chemical modifications occur in the 13,000-nucleotide-long precursor rRNA before the rRNAs
are cleaved out of it and assembled into ribosomes
• Chemical modifications include about 100 methylations
of the 2’-OH positions on nucleotide sugars and about 100
isomerizations of uridine nucleotides to pseudouridine
Noncoding RNAs synthesis: rRNA
Modifications:
• Each modification is made at a specific position in the precursor rRNA
• These positions are specified by several hundred “guide
RNAs”, which position themselves by base-pairing to the precursor
rRNA and thereby bring an RNA-modifying enzyme to the
appropriate position
• Other guide RNAs promote cleavage of the precursor
rRNAs into the mature rRNAs, probably by causing
conformational changes in the precursor rRNA
• They all belong to a large class of RNAs called small nucleolar RNAs
(snoRNAs)
Nucleolus: A Ribosomal-Producing Factory
• The nucleolus is the site of rRNA processing and of rRNA
assembly into ribosomes
• It is a large aggregate of macromolecules, including
the rRNA genes themselves, precursor rRNAs,
mature rRNAs, rRNA-processing enzymes,
snoRNPs, ribosomal protein subunits and partly
assembled ribosomes
• An example: the U6 snRNA is chemically modified by
snoRNAs in the nucleolus before its final assembly
there into the U6 snRNP
• Telomerase and the signal recognition particle are
thought to be assembled in the nucleolus
• Nucleolus size varies with the type of cell
The Nucleus
• Contains Cajal bodies, GEMS (Gemini of coiled bodies) and
speckles (interchromatin granule clusters)
• Cajal bodies and GEMS are frequently paired in the nucleus
• These are the sites where snRNAs and snoRNAs
undergo their final modifications and assemble with
proteins
• Cajal bodies/GEMS are also the sites where snRNPs are
recycled and their RNAs are “reset” after the
rearrangements that occur during splicing of pre-mRNA
The Nucleus
• Interchromatin granule clusters have been proposed
to be stock piles of fully mature snRNPs that are ready
to be used in splicing of pre-mRNA
• GEMS contain the SMN (survival of motor neurons)
protein. Mutations in the gene encoding this
protein can cause spinal muscular atrophy
• In this disease there are subtle defects in snRNP assembly
and in the subsequent splicing of pre-mRNA
• RNA splicing takes place at various locations along the
chromosomes because splicing is co-transcriptional
The Nucleus
• But when the same regions become transcriptionally active, they relocate
towards the interior of the nucleus, which is richer in the components required
for mRNA synthesis
• Mammalian cells express about 15,000 genes, so transcription and RNA splicing
must take place at several thousand sites in the nucleus
The genetic Codon
Translation: Clover Leaf tRNA
Unusual nucleotides found in the tRNA
Amino acid Activation
Aminoacyl-tRNA linkage
Translation of the genetic code
Amino acid incorporation
Hydrolytic RNA Editing
Comparison of the structures of procaryotic and eucaryotic ribosomes
A Ribosome and binding sites
RNA-binding sites in the ribosome
Translating an mRNA molecule
Elongation of a polypeptide chain
Mechanism of Peptidyl transferase activity present in
the large ribosomal subunit
Initiation phase of protein synthesis in
eucaryotes
Final phase of protein synthesis
A polyribosome
The translational frameshift that produces the reverse transcriptase and
integrase of a retrovirus
Inhibitors of Protein or RNA Synthesis
Inhibitor: Specific Effect
Acting Only on Procaryotes*
Tetracycline: blocks binding of aminoacyl-tRNA to A-site of ribosome
Streptomycin: prevents the transition from initiation complex to chain-elongating ribosome and also causes miscoding
Chloramphenicol: blocks the peptidyl transferase reaction on ribosomes
Erythromycin: blocks the translocation reaction on ribosomes
Rifamycin: blocks initiation of RNA chains by binding to RNA polymerase (prevents RNA synthesis)
Acting on Procaryotes and Eucaryotes
Puromycin: causes the premature release of nascent polypeptide chains by its addition to growing chain end
Actinomycin D: binds to DNA and blocks the movement of RNA polymerase (prevents RNA synthesis)
Acting Only on Eucaryotes
Cycloheximide: blocks the translocation reaction on ribosomes
Anisomycin: blocks the peptidyl transferase reaction on ribosomes
α-Amanitin: blocks mRNA synthesis by binding preferentially to RNA polymerase II
* The ribosomes of eucaryotic mitochondria (and chloroplasts) often resemble those of procaryotes in their
sensitivity to inhibitors. Therefore, some of these antibiotics can have a deleterious effect on human mitochondria.
Protein folding
Hsp70 family of molecular chaperones
Hydrophobic regions of a protein: Protein Quality control
Ubiquitin regulated degradation of the proteins
Triggering sister chromatid separation
Retroviruses
Retroviruses are infectious particles consisting of an RNA
genome packaged in a protein capsid, surrounded by a lipid
envelope. This lipid envelope contains polypeptide chains
including receptor binding proteins which link to the
membrane receptors of the host cell, initiating the process of
infection.
Retroviruses contain RNA as the hereditary material in place
of the more common DNA. In addition to RNA, retrovirus
particles also contain the enzyme reverse transcriptase (or
RTase), which causes synthesis of a complementary DNA
molecule (cDNA) using virus RNA as a template.
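What reverse transcriptase does at the sequence level, building a complementary DNA strand on an RNA template, can be shown with a minimal Python sketch. The function name and the toy RNA are hypothetical; primers, template switching and RNase H activity are deliberately left out.

```python
# Base-pairing rules for copying RNA into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA strand, written 5'->3', complementary to an RNA template."""
    # The new strand is antiparallel to the template, so read the RNA backwards.
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


toy_rna = "AUGGCCAUU"          # hypothetical 5'->3' RNA fragment
cdna = reverse_transcribe(toy_rna)
print(cdna)
```

Reversing the template before pairing keeps both strands written in the conventional 5'-to-3' direction.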
When a retrovirus infects a cell, it injects its RNA into the
cytoplasm of that cell along with the reverse transcriptase
enzyme. The cDNA produced from the RNA template
contains the virally derived genetic instructions and allows
infection of the host cell to proceed.
The virus that causes AIDS (acquired immune deficiency
syndrome) is a retrovirus. It is called HIV for human
immunodeficiency virus.
Exogenous
The following genera are included here:
• Genus Alpharetrovirus; type species: Avian leukosis
virus; others include Rous sarcoma virus
• Genus Betaretrovirus; type species: Mouse mammary
tumor virus
• Genus Gammaretrovirus; type species: Murine
leukemia virus; others include Feline Leukemia virus
• Genus Deltaretrovirus; type species: Bovine leukemia
virus; others include the cancer-causing Human T-lymphotropic virus
• Genus Epsilonretrovirus; type species: Walleye
dermal sarcoma virus
• Genus Lentivirus; type species: Human
immunodeficiency virus 1; others include Simian and Feline
immunodeficiency viruses
• Genus Spumavirus; type species: Simian Foamy virus
These were previously divided into three subfamilies
Biogenesis of RTs
Retroviruses can be subdivided into several groups on the basis of genus
Some basic features shared by all retroviruses
• All groups carry the gag, pol and env coding domains
• The complex retroviruses (lentiviruses and deltaretroviruses) carry additional genes encoding
regulatory proteins
• Reverse transcriptase sequences reside within the pol coding domain and have similar activities
in all retroviruses
• However, these reverse transcriptases can differ in structure and subunit composition, molecular weight,
catalytic activity, biochemical and biophysical characteristics, and sensitivity to different inhibitors
• In all retroviruses, the viral pol gene encodes the Pol precursor protein, which contains both the reverse
transcriptase and integrase proteins
• This fused protein is synthesized as part of a larger Gag-Pro-Pol polyprotein, from which the mature
reverse transcriptase and integrase proteins are cleaved out by the viral protease during virus
assembly
• The Gag and Gag-Pro-Pol polyproteins are translated from identical mRNA, but Gag-Pro-Pol
requires either termination suppression or a frameshift by 1 nt. This regulates the ratio of Gag to
Gag-Pro-Pol synthesis
• In some retroviruses, such as lentiviruses, a single 1-nt frameshift places pro and pol in the same reading
frame, which yields equimolar amounts of reverse transcriptase, PR and integrase. Their level is about
20-fold lower than that of the Gag protein.
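A 1-nt frameshift lets the ribosome read past the gag stop codon, and the roughly 20-fold excess of Gag fixes how often the shift must happen. The sketch below illustrates both points. It assumes the shift is a step back by one nucleotide (the slide only says 1 nt) and that "20-fold" means 20 Gag per Gag-Pro-Pol; the toy mRNA and all function names are hypothetical.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def read_with_minus_one_shift(seq: str, slip_index: int) -> list[str]:
    """Read in frame up to slip_index, step back 1 nt, then keep reading."""
    if slip_index % 3 != 0 or not 0 < slip_index < len(seq):
        raise ValueError("slip_index must be an in-frame position inside seq")
    return codons(seq[:slip_index]) + codons(seq[slip_index - 1:])


def frameshift_fraction(gag_per_gag_pro_pol: float) -> float:
    """Fraction of ribosomes that must shift to give the stated ratio."""
    return 1.0 / (1.0 + gag_per_gag_pro_pol)


toy_mrna = "AUGAAAUUUUAGGGC"          # in frame: AUG AAA UUU UAG(stop) GGC
in_frame = codons(toy_mrna)
shifted = read_with_minus_one_shift(toy_mrna, slip_index=9)
print(in_frame, shifted, round(frameshift_fraction(20), 3))
```

In the shifted reading the stop codon UAG is no longer read as a triplet, so translation continues into the downstream frame.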
Some basic features of retroviruses
Alpharetrovirus (ASLV)
• There is a 1-nt shift in the reading frame before the pol gene
• The pro gene is in the same reading frame as gag, so the protease is synthesized as part of the
Gag polyprotein
• As a result, the ratio of viral protease (encoded by the pro gene) to Pol is much higher than in
lentiviruses
Mouse mammary tumor virus (MMTV, betaretrovirus)
• Two consecutive frameshifts take place in the reading frame.
• This type of shifting also occurs in human T-cell leukemia virus-1 (HTLV-1) and bovine leukemia virus
(BLV) (both are deltaretroviruses)
• Here the pro gene, in its own reading frame, is shifted by 1 nt relative to gag, and the pol gene is
shifted by 1 nt relative to the pro gene
• Gammaretroviruses have an entirely different, frameshift-independent
arrangement
• Here the pro and pol genes are in the same reading frame as gag, but are separated from it by a stop
codon
Some basic features of retroviruses
• The synthesis of the Pro and Pol proteins results from in-frame read-through suppression of the
termination codon at the end of the gag gene.
• Frameshifting and read-through have probably evolved as strategies to provide proper ratios of
Gag, Gag-Pro and Gag-Pro-Pol polypeptides in the infected cell
• Spumaviruses (foamy viruses) are very different from other retroviruses: their reverse
transcriptase is generated from a separate spliced mRNA instead of from a Gag-Pro-Pol polypeptide
precursor
• Their infectious particles also contain dsDNA that is similar to full-length cDNA
• In alpharetroviruses such as ASLV, the β subunit is a fused reverse transcriptase-integrase
polypeptide, while the α subunit lacks the integrase sequences but retains the RNase H domain
• Gammaretroviruses and betaretroviruses have a monomeric reverse transcriptase containing
DNA polymerase and RNase H domains
• In some of them the N-terminus contains residues derived from the upstream viral protease
Biogenesis of Reverse Transcriptase
Retroviruses Practicality
Gene therapy
Gammaretroviral and lentiviral vectors for gene therapy have been developed that
mediate stable genetic modification of treated cells by chromosomal integration of the
transferred vector genomes.
Cancer
Retroviruses that cause tumor growth include Rous sarcoma virus and mouse mammary tumor
virus. Cancer can be triggered by proto-oncogenes that were mistakenly incorporated into
proviral DNA or by the disruption of cellular proto-oncogenes.
Retroviruses
• A retrovirus is an RNA virus that is replicated in a host cell via the enzyme reverse transcriptase
to produce DNA from its RNA genome
• The DNA is then incorporated into the host's genome by an integrase enzyme. The virus
thereafter replicates as part of the host cell's DNA
• Retroviruses are enveloped viruses that belong to the viral family Retroviridae
• The virus itself stores its nucleic acid in the form of a +mRNA (including the 5'cap and 3'PolyA
inside the virion) genome and serves as a means of delivery of that genome into cells it targets as
an obligate parasite, and constitutes the infection
• Once in the host's cell, the RNA strands undergo reverse transcription in the cytosol and are
integrated into the host's genome, at which point the retroviral DNA is referred to as a provirus. It
is difficult to detect the virus until it has infected the host
Retroviruses
Structure
• Virions of retroviruses consist of enveloped particles about 100 nm in diameter.
• The virions also contain two identical single-stranded RNA molecules 7-10 kilobases (kb) in
length.
• Although virions of different retroviruses do not have the same morphology or biology, all the
virion components are very similar
The main virion components are:
Envelope: composed of lipids, which are obtained from the host plasma membrane during the
budding process.
RNA: consists of a dimer of RNA. It has a cap at the 5' end and a poly(A) tail at the 3' end.
• The RNA genome also has terminal noncoding regions, which are important in replication, and
internal regions that encode virion proteins for gene expression.
• The 5' end includes four regions, which are R, U5, PBS, and L. The R region is a short repeated
sequence at each end of the genome, used during reverse transcription to ensure correct end-to-end transfer of the growing chain.
The main virion components
RNA:
•U5, on the other hand, is a short unique sequence between R and PBS. PBS (primer binding site)
consists of 18 bases complementary to 3' end of tRNA primer
• L region is an untranslated leader region that gives signal for packaging of genome RNA. The 3'
end includes 3 regions, which are PPT (polypurine tract), U3, and R
• PPT is primer for plus-strand DNA synthesis during reverse transcription
• U3 is a sequence between PPT and R, which has signal that provirus can use in transcription. R is
the terminal repeated sequence at 3' end.
Proteins: consist of gag proteins, protease (PR), pol proteins and env proteins
• Gag proteins are major components of the viral capsid, at about 2000-4000 copies per
virion
• Protease is expressed differently in different viruses. It functions in proteolytic cleavages during
virion maturation to make mature gag and pol proteins
• Pol proteins are responsible for synthesis of viral DNA and integration into host DNA after
infection
• Finally, env proteins play a role in association and entry of the virion into the host cell
Multiplication
When retroviruses have integrated their own genome into the germ line, their genome is passed on
to a following generation
• These endogenous retroviruses (ERVs), contrasted with exogenous ones, now make up 5-8% of
the human genome
• Most insertions have no known function and are often referred to as “junk DNA“
• However, many endogenous retroviruses play important roles in host biology, such as control of
gene transcription, cell fusion during placental development in the course of the germination of an
embryo, and resistance to exogenous retroviral infection
• Endogenous retroviruses have also received special attention in the research of
immunology-related pathologies, such as autoimmune disease like multiple sclerosis, although
endogenous retroviruses have not yet been proven to play any causal role in this class of disease
Multiplication
• While transcription was classically thought to only occur from DNA to RNA, reverse
transcriptase transcribes RNA into DNA
• The term "retro" in retrovirus refers to this reversal (making DNA from RNA) of the
central dogma of molecular biology
• Reverse transcriptase activity outside of retroviruses has been found in almost all
eukaryotes, enabling the generation and insertion of new copies of retrotransposons into the
host genome
• It is important to note that a retrovirus must "bring" its own reverse transcriptase in its
capsid, otherwise it is unable to utilize the enzymes of the infected cell to carry out the task,
due to the unusual nature of producing DNA from RNA.
Multiplication
• Industrial drugs that are designed as protease and reverse transcriptase inhibitors can quickly
be proved ineffective because the gene sequences that code for the protease and the reverse
transcriptase can undergo many substitutions
•These substitutions of nitrogenous bases, which make up the DNA strand, can make either the protease
or the reverse transcriptase difficult to attack
•The amino acid substitutions enable the enzymes to evade the drug regimens because mutations
in the gene sequences can cause physical or chemical changes, which make the enzymes harder for
the drug to recognize
•When the drugs that are supposed to attack enzymes, such as protease, are designed, the manufacturers
target specific sites on the enzyme
•One way to attack these targets can be through hydrolysis of molecular bonds, which means that
the drug will add molecules of H2O (water) to specific bonds. By adding molecules of water at a
site on the virus, the drug breaks the previous bonds that were linked to each other
• If several of these breaks occur, the result can lead to lysis, the death of the virus.
Because reverse transcription lacks the usual proofreading of DNA replication, a retrovirus
mutates very often. This enables the virus to grow resistant to antiviral pharmaceuticals quickly,
and impedes the development of effective vaccines and inhibitors for the retrovirus.
Genes
Retrovirus genomes commonly contain these three open reading frames that encode for proteins
that can be found in the mature virus:
Group specific antigen (gag) codes for core and structural proteins of the virus;
Polymerase (pol) codes for reverse transcriptase, protease and integrase; and,
Envelope (env) codes for the retroviral coat proteins.
Provirus
This DNA can be incorporated into host genome as a provirus that can be passed on to progeny
cells. The provirus DNA is inserted at random into the host genome. Because of this, it can be
inserted into oncogenes. In this way some retroviruses can convert normal cells into cancer
cells. Some provirus remains latent in the cell for a long period of time before it is activated by
the change in cell environment.
Development
Studies of retroviruses led to the first demonstrated synthesis of DNA from RNA templates, a
fundamental mode for transferring genetic material that occurs in both eukaryotes and
prokaryotes. It has been speculated that the RNA to DNA transcription processes used by
retroviruses may have first caused DNA to be used as genetic material. In this model, the RNA
world hypothesis, cellular organisms adopted the more chemically stable DNA when
retroviruses evolved to create DNA from the RNA templates. Retroviruses are proving to be
valuable research tools in molecular biology and have been used successfully in gene delivery
systems.
Endogenous
Endogenous retroviruses are not formally included in this classification system, and are broadly
classified into three classes, on the basis of relatedness to exogenous genera:
Class I are most similar to the gammaretroviruses
Class II are most similar to the betaretroviruses and alpharetroviruses
Class III are most similar to the spumaviruses
Group VI
All members of Group VI use virally encoded reverse transcriptase, an RNA-dependent DNA
polymerase, to produce DNA from the initial virion RNA genome. This DNA is often integrated
into the host genome, as in the case of retroviruses and pseudoviruses, where it is replicated and
transcribed by the host.
Group VI includes:
Family Metaviridae
Family Pseudoviridae
Family Retroviridae - Retroviruses, e.g. HIV
Group VII
Both families in Group VII have DNA genomes contained within the invading virus particles.
The DNA genome is transcribed into both mRNA, for use as a transcript in protein synthesis,
and pre-genomic RNA, for use as the template during genome replication. Virally encoded
reverse transcriptase uses the pre-genomic RNA as a template for the creation of genomic DNA.
Group VII includes:
Family Hepadnaviridae - e.g. Hepatitis B virus
Family Caulimoviridae - e.g. Cauliflower mosaic virus
Treatment
Antiretroviral drugs are medications for the treatment of infection by retroviruses, primarily
HIV. Different classes of antiretroviral drugs act at different stages of the HIV life cycle.
Combination of several (typically three or four) antiretroviral drugs is known as highly active
anti-retroviral therapy (HAART).
Treatment of Veterinary Retroviruses
Feline Leukemia Virus and Feline immunodeficiency virus infections are treated with biologics,
including Lymphocyte T-Cell Immune Modulator (LTCI)
Reverse Transcription
• Reverse transcription is initiated after the association of RT with genomic viral RNA and a primer. The
primer is required for copying the template RNA by all RTs
• In most cases, DNA synthesis is initiated from a transfer RNA (tRNA) primer
supplied by the host cell
• tRNA(Lys3) is used by lentiviruses and betaretroviruses
• tRNA(Pro) is used by gammaretroviruses and tRNA(Trp) is used by alpharetroviruses
Process in class VI viruses
Class VI viruses (ssRNA-RT), also called retroviruses, are RNA reverse-transcribing
viruses with a DNA intermediate. Their genomes consist of
two molecules of positive sense single stranded RNA with a 5’ cap and 3’
polyadenylated tail. Examples of retroviruses include Human
Immunodeficiency Virus (HIV) and Human T-Lymphotropic virus (HTLV).
Creation of double-stranded DNA occurs in the cytosol as a series of
steps
Reverse transcription
Process in class VI viruses
RT in HIV
Foamy virus
HIV Virus
Different types of retroviruses
Post Translational modifications
1- Most of the proteins that are translated from mRNA undergo chemical modifications
before becoming functional in different body cells
2- Expression of proteins is important in diseased conditions. Post translational
modifications play an important part in modifying the end product of expression and
contribute towards biological processes and diseased conditions. The amino terminal
sequences are removed by proteolytic cleavage when the proteins cross the membranes.
These amino terminal sequences target the proteins for transporting them to their actual
point of action in the cell.
Types of Protein Post Translational Modifications
1- Glycosylation: Many proteins, particularly in eukaryotic cells, are modified by the addition of
carbohydrates, a process called glycosylation. Glycosylation in proteins results in addition of a glycosyl
group to either asparagine, hydroxylysine, serine, or threonine.
2- Acetylation: the addition of an acetyl group, usually at the N-terminus of the protein.
3- Alkylation: The addition of an alkyl group (e.g. methyl, ethyl).
4- Methylation: The addition of a methyl group, usually at lysine or arginine residues.
5- Biotinylation: Acylation of conserved lysine residues with a biotin appendage.
6- Glutamylation: Covalent linkage of glutamic acid residues to tubulin and some other proteins.
7- Glycylation: Covalent linkage of one to more than 40 glycine residues to the tubulin C-terminal tail of the
amino acid sequence.
8- Isoprenylation: The addition of an isoprenoid group (e.g. farnesol and geranylgeraniol).
9- Lipoylation: The attachment of a lipoate functionality.
10- Phosphopantetheinylation: The addition of a 4'-phosphopantetheinyl moiety from coenzyme A, as in
fatty acid, polyketide, non-ribosomal peptide and leucine biosynthesis.
11- Phosphorylation: the addition of a phosphate group, usually to serine, tyrosine, threonine or histidine.
12- Sulfation: The addition of a sulfate group to a tyrosine.
13- Selenation
14- C-terminal amidation
Glycosylation
1- Glycosylation is the enzymatic process that links saccharides to produce glycans, attached to
proteins, lipids, or other organic molecules.
2- This enzymatic process produces one of the fundamental biopolymers found in cells (along with
DNA, RNA, and proteins).
3- Glycosylation is a form of co-translational and post-translational modification. Glycans serve a
variety of structural and functional roles in membrane and secreted proteins. The majority of proteins
synthesized in the rough ER undergo glycosylation.
4- It is an enzyme-directed site-specific process, as opposed to the non-enzymatic chemical reaction
of glycation.
Five classes of glycans are produced:
1- N-linked glycans attached to a nitrogen of asparagine or arginine side chains;
2- O-linked glycans attached to the hydroxy oxygen of serine, threonine, tyrosine, hydroxylysine, or
hydroxyproline side chains, or to oxygens on lipids such as ceramide;
3- phospho-glycans linked through the phosphate of a phospho-serine;
4- C-linked glycans, a rare form of glycosylation where a sugar is added to a carbon on a tryptophan
side chain;
5- glypiation, which is the addition of a GPI anchor that links proteins to lipids through glycan linkages.
Importance of Glycosylation
1- Some proteins do not fold correctly unless they are
glycosylated first
2- Polysaccharides linked at the amide nitrogen of
asparagine in the protein confer stability on some
secreted glycoproteins
3- Experiments have shown that glycosylation in this
case is not a strict requirement for proper folding, but
the unglycosylated protein degrades quickly.
4- Glycosylation may play a role in cell-cell adhesion
N-linked Glycosylation
1- N-linked glycosylation is important for the folding of some
eukaryotic proteins. The N-linked glycosylation process occurs
in eukaryotes and widely in archaea, but very rarely in bacteria
2- In eukaryotes, most N-linked oligosaccharides begin with the
addition of a 14-sugar precursor to the asparagine in the
polypeptide chain of the target protein. The structure of this
precursor is common to most eukaryotes, and contains 3
glucose, 9 mannose, and 2 N-acetylglucosamine molecules
N-linked Glycosylation
5- Mature glycoproteins may contain a variety of oligomannose N-linked
oligosaccharides containing between 5 and 9 mannose residues.
6- Glucose linked to the guanidinium group of arginine in sweet corn
amylogenin is the only reported example of N-linked glycosylation on an amino
acid other than asparagine.
O-linked glycosylation and O-N-acetylgalactosamine (O-GalNAc)
1- O-linked glycosylation occurs at a later stage during protein
processing, probably in the Golgi apparatus
2- This is the addition of N-acetyl-galactosamine to serine or
threonine residues by the enzyme UDP-N-acetyl-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase
(EC 2.4.1.41), followed by other carbohydrates (such as
galactose and sialic acid)
3- This process is important for certain types of proteins such
as proteoglycans, which involves the addition of
glycosaminoglycan chains to an initially unglycosylated
proteoglycan core protein
Importance: One function involves secretion to form
components of the extracellular matrix, adhering one cell to
another by interactions between the large sugar complexes of
proteoglycans. The other main function is to act as a
component of mucosal secretions, and it is the high
concentration of carbohydrates that tends to give mucus its
"slimy" feel.
O-fucose
1- O-fucose is added between the second and third conserved cysteines of EGF-like
repeats in the Notch protein.
2- In the case of EGF-like repeats, the O-fucose may be further elongated to a
tetrasaccharide by sequential addition of N-acetylglucosamine (GlcNAc), galactose, and
sialic acid
3- For thrombospondin repeats, the O-fucose may be elongated to a disaccharide by the addition of
glucose.
O-glucose
1- O-glucose is added between the first and second conserved cysteines of EGF-like
repeats in the Notch protein, and possibly other substrates by protein:O-glucosyltransferase
(Poglut).
2- The O-glucose modification appears to be necessary for proper folding of the EGF-like
repeats of the Notch protein, and increases secretion of this receptor.
O-N-acetylglucosamine
Recently, O-GlcNAc was reported to occur between the fifth and sixth conserved cysteines in
some EGF-like repeats from the Notch protein
O-mannose
1- During O-mannosylation, a mannose residue is transferred from mannose-p-dolichol to a
serine/threonine residue in secretory pathway proteins.
2- O-mannosylation is common to both prokaryotes and eukaryotes
Collagen Glycosylation
1- Many lysines in collagen are hydroxylated to form hydroxylysine, and many of these
hydroxylysines are then glycosylated by the addition of galactose.
2- This galactose monosaccharide can then be further elongated by the addition of a
glucose.
3- This glycosylation is required for the proper functioning of collagen. Glycosylation of
hydroxylysine occurs in the ER.
Hydroxyproline Glycosylation
Proline is also hydroxylated in collagen, however, no glycosylation occurs here as the
hydroxyprolines are necessary for hydrogen bonding in the collagen triple helix.
Glycosylation of Glycogenin
Liver and muscle glycogenin carries a glucose on a tyrosine side chain. This is the only
known example of glycosylated tyrosine in nature.
Glycosylation of Ceramide
Either a galactose or a glucose can be added to a hydroxyl on the lipid ceramide. The
glucose can be further elongated to a disaccharide by the addition of a galactose.
Proteoglycans
The large and complex glycans that modify proteoglycans are initiated by addition of
xylose to serine.
C-mannosylation
1- A mannose sugar is added to the first tryptophan residue in the sequence W-X-X-W (W indicates tryptophan, X is any amino acid).
2- Thrombospondins are one of the most commonly modified proteins, however this
form of glycosylation appears elsewhere as well.
3- This is an unusual modification because the sugar is linked to a carbon rather than
a reactive atom like a nitrogen or oxygen.
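The W-X-X-W motif described above is easy to scan for in a protein sequence. The following is a minimal sketch using Python's standard `re` module; the function name and the toy sequence are hypothetical, and a motif match only marks a candidate site, not a confirmed modification.

```python
import re

# Lookahead so that overlapping motifs (e.g. W-X-X-W-X-X-W) are all reported.
WXXW = re.compile(r"(?=W[A-Z]{2}W)")


def candidate_c_mannosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of the first tryptophan of each W-X-X-W motif."""
    return [match.start() + 1 for match in WXXW.finditer(protein.upper())]


toy_protein = "MKWSEWAAWTPWG"   # hypothetical sequence for illustration
print(candidate_c_mannosylation_sites(toy_protein))
```

The reported position is that of the first tryptophan because, per the slide, that is the residue that receives the mannose.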
GPI Anchors (Glypiation)
A special form of glycosylation is the GPI anchor. This form of glycosylation functions
to attach a protein to a hydrophobic lipid anchor, via a glycan chain
O-N-acetylglucosamine (O-GlcNAc)
1- O-GlcNAc is added to serines or threonines by O-GlcNAc transferase.
2- O-GlcNAc appears to occur on most serines and threonines that would otherwise be
phosphorylated by serine/threonine kinases.
3- O-GlcNAc addition and removal also appears to be a key regulator of the pathways that
are disrupted in diabetes mellitus.
4- The gene encoding the O-GlcNAcase enzyme has been linked to non-insulin dependent
diabetes mellitus. It is the terminal step in a nutrient-sensing hexosamine signaling pathway.
Secreted and Membrane-Associated Proteins
1- Proteins that are membrane bound or are destined for secretion are synthesized by
ribosomes associated with the membranes of the endoplasmic reticulum (ER).
2- The ER associated with ribosomes is termed rough ER (RER). This class of proteins
all contain an N-terminus termed a signal sequence or signal peptide.
3- The signal peptide is usually 13-36 predominantly hydrophobic residues.
4- The signal peptide is recognized by a multi-protein complex termed the signal
recognition particle (SRP).
5- However, some proteins that are destined for secretion are also further proteolyzed
following secretion and, therefore contain pro sequences.
• Mechanism of synthesis of membrane bound or secreted proteins
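The description of a signal peptide (an N-terminal stretch of 13-36 predominantly hydrophobic residues that is later removed) can be turned into a simple check. This is a rough sketch under stated assumptions: the hydrophobic residue set, the 0.5 threshold for "predominantly", the toy precursor and all function names are illustrative choices, not part of the lecture, and real prediction tools use far more than composition.

```python
# Assumption: a simple set of hydrophobic residues chosen for illustration.
HYDROPHOBIC = set("AVLIMFW")


def hydrophobic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are in HYDROPHOBIC."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(residue in HYDROPHOBIC for residue in peptide) / len(peptide)


def looks_like_signal_peptide(peptide: str, min_len: int = 13,
                              max_len: int = 36, threshold: float = 0.5) -> bool:
    """True if length is 13-36 and more than half the residues are hydrophobic."""
    return (min_len <= len(peptide) <= max_len
            and hydrophobic_fraction(peptide) > threshold)


def remove_signal_peptide(protein: str, signal_length: int) -> str:
    """Model cleavage of the signal peptide from the N-terminus."""
    return protein[signal_length:]


toy_precursor = "MKLLLVALLLAFLSAAQAEVQLT"   # hypothetical precursor
signal = toy_precursor[:18]
print(looks_like_signal_peptide(signal), remove_signal_peptide(toy_precursor, 18))
```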
Proteolytic Cleavage
1- Most proteins undergo proteolytic cleavage following translation.
2- The simplest form of this is the removal of the initiation methionine.
3- Many proteins are synthesized as inactive precursors that are activated under proper
physiological conditions by limited proteolysis.
4- Pancreatic enzymes and other enzymes involved in clotting are examples of the
latter.
5- Inactive precursor proteins that are activated by removal of polypeptides are termed
proproteins.
6- An example of a preproprotein is insulin. Since insulin is secreted from the pancreas it
has a prepeptide.
7- Following cleavage of the 24 amino acid signal peptide the protein folds into
proinsulin.
8- Proinsulin is further cleaved yielding active insulin which is composed of two peptide
chains linked together through disulfide bonds.
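The two processing steps described above (removal of the 24 amino acid signal peptide, then further cleavage into two chains) can be sketched as simple string operations on a one-letter amino acid sequence. The function names are hypothetical, and the cut positions passed to the second function are placeholders supplied by the caller; apart from the 24-residue signal peptide, no positions are taken from the lecture.

```python
from typing import Tuple


def remove_signal_peptide(preproprotein: str, signal_length: int = 24) -> str:
    """Return the proprotein left after the N-terminal signal peptide is cleaved.

    The default of 24 residues is the signal peptide length quoted in the text.
    """
    if not 0 <= signal_length <= len(preproprotein):
        raise ValueError("signal_length must lie within the sequence")
    return preproprotein[signal_length:]


def excise_internal_segment(proprotein: str, start: int, end: int) -> Tuple[str, str]:
    """Remove residues [start:end) and return the two flanking chains.

    `start` and `end` are 0-based placeholder positions chosen by the caller.
    """
    if not 0 <= start <= end <= len(proprotein):
        raise ValueError("start and end must lie within the sequence")
    return proprotein[:start], proprotein[end:]
```

With a made-up toy sequence such as `"S" * 24 + "BBBB" + "CCC" + "AAA"`, the first call strips the 24 leading residues, and `excise_internal_segment(pro, 4, 7)` then returns the two flanking chains. In the mature hormone these chains are held together by disulfide bonds, which this sketch does not model.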
Acetylation
1- Many proteins are modified at their N-termini following synthesis.
2- In most cases the initiator methionine is removed by hydrolysis and an acetyl group
is added to the new N-terminal amino acid.
3- Acetyl-CoA is the acetyl donor for these reactions.
4- Some proteins have the 14 carbon myristoyl group added to their N-termini.
5- The donor for this modification is myristoyl-CoA.
6- This latter modification allows association of the modified protein with
membranes.
7- Histone acetylation occurs on the surface of the nucleosome core as
part of gene regulation when the histones are acetylated on lysine residues.
8- Acetylation neutralizes the positive charge on the lysine residues of the
histones.
9- This decreases the interaction of the N termini of histones with the negatively
charged phosphate groups of DNA.
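The charge argument above can be made concrete with a small counting sketch. It assumes a deliberately simplified model in which lysine (K) and arginine (R) each carry +1, aspartate (D) and glutamate (E) each carry -1, and every acetylated lysine loses its +1. Histidine, the chain termini, and pH effects are ignored, and the function name is hypothetical.

```python
def side_chain_charge(sequence: str, acetylated_lysines: int = 0) -> int:
    """Approximate net side-chain charge of a peptide at neutral pH.

    Counts K and R as +1 and D and E as -1 (a simplifying assumption).
    Each acetylated lysine no longer contributes its +1.
    """
    seq = sequence.upper()
    lysines = seq.count("K")
    if not 0 <= acetylated_lysines <= lysines:
        raise ValueError("cannot acetylate more lysines than are present")
    positive = (lysines - acetylated_lysines) + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative
```

Under this model, each acetylation lowers the net positive charge of a lysine-rich tail by one, which is the basis for its weaker attraction to the negatively charged DNA backbone.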
Methylation
1- Post-translational methylation of proteins occurs on nitrogens and oxygens: the imidazole ring
of histidine, the guanidino moiety of arginine, the R-group amides of glutamine and asparagine,
and the R-group carboxylates of glutamate and aspartate.
2- The activated methyl donor is S-adenosylmethionine (SAM).
3- The most common methylations are on the ε-amine of lysine residues.
4- Methylation of lysine residues in histones is an important regulator of chromatin
structure and consequently of transcriptional activity.
5- Recent findings indicate that methylation of lysine residues affects gene expression not
only at the level of chromatin, but also by modifying transcription factors.
Phosphorylation
1- Post-translational phosphorylation is one of the most common protein modifications that
occur in animal cells.
2- The vast majority of phosphorylations occur as a mechanism to regulate the biological
activity of a protein and as such are transient.
3- In other words a phosphate (or more than one in many cases) is added and later
removed.
4- Physiologically relevant examples are the phosphorylations that occur in glycogen
synthase and glycogen phosphorylase in hepatocytes in response to glucagon release from
the pancreas.
5- Phosphorylation of glycogen synthase inhibits its activity, whereas the activity of glycogen
phosphorylase is increased. These two events lead to increased hepatic glucose delivery to the blood.
6- In animal cells serine, threonine and tyrosine are the amino acids subject to
phosphorylation.
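Since serine, threonine and tyrosine are the residues subject to phosphorylation in animal cells, a first-pass list of candidate sites in a sequence can be sketched as below. The function name is hypothetical, and the listing is only a set of candidates: whether a given residue is actually phosphorylated depends on the kinase and on the surrounding sequence, which this sketch does not consider.

```python
from typing import List, Tuple


def candidate_phosphosites(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position, residue) for every S, T or Y in `sequence`."""
    return [
        (position, residue)
        for position, residue in enumerate(sequence.upper(), start=1)
        if residue in "STY"
    ]
```

For the made-up peptide `"MASTYK"`, the function reports positions 3 (S), 4 (T) and 5 (Y).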
Sulfation
1- Sulfate modification of proteins occurs at tyrosine residues such as in fibrinogen
and in some secreted proteins (e.g. gastrin).
2- The universal sulfate donor is 3'-phosphoadenosyl-5'-phosphosulfate (PAPS).
3- Since sulfate is added permanently, it is necessary for the biological activity
and is not used as a regulatory modification like tyrosine phosphorylation.
Prenylation
1- Prenylation refers to the addition of the 15 carbon farnesyl group or the 20
carbon geranylgeranyl group to acceptor proteins, both of which are isoprenoid
compounds derived from the cholesterol biosynthetic pathway.
2- The isoprenoid groups are attached to cysteine residues at the carboxy
terminus of proteins in a thioether linkage (C-S-C).
3- Some of the most important proteins whose functions depend upon prenylation
are those that modulate immune responses.
4- These include proteins involved in leukocyte motility, activation, and
proliferation and endothelial cell immune functions.
5- These immune modulatory roles of many prenylated proteins are the
basis for a portion of the anti-inflammatory actions of the statin class of
cholesterol synthesis-inhibiting drugs.
Vitamin C-Dependent Modifications
1- Modifications of proteins that depend upon vitamin C as a cofactor include
proline and lysine hydroxylations and carboxy terminal amidation.
2- The hydroxylating enzymes are identified as prolyl hydroxylase and lysyl
hydroxylase.
3- The most important hydroxylated proteins are the collagens. The donor of the
amide for C-terminal amidation is glycine.
4- Several peptide hormones such as oxytocin and vasopressin have C-terminal amidation.
Vitamin K-Dependent Modifications
1- Vitamin K is a cofactor in the carboxylation of glutamic acid
residues.
2- The result of this type of reaction is the formation of a γ-carboxyglutamate
(gamma-carboxyglutamate), referred to as a gla residue.
3- The formation of gla residues within several proteins of the
blood clotting cascade is critical for their normal function.
4- The presence of gla residues allows the protein to chelate
calcium ions and thereby render an altered conformation and
biological activity to the protein.
5- The coumarin-based anticoagulants, warfarin and
dicumarol, function by inhibiting the carboxylation reaction.
Structure of a gla Residue
Selenoproteins
1- Selenium is a trace element and is found as a component
of several prokaryotic and eukaryotic enzymes that are
involved in redox reactions.
2- The selenium in these selenoproteins is incorporated as a
unique amino acid, selenocysteine, during translation.
3- A particularly important eukaryotic selenoenzyme is
glutathione peroxidase.
4- This enzyme catalyzes the oxidation of glutathione
by hydrogen peroxide (H2O2) and organic hydroperoxides.
Structure of the Selenocysteine Residue
Ubiquitin and Targeted Protein Degradation
1- Proteins are in a continual state of flux, being synthesized and degraded.
2- In addition, when proteins become damaged they must be degraded to prevent
aberrant activities of the defective proteins and/or other proteins associated with those
that have been damaged.
3- Proteins that are to be degraded by the proteasome are first tagged by attachment of
multimers of the 76 amino acid polypeptide ubiquitin.
4- Many proteins involved in cell cycle regulation, control of proliferation and
differentiation, programmed cell death (apoptosis), DNA repair, immune and
inflammatory processes and organelle biogenesis have been discovered to undergo
regulated degradation via the 26S proteasome.
5- Of clinical significance are the recent findings that deregulation of the functions of the
proteasome can contribute to the pathogenesis of various human diseases such as
cancer, myeloproliferative diseases, and neurodegenerative diseases.