Control of Gene Expression
Stem Cells and Differentiation
• Multicellular organisms contain many different cell types: think of
nerve cells, skin cells, liver cells, bone cells, etc.
• Humans have about 220 different cell types, with about 10^14 total cells
in an adult.
• All of these arose from a single cell, the zygote (fertilized egg).
• The zygote is totipotent: it has the ability to become any cell type
found in the adult body or in fetal tissues such as the placenta.
• Cells in the early embryo (inner cell mass of the blastocyst) that can
form any adult cell type are called pluripotent. These are embryonic
stem cells.
• As embryonic development proceeds, cells get channeled into more
and more specific pathways leading to their final cell type.
• Many tissues contain adult stem cells, which are multipotent, meaning
that they can form a range of related cell types, but not all possible cell
types.
• Example: hematopoietic stem cells can become red blood cells or any one
of several white blood cell types.
• As cells move from being totipotent to pluripotent to multipotent
to their final cell type, they are undergoing the process of
differentiation. A cell in its final form is said to be terminally
differentiated.
Conrad Waddington (1957)
imagined the process of
differentiation as a ball rolling
down a hill, going into more and
more specialized valleys,
representing the final cell types.
Hematopoietic Cell Lineages
All Cells in an Organism Have the Same Genes
• In most cases, terminally differentiated cells cannot dedifferentiate or change their cell type.
• It was once thought this was due to a loss of genes.
• We now know that the difference between cell types is
which genes are active and which genes aren’t.
• We know this because it is possible to take the nucleus
from a differentiated cell, inject it into an egg (nucleus
removed) and get a whole functioning organism back.
• Some treatment of the nucleus is necessary, because
differentiated cells have mechanisms to permanently
turn off unnecessary genes.
• It is possible to create induced pluripotent stem cells,
which are adult cells that have been artificially dedifferentiated into pluripotent stem cells. The process
involves activating 4 transcription factors, which is
relatively simple but poses a risk of cancer.
• DNA studies on different cells show the same thing:
there is no loss of DNA or genes as differentiation
proceeds.
Early Experiments Demonstrating Pluripotency
Somatic Cell Nuclear Transfer (SCNT)
Somatic Cell Nuclear Transfer is the cloning of
animals by transferring the nucleus of a
differentiated cell into an unfertilized egg cell
whose own nucleus has been removed. It
has become fairly common, and it works well on
cows, mice, goats, and a variety of other
mammals. At this point in time, human embryos
created by SCNT have not developed past the 8
cell stage, and the technique raises serious ethical
questions.
Dolly the Sheep (1996)
Control of Gene Expression
• Different cell types express different genes. Also, cells respond to
changing environments by changing their pattern of gene expression.
• Control can occur at many levels.
• But, control of transcription is the most important: genes only make
messenger RNA when the gene product is needed.
• We will also look at some control mechanisms beyond transcription.
Prokaryotic Control Mechanisms
The Lac Operon
• The first understanding of gene regulation
came from studies of the lac operon in E. coli by
Jacob and Monod in the early 1960s.
• Lac stands for lactose utilization.
• The lac operon produces beta-galactosidase, the
enzyme that degrades the disaccharide lactose into
glucose and galactose. (lacZ gene)
• The lac operon also produces lactose permease, a
protein that transports lactose into the cell (lacY
gene), and acetyltransferase, whose actual function
isn’t clear (lacA gene).
• The lac operon transcribes all three of these genes
on a single polycistronic messenger RNA.
• When lactose is present, the lac operon is
transcribed, and beta-galactosidase is made.
• When lactose is absent, the operon is not
transcribed; no beta-galactosidase is made.
Control of the lac operon
• The primary control factor is a protein, the lac
repressor.
• The lac repressor is made by a separate gene, lacI. The
lacI gene is expressed constitutively: it is transcribed
at a low rate all the time. Thus, the repressor is always
present in the cell.
• The lac repressor binds tightly to the lac operator, a
region of DNA near the promoter. The repressor bound
to the DNA prevents binding of RNA polymerase to the
promoter, and thus prevents transcription.
• However, when lactose is present, some of it gets
altered to form allolactose, which is the inducer of the
lac operon. Allolactose then binds to the lac repressor,
changes its conformation, and causes it to fall off the
operator DNA. Then RNA polymerase binds to the
promoter and transcription occurs.
• This is an example of negative regulation: when the
regulatory protein (lac repressor) is bound to the DNA,
gene expression is repressed. Repressor proteins turn
genes off when they are bound to control sequences
near the gene.
Positive Regulation of the lac operon
• E. coli prefers glucose to any other food. So, when glucose is present,
the lac operon is turned off, even if lactose is also present.
• The cell monitors its level of glucose. When the glucose
concentration is low, adenylate cyclase is activated. This enzyme
converts ATP to cyclic AMP (cAMP).
• Low glucose = high cAMP; high glucose = low cAMP.
• When the catabolite activator protein (CAP) has cAMP bound, it
binds to a site just upstream from the lac operon promoter.
• CAP helps RNA polymerase bind to the promoter: without CAP, RNA
polymerase binds only weakly.
• So, RNA polymerase binds to the promoter and transcribes the lac
operon only when: glucose levels are low AND lactose is present.
• This is an example of positive regulation: the operon is transcribed
when the regulatory protein (CAP) is bound to the DNA. Activator
proteins turn genes on when bound to control sequences near the
gene.
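The negative and positive controls described above combine into a simple logic rule, which can be sketched as a toy Boolean model. The function name and the all-or-nothing treatment of each signal are assumptions made for illustration; real regulation is graded rather than strictly on or off.

```python
def lac_operon_state(lactose_present: bool, glucose_present: bool) -> dict:
    """Toy Boolean model of lac operon control (illustrative only)."""
    # Negative regulation: allolactose, made from lactose, pulls the
    # repressor off the operator.
    repressor_on_operator = not lactose_present
    # Positive regulation: low glucose -> high cAMP -> CAP-cAMP binds
    # just upstream of the promoter.
    cap_bound = not glucose_present
    # Transcription needs CAP bound AND the operator free of repressor.
    transcribed = cap_bound and not repressor_on_operator
    return {
        "repressor_on_operator": repressor_on_operator,
        "cap_bound": cap_bound,
        "transcribed": transcribed,
    }


# Print the full truth table.
for lactose in (False, True):
    for glucose in (False, True):
        state = lac_operon_state(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} "
              f"transcribed={state['transcribed']}")
```

Only the combination "lactose present, glucose absent" gives transcription, matching the rule stated in the bullets.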
Summary of lac operon regulation
• lacI makes the lac repressor
protein. lacI is adjacent to the lac
operon, but not part of it. And, it
doesn’t need to be adjacent,
since the lac repressor diffuses
freely to all parts of the cell.
• We say that the repressor acts in
trans: it can bind to any lac operator
sequence anywhere in the cell.
• The lac operator acts in cis only: it
only affects the lac operon it is
physically connected to.
• CAP is made by a gene elsewhere
in the genome. CAP acts in trans.
The trp Operon
• The trp operon in E. coli makes enzymes for
the biosynthesis of tryptophan (one letter
symbol = W). When the concentration of
tryptophan is low, the operon is turned on,
and it is off when the concentration of
tryptophan is high.
• The first level of control is a repressor protein
that binds to an operator DNA sequence to
prevent RNA polymerase from binding.
• Same as in lac operon.
• However, tryptophan acts as a co-repressor,
not an inducer. That is, the repressor protein
only binds to the operator if tryptophan is also
bound to the repressor.
• The repressor is coded for by the trpR gene,
which is constitutively expressed and located
away from the trp operon. The repressor
protein acts in trans: it can bind to any trp
operator sequence in the cell.
mRNA Attenuation at the trp Operon
• The trp operon has a second level of control that is based on the level
of tryptophan tRNA that is charged with the amino acid. When the
concentration of charged tRNAtrp is high, a stem-loop transcription
termination signal is formed in the 5’ leader region of the mRNA, and
transcription terminates without transcribing the genes.
• The trp operon leader sequence is translated into a short peptide that
contains multiple tryptophan codons.
• The leader also contains 4 regions that can form stem-loops.
• If regions 3 and 4 pair, a stem-loop forms that is immediately followed by 7
U’s: this is a termination signal.
• If regions 2 and 3 pair, they form a stem-loop followed by G’s and C’s,
which allows transcription to continue.
• Region 1 contains several tryptophan codons.
• If there is plenty of charged tRNAtrp the ribosome translates these codons quickly,
which physically blocks region 2 from forming a stem-loop. In this case, regions 3
and 4 have a chance to get together and form the terminator stem-loop.
• If the concentration of tRNAtrp is low, the ribosome hesitates at the tryptophan
codons, waiting for the proper tRNA to appear. In this case, region 2 is free, which
allows it to form the 2-3 stem-loop. Since the 3-4 stem-loop terminator can’t form,
transcription of the operon proceeds.
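The attenuation logic, together with the repressor control from the previous slide, can be written out as a small decision sketch. The function names and the reduction of each concentration to a high/low flag are assumptions for illustration.

```python
def trp_attenuation(charged_trna_trp_high: bool) -> dict:
    """Which leader stem-loop forms, and whether transcription continues."""
    if charged_trna_trp_high:
        # The ribosome moves quickly through region 1 and covers region 2,
        # so regions 3 and 4 pair and form the terminator.
        stem_loop = "3-4"
    else:
        # The ribosome stalls at the tryptophan codons in region 1,
        # leaving region 2 free to pair with region 3.
        stem_loop = "2-3"
    return {
        "stem_loop": stem_loop,
        "transcription_continues": stem_loop == "2-3",
    }


def trp_operon_transcribed(tryptophan_high: bool,
                           charged_trna_trp_high: bool) -> bool:
    """Combine both levels of control described for the trp operon."""
    # Level 1: the repressor binds the operator only when tryptophan
    # (the co-repressor) is bound to it.
    repressor_bound = tryptophan_high
    # Level 2: attenuation in the leader region.
    passes_leader = trp_attenuation(charged_trna_trp_high)["transcription_continues"]
    return (not repressor_bound) and passes_leader


print(trp_attenuation(True))
print(trp_attenuation(False))
print(trp_operon_transcribed(tryptophan_high=False, charged_trna_trp_high=False))
```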
Riboswitches
• One important theme in modern molecular biology
is how important and active RNA molecules are. In
this case, RNA can change conformation when it
binds a ligand, without any protein involvement.
• A riboswitch is an RNA sequence in the 5’ leader
portion of a messenger RNA that controls gene
expression, depending on whether a ligand is bound.
• The ligand-binding portion of the RNA is called the
aptamer.
• Two control mechanisms (others also exist):
• In one mechanism, the RNA forms a terminator
sequence (a stem-loop followed by a series of U’s),
thus terminating transcription before the genes are
transcribed.
• In the other, the ribosome binding site is
sequestered in a stem-loop, making it unavailable to
the ribosome, thus preventing translation.
• In both examples shown here, when the concentration of ligand is high, the operon
is turned off.
Two different riboswitches that respond to cobalamin
(vitamin B12). (A) A terminator is formed when B12
binds to the aptamer. Note the positions of the green
and blue sequences in the two conformations. (B) the
ribosome binding site (RBS) is unavailable to the
ribosome when B12 binds to the aptamer.
Eukaryotic Control Mechanisms
Eukaryotic Gene Regulation
• The regulation of eukaryotic genes shares many characteristics with
prokaryotic genes:
• Control sequences are found near the protein-coding genes
• Regulatory proteins bind to the control sequence to activate transcription
• Important differences:
• Eukaryotic DNA is wrapped around histones, and organization of chromatin
influences gene activity
• Many DNA sequences that affect transcription of specific genes are found
many kilobases away from the gene
• Messenger RNA needs to be transported out of the nucleus for translation in
the cytoplasm
Transcriptional Control
• Proteins that bind to DNA regulatory
sequences and affect transcription are called
transcription factors.
• Act in trans: they can affect any gene on any
chromosome in the same nucleus that has a
matching binding site.
• Proteins are translated in the cytoplasm and
migrate back into the nucleus to function.
• DNA regulatory sequences that are adjacent to
the gene are called cis-acting sequences.
• Act in cis: they only affect the gene they are
attached to (and not other copies of the
gene in the cell).
• Classifying transcription factors:
• general transcription factors: involved in all
transcription complexes,
• tissue-specific transcription factors: only
used in certain tissues or with certain
external stimuli.
Cis vs. trans
Cis-Acting DNA Sequences
• The most important DNA regulatory sequence is the promoter, the place where RNA polymerase binds and starts
transcription.
• There is no one single defined promoter sequence. Each gene has a different promoter sequence, with various
conserved elements.
• Five short sequences are conserved in eukaryotic promoters, but not all are found with all genes. All are located in
defined positions close to the transcription start point, with some upstream and some downstream of it.
• The best known is the TATA box, located about 25 bp upstream from the transcription initiation point. Like all these
elements, the TATA box is a consensus sequence: there are many slightly varying sequences that work as TATA boxes.
However, the TATA box is not present in all genes.
The TATA box consensus sequence
is shown in pink, with the
percentage of human TATA boxes
having each nucleotide in the
consensus shown below.
Tissue-specific Transcription Factors
• Tissue-specific transcription factors activate transcription in
specific cell types, or in response to specific signals.
• They bind to short DNA sequences that are near the promoter.
• It was once thought that transcription factor binding sites were always
upstream from the promoter, but it is now known they can be
either upstream or downstream from the promoter (but near it).
• They consist of short consensus sequences: 6-10 bp long, often 2
in a row, which allows a dimeric transcription factor to bind.
• Transcription factors can activate many different genes.
• Transcription factors are often activated by phosphorylation.
Position of 3 transcription
factor binding sites relative to
the transcription start.
Enhancers and Silencers
• Enhancers and silencers are tissue-specific cis-acting DNA sequences that increase or
decrease transcription regardless of their position (within limits, but can be several
megabases away) or orientation: they can be either 5’ or 3’ to the gene itself.
• Transcription factors bind to the enhancers and silencers. The DNA bends to bring the
transcription factors into contact with the RNA polymerase at the promoter.
Digression into Early Drosophila Development
• Drosophila eggs are formed by the mother before
fertilization. The eggs contain many messenger RNAs
that are translated only after fertilization. The genes
that make these messenger RNAs are called maternal
effect genes: the mother’s genotype determines the
offspring’s phenotype.
• After fertilization, the zygote nucleus divides many
times, until about 6000 nuclei are present. The
nuclei then migrate to the surface of the egg. Up to
this point, there are no cell membranes separating
the nuclei: they form a syncytium (a single cell with
multiple nuclei). Once the nuclei have reached the
egg surface, cell membranes form. The embryo is in
the blastoderm stage.
• During this stage, the blastoderm cells are assigned
to a particular body segment. The process in which a
cell’s fate is decided is called determination. The
actual segments take a while to develop after the
moment of determination.
Genes Determining the Drosophila Body Plan
• Several groups of genes are needed
to create the segmentation pattern.
• First, the maternal effect genes form
morphogen gradients that establish
the basic anterior-posterior (i.e.
head-tail) axis.
• Then gap genes divide the embryo
into several different regions
• Pair rule genes establish pairs of
segments. Since their expression
overlaps, together these genes
determine the identity of each
segment.
• Finally, segment polarity genes
create an anterior-posterior gradient
within each segment.
Morphogen Gradients
• The basic idea behind a morphogen gradient is that
cells at one end of an embryo secrete a
morphogen, a compound that diffuses throughout
the embryo, with a high concentration at one end
and a low concentration at the other end. The cells
at different points along the gradient respond to the
local morphogen concentration by developing into
different cell types.
• In Drosophila, the anterior-posterior axis is
determined by 2 morphogen gradients laid down in
the unfertilized egg by the mother. The initial
morphogens are messenger RNAs, which get
translated into proteins (the actual morphogens)
after fertilization.
• The initial gradients are formed by bicoid and nanos
mRNAs, which then create gradients of hunchback and
caudal proteins.
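The idea that cells read their position from the local morphogen concentration can be sketched numerically. The exponential shape of the gradient, the decay length, the thresholds, and the fate labels below are all illustrative assumptions, not values from the slides.

```python
import math


def morphogen_concentration(position: float,
                            source_level: float = 1.0,
                            decay_length: float = 0.2) -> float:
    """Concentration at a position along the axis (0 = source end, 1 = far end).

    Assumes a simple exponential fall-off away from the source.
    """
    return source_level * math.exp(-position / decay_length)


def cell_fate(concentration: float,
              thresholds=((0.5, "fate A (high morphogen)"),
                          (0.1, "fate B (intermediate morphogen)"))) -> str:
    """Pick a fate from the highest threshold the concentration reaches."""
    for threshold, fate in thresholds:
        if concentration >= threshold:
            return fate
    return "fate C (low morphogen)"


# Cells at different positions adopt different fates.
for position in (0.0, 0.3, 1.0):
    level = morphogen_concentration(position)
    print(f"position {position:.1f}: concentration {level:.3f} -> {cell_fate(level)}")
```

Changing the thresholds shifts the boundaries between regions without changing the gradient itself, which is the sense in which different genes can respond to different levels of the same morphogen.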
Gap and Pair-Rule Genes
• Once the basic morphogen gradients are
established, the gap genes are activated. These
genes respond to different levels of the
morphogens. Gap genes are active in large
regions of the embryo, dividing it up.
• The gap genes are transcription factors, and
they activate the next set of genes, the pair-rule
genes. Which pair rule genes are active in a
given cell depends on the combination of gap
gene proteins that are present.
• Pair-rule genes are active in alternating pairs of
segments. Together, they determine the identity
of each segment.
Even-skipped
• Even-skipped is a pair rule gene that is
active in 7 bands in the embryo.
• The Drosophila embryo has 14
segments.
• The upstream control region for this
gene consists of 7 modules, one for
each band of activity.
• Each module has binding sites for 4
different transcription factors, which
came from 4 of the maternal effect and
gap genes. If the proper combination
(and concentrations) of transcription
factors is present, the module is
activated and the even-skipped gene is
activated.
• Even-skipped is itself a transcription
factor that activates genes downstream
in the developmental pathway.
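The modular control region can be sketched as combinatorial logic: each module fires only for its own combination of transcription factor levels, and the gene is on if any module fires. The factor names (A to D) and the concentration windows are hypothetical placeholders, and only two of the seven modules are shown.

```python
def module_is_active(module: dict, levels: dict) -> bool:
    """A module fires when every factor lies inside its allowed window."""
    return all(low <= levels.get(factor, 0.0) <= high
               for factor, (low, high) in module.items())


def gene_is_active(modules: list, levels: dict) -> bool:
    """The gene is transcribed if at least one module fires."""
    return any(module_is_active(module, levels) for module in modules)


# Two hypothetical modules, each reading four transcription factors.
modules = [
    {"A": (0.5, 1.0), "B": (0.3, 1.0), "C": (0.0, 0.2), "D": (0.0, 0.2)},
    {"A": (0.0, 0.2), "B": (0.0, 0.3), "C": (0.4, 1.0), "D": (0.1, 0.6)},
]

cell_in_band = {"A": 0.8, "B": 0.5, "C": 0.1, "D": 0.0}
cell_between_bands = {"A": 0.4, "B": 0.5, "C": 0.3, "D": 0.3}

print(gene_is_active(modules, cell_in_band))
print(gene_is_active(modules, cell_between_bands))
```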
Transcription Factors
• Transcription factors generally have two
functional sections (domains): a DNA-binding
domain that attaches to the specific DNA
sequence, and an activation domain that
stimulates transcription. The activation domain
works by allowing other transcription factors to
create the transcription complex.
• The DNA-binding domains fall into several
general types, and proteins that have one of
these domains are usually assumed to be
transcription factors.
• Leucine zipper motif. An alpha helix that has a
leucine every 7 amino acids, so all the leucines
are on the same side of the molecule. This
allows the protein to form a dimer by
hydrophobic interactions. This dimer grips the
DNA double helix
More Transcription Factors
• Zinc finger motif: binds a Zn2+ ion between two
cysteines and two histidines (C2H2 proteins) or
between four cysteines (C4 proteins).
Sometimes a zinc finger protein will have more
than one zinc finger motif.
• Helix-turn-helix motif: consists of two alpha helices connected by a short region of other
amino acids. The two helices bind the DNA
major groove. This is a common motif in
homeobox gene regulation.
• Helix-loop-helix motif, which is different from
the HTH motif. HLH has a much longer
connecting loop that allows more flexibility in
the molecule.
A Little Digression into Hox Genes
• A homeotic mutant converts one body structure into
another.
• An example is the Drosophila mutant Antennapedia, which converts
the antennae on the head into legs.
• Another example is Bithorax, which converts the third segment of the
thorax into a copy of the second (middle) segment. The halteres, which are small balancing
organs on the third segment, are converted into wings.
• It was discovered that both the Antennapedia and the
Bithorax genes are part of gene clusters that code for very
similar transcription factors.
• The genes all contain a common 60 amino acid motif, called a
homeodomain. This domain folds into a helix-turn-helix DNA binding
region.
• The homeodomain is coded as 180 nucleotides of DNA, a region called
a homeobox. Genes with homeoboxes are called Hox genes.
Hox Gene Expression
• The Hox genes in Drosophila are arranged in 2
clusters, with different genes expressed in
different body segments.
• The identity of body segments is determined by
which Hox gene is expressed.
• The Hox gene transcription factors turn on whole
sets of genes that create the unique features of each
segment.
• The genes are organized on the chromosome
in the same order as the body segments they
control.
• Homeotic mutations are due to the expression of a
Hox gene in the wrong body segment.
• The homeobox genes are highly conserved in
evolution. They are found in the same order in
all bilaterian animals, and the homeobox genes
from chickens work perfectly well in
Drosophila.
• Mammals have 4 clusters of Hox genes
(Drosophila have 2).
• The genes are located on the chromosomes in the
same order as the body segments they are
expressed in.
• Lower animals (such as sponges and cnidarians)
have Hox genes, but not in clusters and not
expressed in different body segments.
• Plants also have Hox genes (also not clustered).
Hox Genes in Other Species
Transcript Isoforms
• At least half of all human genes are expressed in different
ways in different tissues. Different transcriptional start sites,
different intron splicing patterns, and different poly A
addition sites can give quite a few different proteins from
the same gene.
• Different proteins from the same gene are called isoforms.
• Isoforms are produced in different tissues, different times in
development, different subcellular locations (soluble vs.
membrane-bound, for instance), etc.
• Dystrophin, the Duchenne muscular dystrophy protein, has
at least 7 different transcription start sites, used in different
tissues. (B, brain; M, muscle; P, Purkinje; R, retina; B,K, brain
and kidney; S, Schwann cells; G, general)
• A good example of alternate splicing patterns in different
tissues is tropomyosin, which has 5 optional exons.
Tropomyosin is a protein in striated muscle that binds to
actin and prevents it from interacting with myosin: thus it
regulates muscle movements.
Control of Alternative Splicing
• RNA splicing is performed by snRNPs, small nuclear
ribonucleoprotein complexes, which are RNA/protein
hybrids.
• snRNPs bind to the splice site donor sequence (GT)
and the acceptor sequence (AG), then catalyze the
removal of the intron.
• Variations in snRNPs (as well as other proteins) occur
in different cells and recognize slightly different
splicing signals.
• Proteins bind to splice enhancer and splice
suppressor sequences, which can be located in
introns or exons. These proteins influence the binding
of snRNPs to the splice sites, increasing or decreasing
the likelihood that a given site will be actually used as
a splice site.
• The splicing proteins also assist in transporting mRNA
out of the nucleus: an mRNA with splicing proteins
attached to intron sequences is not allowed to leave
the nucleus; at the same time, the splicing proteins
attached to the exons are necessary for transport.
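The GT donor and AG acceptor signals mentioned above can be checked with a few lines of code. This is a minimal sketch that works on the DNA-sense sequence and looks only at the first and last two bases of each intron; the sequences and coordinates are made up, and real splice-site recognition depends on much more context.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def splice(pre_mrna: str, introns: list) -> str:
    """Remove introns given as (start, end) half-open coordinates."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        if not has_canonical_splice_sites(pre_mrna[start:end]):
            raise ValueError(f"intron {start}-{end} lacks GT...AG boundaries")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


# Made-up example: two exons separated by one intron.
exon1, intron, exon2 = "ATGGCC", "GTAAGTTTTCAG", "TTTGA"
pre_mrna = exon1 + intron + exon2
intron_coords = [(len(exon1), len(exon1) + len(intron))]
print(splice(pre_mrna, intron_coords))
```

Alternative splicing corresponds to calling `splice` with different intron lists for the same primary transcript.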
Messenger RNA Stability
• Bacterial mRNAs have a half-life of a few minutes
• Eukaryotic mRNAs generally are more stable: half-life of
several hours
• Most eukaryotic mRNAs have their poly-A tails slowly
removed starting at the 3’ end by a deadenylating nuclease
enzyme. When enough has been removed, the poly-A
binding proteins that stabilize the circular mRNA translation
complex can no longer bind. This leads to degradation of the
mRNA.
• The 5’ cap is removed, and exonucleases degrade it from both
ends. This occurs in regions of the cytoplasm called P bodies.
• The rate of removing the poly A tail depends on the
frequency of translation initiation: the more translation
starts, the slower the degradation. There is a competition
between starting translation and removing the A’s from the
3’ end.
• Short lived mRNAs contain AU-rich sequences (like AUUUA)
in their 3’ UTRs. These sequences bind to RNA exonucleases
and speed degradation.
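A half-life translates directly into the fraction of transcripts still present after a given time. The sketch below assumes simple first-order (exponential) decay and uses example half-lives of 3 minutes and 4 hours, chosen only to match "a few minutes" and "several hours"; it also includes a naive scan for the AUUUA motif.

```python
def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of an mRNA pool left after `elapsed` time (same units as half_life)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)


def has_au_rich_element(three_prime_utr: str) -> bool:
    """Naive check for an AUUUA motif; accepts RNA or DNA-style sequence."""
    return "AUUUA" in three_prime_utr.upper().replace("T", "U")


# After 30 minutes, with illustrative half-lives of 3 min and 240 min.
print(f"bacterial-like mRNA:  {fraction_remaining(30, 3):.4f}")
print(f"eukaryotic-like mRNA: {fraction_remaining(30, 240):.4f}")
print(has_au_rich_element("GGCAUUUAGCC"))
```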
Nonsense-mediated Decay
• Nearly all mRNAs have their stop codon in the last
exon. Messenger RNAs that contain stop codons
before the last exon are subject to degradation.
• A stop codon is also called a nonsense codon
• Such mRNAs are the products of defective genes, or
they haven’t been spliced properly (introns usually
contain many stop codons).
• Exon boundaries are marked by the same splicing
enhancer proteins that allow transport of the mRNA
out of the nucleus. These are called exon junction
complexes.
• During the first round of translation, the ribosome
displaces all of the exon junction complexes.
• Recall that the ribosome falls off the mRNA at the
first stop codon it reaches. If this stop codon is not
in the last exon, the final exon junction complex
won’t be removed.
• Messenger RNAs that still have these
complexes attached become
associated with P-bodies and
degraded there.
• About 10% of eukaryotic mRNAs are
degraded by this mechanism.
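The rule described above (a stop codon anywhere before the last exon marks the mRNA for decay) can be expressed as a short check. This is a sketch of that simplified rule only; the function names and the example exon lengths are hypothetical, and positions are 0-based coordinates on the spliced mRNA.

```python
def exon_containing(position: int, exon_lengths: list) -> int:
    """Index of the exon that holds a given 0-based position in the spliced mRNA."""
    end = 0
    for index, length in enumerate(exon_lengths):
        end += length
        if position < end:
            return index
    raise ValueError("position lies beyond the end of the mRNA")


def is_nmd_target(stop_codon_position: int, exon_lengths: list) -> bool:
    """True if the first stop codon lies upstream of the last exon.

    In that case the ribosome stops before displacing every exon
    junction complex, so at least one remains attached.
    """
    last_exon = len(exon_lengths) - 1
    return exon_containing(stop_codon_position, exon_lengths) < last_exon


exon_lengths = [100, 150, 200]            # hypothetical three-exon mRNA
print(is_nmd_target(400, exon_lengths))   # stop codon in the last exon
print(is_nmd_target(120, exon_lengths))   # premature stop in exon 2
```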
RNA Interference
• RNA interference (RNAi) is a phenomenon that controls
gene expression by either cleaving messenger RNA or
preventing it from being translated.
• RNAi is a common lab technique for suppressing gene
expression.
• RNAi starts with double stranded RNA (dsRNA).
• There are 2 major forms of RNAi:
• siRNA (short interfering RNA) starts with dsRNA that
comes from a source outside the cell. The original
discovery of siRNA involved RNA viruses that replicate
using a dsRNA intermediate.
• miRNA (microRNA) starts with RNA transcribed in the
nucleus, either from RNA-only genes or from spliced out
intron RNA sequences. These RNAs form hairpin loops
that are then processed into dsRNA.
• RNAi acts as a defense against RNA viruses that replicate
through dsRNA, and it suppresses movement of
retrotransposons.
MicroRNA
• Micro RNAs (miRNA) are very short (about 22 bases) RNA
molecules that are found in (probably) all eukaryotes and which
function to regulate gene expression by either cleaving
messenger RNAs or by preventing translation of messenger
RNAs.
• miRNAs come from 2 sources: many are transcribed by RNA
polymerase II from RNA-only genes. Others are derived from
spliced out intron sequences.
• The common feature is that the RNA folds back on itself to form a
hairpin loop.
• The primary miRNA transcript is usually quite long. It gets
processed in the nucleus to form a shorter hairpin RNA, called
the pre-miRNA. Processing is done by a ribonuclease called
Drosha.
• Some miRNA primary transcripts contain several different hairpin
loops, which get processed independently to form different
miRNAs
• After processing by Drosha, the pre-miRNA hairpins are exported
to the cytoplasm, where they are further processed by Dicer.
Dicer and RISC
• When the dsRNA reaches the cytoplasm, it binds to Dicer, which is a
ribonuclease. Dicer cuts the dsRNA down to a short (about 20-25 bp)
dsRNA; these short dsRNAs are known as siRNA or miRNA. The dsRNA
before this step is called pre-siRNA or pre-miRNA.
• Dicer then assists in loading the dsRNA into RISC (the RNA-induced
silencing complex). Only one strand of the dsRNA ends up in
RISC: this is called the guide strand. The other strand is degraded.
• RISC then binds to mRNA molecules complementary to
its guide strand.
• The active component of most RISC complexes is an endonuclease
called Argonaute. It cleaves the mRNA that is base paired with the guide strand.
This happens when the base pairing between siRNA and mRNA is a
perfect match.
• Argonaute is a family of proteins, which act on different sets of mRNAs.
• Another effect of RISC binding to mRNA is to prevent it from being
translated, as opposed to cleaving the mRNA. This happens when the
base-pairing between the siRNA and mRNA is imperfect.
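The two outcomes of RISC binding can be sketched as a comparison between the guide strand and a candidate site on the mRNA. The mismatch cutoff that separates "imperfect pairing" from "no binding" is a made-up parameter, and the guide sequence is arbitrary; real target recognition is more nuanced than counting mismatches.

```python
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna: str) -> str:
    """Sequence that pairs perfectly (antiparallel) with the given RNA."""
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


def risc_outcome(guide: str, target_site: str, max_mismatches: int = 3) -> str:
    """Classify what RISC does at an mRNA site of the same length as the guide."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    perfect_site = reverse_complement(guide)
    mismatches = sum(a != b for a, b in zip(perfect_site, target_site.upper()))
    if mismatches == 0:
        return "cleave mRNA"            # perfect match: Argonaute cuts
    if mismatches <= max_mismatches:    # hypothetical cutoff
        return "block translation"      # imperfect match: no cleavage
    return "no stable binding"


guide = "UUGACCAGUCAGGAUCCAUGCA"        # arbitrary 22-base guide strand
site = reverse_complement(guide)
print(risc_outcome(guide, site))
print(risc_outcome(guide, "A" + site[1:] if site[0] != "A" else "C" + site[1:]))
```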
CRISPR Systems
• CRISPR (clustered regularly interspaced short
palindromic repeats) systems are used by
prokaryotes to provide resistance to invading
foreign DNA.
• CRISPR arrays are DNA sequences that are derived
from foreign DNA that has entered the cell.
• CRISPR arrays are adjacent to genes for Cas
proteins (CRISPR-associated proteins). Cas proteins
cut DNA sequences that are complementary to
guide RNAs transcribed from the CRISPR array. This
provides protection from repeated attack by the
same foreign DNA.
• The CRISPR arrays provide a record of which foreign
DNAs entered the cell and its ancestors.
• CRISPR systems are starting to find use for editing
genomes: correcting genetic diseases in people who
are already born.
Acquisition of New Sequences to a CRISPR Array
• CRISPR arrays are groups of repeated sequences, 20-50
bp long, separated by spacer sequences. Each of the
repeated sequences is an inverted repeat that can fold
up into a hairpin loop. CRISPR arrays are transcribed as
a single RNA molecule.
• Spacer sequences are located between the repeat
sequences. They are derived from DNA that has
entered the cell (or one of its ancestors) from the
outside: phage DNA, plasmids, or just random DNA.
• Foreign DNA is recognized by the Cas1 and Cas2
proteins, which cut it into short fragments.
• The fragments are then inserted into the CRISPR array
between 2 repeats near the 5’ end. More accurately,
the 5’-most repeat is duplicated and the foreign DNA
fragment is used as the spacer between the repeats.
• These short fragments of foreign DNA become part of
the bacterial chromosome, which is inherited by all
descendant cells.
CRISPR Immune Response
• There are several different types of CRISPR system, which differ in
the details of how they work. The presentation here is meant to
be fairly generic.
• The entire CRISPR array is transcribed as a single RNA molecule.
• The CRISPR RNA is cut at each of the repeats, creating a set of
individual spacer elements. These RNA fragments are called
crRNA (CRISPR RNA) or gRNA (guide RNA).
• Details of how the cutting occurs depend on the type of system.
• The guide RNA is then incorporated into a protein complex
composed of one or more Cas proteins. This complex is (in some
cases) called Cascade.
• Again, details vary between system types.
• The Cascade complex binds to complementary DNA double helices
(from invading DNA) and cuts them. This destroys the foreign
DNA’s ability to reproduce inside the cell.
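The bookkeeping behind spacer acquisition and the later cutting of the array transcript at each repeat can be sketched with plain strings. The repeat and spacer sequences below are invented for illustration, and the model ignores every enzymatic detail; it only tracks the order of repeats and spacers.

```python
REPEAT = "GCGCTTAAGCGC"   # made-up inverted repeat (reads the same on both strands)


def array_sequence(repeat: str, spacers: list) -> str:
    """Assemble repeat-spacer-repeat-...-repeat, spacers listed 5' to 3'."""
    return repeat + "".join(spacer + repeat for spacer in spacers)


def acquire_spacer(spacers: list, foreign_fragment: str) -> list:
    """Add a new spacer at the 5' end of the array.

    Because array_sequence puts a repeat after every spacer, the assembled
    array gains one extra repeat copy along with the new spacer.
    """
    return [foreign_fragment] + list(spacers)


def cut_at_repeats(array: str, repeat: str) -> list:
    """Split the array (or its transcript) at each repeat, keeping the spacers."""
    return [piece for piece in array.split(repeat) if piece]


old_spacers = ["AAATTTCCC", "TTTGGGAAA"]          # record of earlier invaders
new_spacers = acquire_spacer(old_spacers, "CCCAAATTT")
array = array_sequence(REPEAT, new_spacers)

print(array.count(REPEAT))            # one more repeat than spacers
print(cut_at_repeats(array, REPEAT))  # newest spacer comes first
```

Because new spacers are always added at the same end, the order of spacers in the array reflects the order in which the foreign DNAs were encountered.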
Use in Genetic Engineering
• CRISPR systems can attack any DNA sequence that matches the
guide RNA. Making the guide RNA is just a matter of inserting it
into a CRISPR array.
• Very similar to restriction enzymes in that both cleave specific DNA
sequences. But restriction enzymes are proteins, and it is hard to
design a protein that is highly sequence specific. In contrast, CRISPRs
use RNAs that are complementary to the DNA target: very easy to
design.
• CRISPRs could serve as a way to attack viral infections. Currently there
are very few treatment options for viral diseases.
• For research purposes, CRISPRs can cut the DNA of any gene,
destroying it. This is called a gene knockout. Gene expression can also be
suppressed using RNAi, which acts on the messenger RNA, not the gene
itself.
• Another option: you can block transcription of a gene by using a
non-functional Cascade complex. The guide RNA binds to DNA just
downstream from a transcription start and the Cascade complex
simply blocks RNA polymerase (and doesn’t cut the DNA).
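Because targeting depends only on sequence matching, finding where a guide RNA would direct cutting is a string search. This is a minimal sketch assuming exact matches on either DNA strand; it ignores mismatch tolerance and any additional sequence requirements that real Cas proteins impose, and the guide and genome sequences are invented.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Sequence of the opposite DNA strand, read 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def find_target_sites(guide_rna: str, genome: str) -> list:
    """Return (position, strand) for every exact match to the guide.

    Positions are 0-based offsets on the given (top) strand. A "+" hit means
    the top strand carries the guide sequence, so the guide pairs with the
    bottom strand; a "-" hit is the reverse situation.
    """
    target = guide_rna.upper().replace("U", "T")
    genome = genome.upper()
    sites = []
    for strand, query in (("+", target), ("-", reverse_complement(target))):
        start = genome.find(query)
        while start != -1:
            sites.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(sites)


guide = "GAUUACAGGCUUAACCGGUA"                      # made-up 20-base guide
guide_dna = guide.replace("U", "T")
genome = "AAAA" + guide_dna + "CCCC" + reverse_complement(guide_dna) + "GGGG"
print(find_target_sites(guide, genome))
```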
More CRISPR Genetic Engineering
• CRISPRs can be used to insert foreign genes into a genome, a
process called knock in.
• Recall that double stranded DNA breaks can be repaired by
homologous recombination.
• CRISPRs cause double stranded breaks, and if you use an artificial
sequence that flanks the break point but also includes a foreign
gene, the foreign DNA will be inserted.
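The knock in idea can be illustrated with simple string operations. This is a minimal sketch with invented sequences; `build_donor` and `knock_in` are hypothetical helper names, and the model assumes repair by homologous recombination copies the donor exactly.

```python
def build_donor(genome, cut_index, insert, arm_length):
    """Build a donor: left homology arm + foreign gene + right homology arm."""
    left_arm = genome[max(0, cut_index - arm_length):cut_index]
    right_arm = genome[cut_index:cut_index + arm_length]
    return left_arm + insert + right_arm


def knock_in(genome, cut_index, insert):
    """Return the genome after the insert is copied in at the break point."""
    return genome[:cut_index] + insert + genome[cut_index:]


# Invented toy sequences.
genome = "AAAACCCCGGGGTTTT"
cut_index = 8          # position of the double stranded break
foreign_gene = "ATGTAA"

donor = build_donor(genome, cut_index, foreign_gene, arm_length=4)
edited = knock_in(genome, cut_index, foreign_gene)
print(donor)
print(edited)
```

The arms of the donor match the sequence on either side of the break, which is what lets the repair machinery use it as a template.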
Global Regulation of Translation by eIF2
• A review of translation initiation in eukaryotes:
• The translation initiation factor eIF2 binds to GTP and
the initiator Met-tRNA. This complex then binds to the
small ribosomal subunit and several other initiation
factors, forming the pre-initiation complex. The pre-initiation
complex then binds to the 7-methylguanosine
cap at the 5’ end of the mRNA.
• This pre-initiation complex then scans along the mRNA
looking for the first AUG codon. When it finds the AUG,
eIF2 hydrolyzes GTP to GDP. This action locks the
initiator tRNA into place on the P site of the ribosome,
base paired with the AUG.
• The GTP hydrolysis also releases eIF2 from the complex.
eIF2 then needs to have its GDP replaced with a new
GTP, which is done by another protein, the guanine
nucleotide exchange factor eIF2B.
• Regulation of translation occurs by the
phosphorylation of a serine in eIF2 by eIF2
kinase.
• The phosphorylated eIF2 binds tightly to eIF2B
and prevents it from exchanging GTP for GDP on
any other eIF2 proteins.
• This causes a buildup of eIF2 with GDP, which
can’t initiate translation. Thus, translation of
all proteins is slowed down.
• Human cells contain 4 different eIF2 kinases,
which respond to different stress conditions:
low amino acid concentrations, low oxygen, etc.
• Note: there are 3 proteins with similar names
but very different functions here:
1. eIF2, the initiation factor that binds to Met-tRNA,
but only when GTP is bound. eIF2 is a G
protein.
2. eIF2B, the guanine nucleotide exchange factor that
puts a fresh GTP onto eIF2 after translation
initiation is complete.
3. eIF2 kinase, an enzyme that phosphorylates
eIF2 when the cell is being stressed.
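The sequestration logic can be captured in a toy calculation. The sketch below is not a kinetic model. It assumes, purely for illustration, that each phosphorylated eIF2 ties up exactly one eIF2B, and that the capacity to recharge eIF2 with GTP is proportional to the eIF2B that remains free. The function names and numbers are hypothetical.

```python
def free_eif2b(total_eif2b, phosphorylated_eif2):
    """eIF2B molecules not trapped by phosphorylated eIF2 (never negative)."""
    return max(0.0, float(total_eif2b) - float(phosphorylated_eif2))


def relative_exchange_capacity(total_eif2b, phosphorylated_eif2):
    """Fraction of eIF2B still available to swap GDP for GTP on eIF2."""
    if total_eif2b <= 0:
        return 0.0
    return free_eif2b(total_eif2b, phosphorylated_eif2) / total_eif2b


# Arbitrary illustrative amounts, not measured values.
for stressed in (0, 4, 25):
    capacity = relative_exchange_capacity(10, stressed)
    print(f"phospho-eIF2 = {stressed:2d} -> exchange capacity = {capacity:.1f}")
```

In this simplified picture, once phosphorylated eIF2 equals or exceeds the amount of eIF2B, no GDP is exchanged and initiation stops for all mRNAs.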
eIF2 Regulation
Translational Control
• Regulation of whether the messenger RNA is translated or not.
• The best studied example is ferritin, a protein that stores up to
4500 iron atoms (as iron hydroxyphosphate) in its center.
Ferritin is the main way iron is stored in the body. The ferritin
mRNA is only translated when iron levels are high.
• The ferritin mRNA contains an iron-response element in the 5’
UTR. The IRE folds up into a hairpin loop, which can bind to the
IRE-binding protein. When iron levels are low, IRE-BP binds and
prevents translation of the mRNA. This allows the ferritin mRNA
to remain intact while preventing any further sequestration of
iron atoms by ferritin.
• The transferrin receptor is an integral membrane protein that
binds to transferrin, the major iron-carrying protein in the blood
serum, and brings iron into the cell.
• The transferrin receptor is also regulated by the IRE system, but
in the opposite direction from ferritin. High iron levels in the
blood cause the transferrin receptor mRNA to be rapidly
degraded, but low iron allows the mRNA to be translated,
producing more receptor proteins.
• The transferrin receptor mRNA contains multiple IREs (five in the
human mRNA) in the 3’ UTR. RNA degradation is prevented by IRE-BP binding.
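The two opposite responses follow from one switch, whether or not IRE-BP is bound. The sketch below treats iron as simply "high" or "low", a deliberate simplification; the function name and dictionary keys are hypothetical.

```python
def ire_regulation(iron_is_high):
    """Return the state of the IRE system for a given iron level."""
    # IRE-BP binds the IRE hairpins only when iron is low.
    ire_bp_bound = not iron_is_high
    return {
        "ire_bp_bound": ire_bp_bound,
        # 5' UTR IRE: bound IRE-BP blocks translation of ferritin mRNA.
        "ferritin_translated": not ire_bp_bound,
        # 3' UTR IREs: bound IRE-BP protects transferrin receptor mRNA
        # from degradation, so the receptor can be made.
        "transferrin_receptor_mrna_stable": ire_bp_bound,
    }


for iron_is_high in (True, False):
    print("high iron" if iron_is_high else "low iron", ire_regulation(iron_is_high))
```

The same protein binding event has opposite outcomes because of where the IRE sits in the mRNA: in the 5’ UTR it blocks translation, in the 3’ UTR it blocks degradation.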
Control of Protein Degradation
• To react quickly to the environment, a cell must be able
to remove outdated signals quickly. Many proteins,
especially regulatory signaling proteins, are degraded by
ubiquitin-mediated proteolysis.
• Ubiquitin is a small protein that is highly conserved in
evolution.
• In this system, multiple copies of ubiquitin are covalently
attached to the target protein in long chains. The
complex is then transported to the proteasome, a large
multi-subunit barrel-shaped structure. The proteasome
degrades the target protein to amino acids and recycles
the ubiquitin.
• Target specificity is provided by E3-ubiquitin ligase, the
enzyme that attaches ubiquitin to the target proteins:
there are hundreds of different E3-ubiquitin ligases.
• One target is hydrophobic amino acids that are normally
buried in the protein’s interior or within membranes.
• N-end rule: On average, a protein's half-life correlates
with its N-terminal residue.
• Proteins with N-terminal Met, Ser, Ala, Thr, Val, or
Gly have half lives greater than 20 hours.
• Proteins with N-terminal Phe, Leu, Asp, Lys, or Arg
have half lives of 3 min or less.
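The N-end rule as stated above is a lookup on the first residue. The sketch below encodes only the two groups listed in the bullets, using standard one-letter amino acid codes. The function name is hypothetical, and residues not listed above are reported as such rather than guessed.

```python
# One-letter codes for the residues named in the two bullets above.
LONG_LIVED = set("MSATVG")    # Met, Ser, Ala, Thr, Val, Gly
SHORT_LIVED = set("FLDKR")    # Phe, Leu, Asp, Lys, Arg


def n_end_half_life(protein):
    """Classify a protein's expected half-life from its N-terminal residue."""
    if not protein:
        raise ValueError("protein sequence is empty")
    n_terminal = protein[0].upper()
    if n_terminal in LONG_LIVED:
        return "more than 20 hours"
    if n_terminal in SHORT_LIVED:
        return "3 min or less"
    return "not listed"


# Invented example sequences.
for seq in ("MKTAYIAK", "RKTAYIAK", "QKTAYIAK"):
    print(seq[0], "->", n_end_half_life(seq))
```

Since the rule is an average trend, the output is best read as a rough expectation, not a prediction for any particular protein.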
The proteasome also re-folds misfolded
proteins if the proteins are protected from
degradation by chaperone proteins.
Misfolding is a common result of heat shock.
Ubiquitin plays a number of other roles in the
cell, including cell signaling and X
chromosome inactivation.