Next Step MCAT Content Review: Biology and Biochemistry
CHAPTER 35
Control of Gene Expression in Prokaryotes
A. INTRODUCTION
Genomic DNA represents the set of genetic instructions which ultimately govern all cellular activities.
In both prokaryotes and eukaryotes, these instructions must be implemented via the synthesis of RNA
and proteins. Thus, the portion of a cell’s genetically-inherited instruction set that is carried out at any
point in time is a function of the genes that are actively expressed. The first step in expression of a
gene, the transcription of DNA to RNA, is common to both prokaryotic and eukaryotic cells. In this
chapter, we will explore the mechanisms by which prokaryotic cells accomplish the regulation of gene
expression at the transcriptional level.
In most prokaryotes, the logic of transcriptional regulation is focused on the conservation of
limited cellular resources, which are often insufficient to support the transcription (and later
translation) of all of a cell’s structural genes. As a consequence, only genes encoding proteins that
support ongoing cellular functions are expressed continuously. The transcription of the remainder of a
cell’s structural genome is regulated in response to the needs of the cell. If a protein is not immediately
needed, its transcription will be suppressed. If a protein or a set of functionally related proteins is
required by a cell, then a signaling system will initiate transcription of the pertinent structural gene or
genes.
B. PROKARYOTIC TRANSCRIPTION
While prokaryotic transcription technically encompasses the transcription of both bacteria and
archaea, on the MCAT, and in most of molecular biology, it is synonymous with bacterial
transcription. The principal enzyme responsible for the synthesis of RNA in bacteria is bacterial RNA
polymerase, which catalyzes the polymerization of ribonucleoside 5’-triphosphates (NTPs). As in the
synthesis of DNA by DNA polymerase, RNA polymerase catalyzes the growth of an RNA polymer
exclusively in the 5’ to 3’ direction. Unlike DNA polymerase, RNA polymerase does not require a
preformed primer in order to initiate synthesis; instead, it begins synthesis at specific sites at the beginning of
genes. This de novo process of initiation is a particularly important step in the regulation of prokaryotic
transcription.
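The direction of synthesis described above can be illustrated with a minimal Python sketch. The function name `transcribe` and the convention of writing the template strand 3' to 5' are assumptions made for this example, not part of the chapter.

```python
# Base added to the RNA chain opposite each DNA template base.
PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Return the RNA, written 5'->3', made from a template written 3'->5'."""
    rna = []
    for base in template_3to5.upper():
        # Each incoming NTP is joined to the 3' end of the growing chain,
        # so the RNA grows 5'->3' as the template is read 3'->5'.
        rna.append(PAIR[base])
    return "".join(rna)

print(transcribe("TACGGT"))  # AUGCCA
```

No primer appears anywhere in the sketch: the first nucleotide is simply placed opposite the first template base, which mirrors the de novo initiation described in the text.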
I. RNA Polymerase Structure
The intact bacterial RNA polymerase is a metalloenzyme containing two zinc ions and five
different types of subunits, shown in Figure 35.1: α, β, β’, ω, and σ. The catalytic core of the
enzyme, which consists of two α subunits as well as one β, one β’, and one ω subunit, is fully capable of
catalyzing the polymerization of NTPs.
Figure 35.1: Subunit composition of bacterial RNA polymerase
However, the core polymerase does not bind specifically to the DNA sequences that
direct incoming polymerase enzymes to their correct position relative to the start site of transcription.
The σ subunit, which is weakly bound to the core enzyme and is sometimes referred to as a sigma
factor, serves this purpose and is needed for the correct identification of the transcriptional start site.
This recognition and binding by the σ subunit is required for the synthesis of a complete, functional
mRNA molecule. Many bacteria are capable of producing different sigma factors, each of which
recognizes different promoter regions. For example, E. coli can produce seven different sigma factors that
initiate transcription of a factor-specific subset of genes. Some sigma factors, distinguished from one
another by their molecular weight, bind to the promoters of genes that encode proteins or RNA
molecules that are required for essential “housekeeping” processes. Other sigma factors direct RNA
polymerase to the promoters of genes that encode more specialized functions, such as proteins
required for adaptation to environmental stresses or those needed for the metabolism of nitrogen. The
relative rates of synthesis of specific sigma factors, then, represent one means by which bacterial cells
modulate transcription of specific genes, dependent upon the nature of the gene and on environmental
signals.
II. Initiation
The DNA sequences to which the σ subunit of RNA polymerase binds when initiating transcription are
part of the promoter. These sequences are located approximately 10 and 35 nucleotides upstream of
the transcription initiation site, which is defined as the +1 position. Appropriately, sequences found in
these promoter regions are called the -10 and -35 elements. While these sequences are not identical in
all promoters, they are similar enough to establish consensus sequences, the bases most frequently
found at each position in many prokaryotic organisms. The six-nucleotide consensus sequence
associated with the -10 element, 5’-TATAAT-3’, is known as the Pribnow box, and is similar to the
TATA box found in eukaryotes. The initial binding between the polymerase and a promoter forms the
pre-initiation complex (PIC). The PIC is a closed-promoter complex, referring to the fact that the
DNA contained in the complex is not yet unwound. Binding of the RNA polymerase holoenzyme to
the promoter region of DNA and the formation of the PIC is shown in Figure 35.2.
Figure 35.2: Promoter identification and assembly of the PIC
After formation of the PIC, RNA polymerase unwinds 12-14 bases of DNA from about -12 to
+2. This forms an open-promoter complex in which single-stranded DNA is available as a
template for transcription. Transcription is initiated by the β subunit. After the addition of
approximately 10 nucleotides in the form of NTPs, σ is released; the core enzyme then moves along
the template DNA, carrying out elongation of the growing RNA chain. The transcription of a basic
prokaryotic gene is dependent on the strength of its promoter. In the absence of other regulatory
elements, a promoter’s sequence-based affinity for RNA polymerases varies, leading to the production
of differing amounts of transcripts depending upon the gene transcribed. The extent of that variable
affinity is a function of the degree of similarity between the nucleotide sequence of the promoter and
the consensus sequence. As sequence similarity increases, so does the binding affinity of promoter for
RNA polymerase.
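As a rough illustration of how similarity to the consensus might be compared, the sketch below counts position-by-position matches between a candidate -10 element and the Pribnow box. The function names and the example sequences are hypothetical, and a simple match count is only a stand-in for binding affinity, which depends on more than the -10 element alone.

```python
MINUS_10_CONSENSUS = "TATAAT"  # Pribnow box consensus given in the text

def consensus_matches(element, consensus=MINUS_10_CONSENSUS):
    """Count the positions at which a -10 element matches the consensus."""
    if len(element) != len(consensus):
        raise ValueError("element and consensus must be the same length")
    return sum(a == b for a, b in zip(element.upper(), consensus))

def rank_promoters(elements):
    """Order promoter names from most to least consensus-like."""
    return sorted(elements, key=lambda name: consensus_matches(elements[name]),
                  reverse=True)

# Hypothetical -10 elements, invented for illustration only.
candidates = {"weak": "GATACT", "strong": "TATAAT", "mid": "TATGAT"}
print(rank_promoters(candidates))  # ['strong', 'mid', 'weak']
```

Under the chapter's model, the promoter ranked first would be expected to bind RNA polymerase most readily in the absence of other regulatory elements.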
III. Elongation
During elongation, the polymerase remains associated with the template strand as it continuously
synthesizes mRNA, unwinding the template DNA ahead of it and rewinding the DNA behind it.
Within the unwound portion of DNA, eight to nine base pairs of the growing RNA chain are bound to
the complementary template strand of DNA. Structural analysis indicates that during elongation, the β
and β’ subunits form a structure that maintains the association between RNA polymerase and the
DNA template, while also forming a channel through which the template strand passes and in which
the polymerase active site can be found.
IV. Termination
Synthesis continues until the polymerase encounters a termination signal, at which point transcription
stops, the newly synthesized RNA strand is released from the polymerase, and the enzyme dissociates
from its DNA template. In E. coli – the bacterial model of prokaryotic transcription which you are
most likely to encounter on Test Day – there are two alternative mechanisms of terminating
transcription. The most common termination signal consists of a symmetrically inverted repeat of a
GC-rich sequence followed by approximately seven A residues. Transcription of the GC-rich inverted
repeat results in the formation of a segment of RNA that can form a stable stem-loop structure by
complementary base pairing. This self-complementary structure, known as the hairpin terminator,
interacts with the transcription factor NusA, disrupting the association between RNA polymerase and
the DNA template at the β subunit. This causes the termination of transcription. Because hydrogen
bonding between A and U is weaker than hydrogen bonding between G and C, the run of A
residues downstream of the GC-rich inverted repeat, which is transcribed as a run of U residues, is
thought to facilitate the dissociation of the transcript from its template.
This process is termed Rho-independent termination.
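The sequence features of this termination signal can be searched for with a short Python sketch. It scans an RNA for a GC-rich stem whose two arms are reverse complements, separated by a loop and followed by a run of U residues. The stem length, loop length, and GC threshold are arbitrary illustrative parameters rather than values from the chapter, and the test sequence is invented.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    return "".join(RNA_COMPLEMENT[b] for b in reversed(rna))

def gc_fraction(seq):
    return sum(b in "GC" for b in seq) / len(seq)

def find_hairpin_terminator(rna, stem=6, loop=4, u_run=7, min_gc=0.6):
    """Return the start index of the first stem-loop + U-run match, or None."""
    rna = rna.upper()
    window = 2 * stem + loop + u_run
    for i in range(len(rna) - window + 1):
        left = rna[i:i + stem]                               # 5' arm of the stem
        right = rna[i + stem + loop:i + 2 * stem + loop]     # 3' arm of the stem
        tail = rna[i + 2 * stem + loop:i + window]           # residues after the stem
        if (right == reverse_complement(left)   # arms can base pair with each other
                and gc_fraction(left) >= min_gc  # stem is GC-rich
                and tail == "U" * u_run):        # followed by a run of U residues
            return i
    return None

# Invented example: 2-nt prefix, GC-rich stem, GAAA loop, seven U residues.
print(find_hairpin_terminator("AAGCCGCCGAAAGGCGGCUUUUUUUAC"))  # 2
```

Real terminators tolerate mismatches and variable stem and loop sizes, so an exact-match scan like this one would miss many genuine signals.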
Alternatively, transcription can be terminated in some genes by a specific termination protein
called Rho, which binds extended segments of nucleotides found in single-stranded RNA. Since
mRNAs in bacteria become associated with ribosomes and are translated while still being transcribed,
such extended regions of single-stranded RNA are exposed to binding by Rho only at the end of an
mRNA. This is known as Rho-dependent termination. Not surprisingly, the Rho-dependent
terminator occurs downstream of the translation stop codon and is composed of a non-repeating,
cytosine-rich sequence known as a Rho utilization site (rut) and a downstream transcription stop point
(tsp). rut serves as the mRNA binding site for Rho. Activation of Rho, which occurs upon binding rut,
allows Rho to hydrolyze ATP. This hydrolysis powers Rho’s translocation along the mRNA strand,
which continues until Rho makes contact with RNA polymerase, which has stalled at tsp. At this point,
Rho disrupts the mRNA-DNA-RNA polymerase transcriptional complex through a mechanism
involving the allosteric effects of Rho on RNA polymerase. A schematic depicting Rho-dependent and
Rho-independent termination is shown in Figure 35.3.
Figure 35.3: (A) In Rho-independent termination, the terminating hairpin forms from
self-complementary sequences of the nascent mRNA. Upon interaction with NusA, it
promotes release of the primary transcript from the RNA polymerase-DNA complex. (B)
In Rho-dependent termination, Rho first binds to the rut site. Once activated, it
translocates downstream until it reaches the RNA polymerase complex, stimulating
release of the transcript.
C. NEGATIVE CONTROL OF TRANSCRIPTION
Transcription can be regulated at the stages of both initiation and elongation, but most transcriptional
regulation in bacteria operates at the level of initiation. An MCAT favorite for this content area is
gene regulation and the expression of genes involved in the metabolism of lactose, a carbon and
energy source, by E. coli. To conserve the cellular resources required for enzyme synthesis, the enzyme
that catalyzes the cleavage of lactose to glucose and galactose, β-galactosidase, is expressed, along with
other enzymes involved in lactose metabolism, only when lactose is available for use by the
bacteria. In other words, lactose induces the synthesis of enzymes involved in its own metabolism. In
addition to requiring β-galactosidase, lactose metabolism involves two other proteins:
lactose permease, a transmembrane symporter that imports lactose into the cell, and a
transacetylase, which is thought to inactivate thiogalactosides that are transported into the cell along
with lactose by the permease.
I. Lac Operon
The genes encoding β-galactosidase, permease, and transacetylase – lacZ, lacY, and lacA, respectively –
are expressed as a single contiguous unit in the chromosome. This arrangement is called an operon.
Generally, an operon is associated with a single promoter and is transcribed as a large, polycistronic
mRNA molecule. Transcription of the operon is controlled by a regulatory DNA sequence known
as the operator, which is adjacent to the transcription start site and immediately upstream of the
structural gene for β-galactosidase. The lacI gene encodes a protein that regulates transcription by
binding to the operator. Such a gene product is referred to as a repressor because it blocks transcription
when bound to the operator region. The repressor acts by binding to the operator in such a way as to
overlap with the promoter region, thereby preventing the binding or movement of RNA polymerase
on or along the DNA molecule. The structure of the operon is shown in Figure 35.4.
Figure 35.4: Organization of the lac operon
The presence of lactose leads to induction of the operon because the lactose gives rise to allolactose, a
metabolite, which is produced from the occasional transglycosylation of lactose by β-galactosidase.
Allolactose serves its effector function by binding the lactose repressor protein, the gene product of lacI,
thereby preventing it from binding the operator sequence of the regulated DNA. When regulated in
this way, the lac operon is referred to as an inducible system with allolactose as the inducer of
transcription. Lac operon induction and repression are shown in Figure 35.5.
Figure 35.5: 1. RNA polymerase 2. lac repressor protein 3. promoter 4. operator 5.
lactose 6. lacZ 7. lacY 8. lacA (A) The operon is in its “off” state, because allolactose is
unavailable to bind the lac repressor protein and decrease the protein’s affinity for the
operator. In the absence of allolactose-repressor binding, the lac repressor binds the
operator and obstructs transcription of the operon’s structural genes. (B) The lac operon
is in its “on” state. Allolactose, produced only when intracellular lactose is available,
binds to and decreases the affinity of the lac repressor protein for the operator. RNA
polymerase is able to associate with the promoter region of the operon and transcribe the
structural genes of the operon.
II. Effectors and Repressors
In inducible systems such as the lac operon, binding of a low-molecular-weight compound, known as
an effector, to a particular repressor protein changes the conformation of the repressor. This change
reduces its affinity for the operator and allows transcription to proceed. In the lac operon, allolactose is
an effector molecule. In general, when the levels of effector are reduced, the repressor protein can bind
to the operator region for which it is specific, re-establishing the “off” state. In many cases, an operator
is specific for its particular operon; however, there are many examples of different operons with similar
operator sequences that are controlled by the same regulatory protein. The proteins encoded by these
operons are usually involved in related cellular processes.
III. Trp Operon
Another well-known example of prokaryotic transcriptional control is the trp operon, which modulates
the bacterial biosynthesis of tryptophan from its precursor molecule in E. coli. Unlike the lac operon,
which is an inducible system, the trp operon is a repressible system. The primary difference between a
repressible and an inducible system is the result that occurs when the effector molecule binds to the
repressor. In an inducible system, binding of the effector molecule to the repressor causes a
conformational change in the repressor which greatly decreases the repressor protein’s affinity for the
operator. Such action causes transcription of the operon to increase. In a repressible system, binding
by an effector molecule to the repressor gives rise to a conformational change that greatly increases the
repressor protein’s affinity for the operator, thereby causing transcription of the operon to cease.
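The contrast between the two kinds of system reduces to a small piece of logic, sketched below in Python. The function names and the string labels "inducible" and "repressible" are choices made for this example; the sketch ignores other layers of control such as attenuation and catabolite repression.

```python
def repressor_binds_operator(system, effector_present):
    """Does the repressor occupy the operator under the simple model in the text?"""
    if system == "inducible":
        # Effector (inducer) lowers the repressor's affinity for the operator.
        return not effector_present
    if system == "repressible":
        # Effector (corepressor) raises the repressor's affinity for the operator.
        return effector_present
    raise ValueError("system must be 'inducible' or 'repressible'")

def operon_transcribed(system, effector_present):
    return not repressor_binds_operator(system, effector_present)

for system in ("inducible", "repressible"):
    for effector in (False, True):
        print(system, effector, operon_transcribed(system, effector))
```

The same effector-binding event therefore has opposite consequences: it switches an inducible operon on and a repressible operon off.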
The trp operon contains five structural genes: trpE, trpD, trpC, trpB, and trpA. These encode the
enzyme tryptophan synthetase, as well as several other enzymes involved in the pathway by which
tryptophan is synthesized from its precursor molecule, chorismic acid. The repressor of the trp operon
is produced upstream of the promoter by the constitutive, low-level expression of the trpR gene. In the
absence of tryptophan, the repressor, a dimeric protein, is inactive. When tryptophan is present, it
binds to the repressor dimer, causing a change in the repressor’s conformation that allows for the
association of the repressor with the operator region – located wholly within the operon’s promoter –
where it interferes with transcription. Effector molecules that serve as the activating ligand of repressor
proteins are known as corepressors. Their binding to the inactive repressor causes it to undergo a
conformational change, enabling the repressor-corepressor complex to bind to its corresponding
region and inactivate transcription of the structural genes of the operon. In general, the repressor
alone cannot bind the operator; when the concentration of the corepressor decreases, the “on” state
for the operon resumes. In the case of the trp operon, tryptophan acts as a corepressor for its own
biosynthesis, providing a mechanism for feedback regulation in the synthesis of tryptophan. The
structure and regulation of the trp operon are shown in Figure 35.6.
Figure 35.6: Structure of the trp operon
IV. Attenuation
Another means by which the trp operon is regulated is attenuation. While the mechanism of repressible
expression responds to changes in intracellular tryptophan concentration, attenuation is responsive to
changes in the concentration of tryptophan-charged tRNA (tRNATrp). Rather than decreasing gene expression by altering
the initiation of transcription, attenuation affects the process of transcription after it has begun.
One element of the operon is the leader sequence contained within the trpL gene, just
upstream of the first structural gene of the operon, trpE. This sequence contains four domains,
numbered 1-4, which are each partially complementary to one another. Domain 3 of the mRNA
synthesized from the gene can base pair with either domain 2 or domain 4. If domains 3 and 4 pair, a
stem-and-loop structure forms, interrupting further transcription. As discussed previously, such a
transcription termination sequence is rich in guanine and cytosine, and is followed by several uracil
residues which form weaker hydrogen bonds with adenine residues. Once the structure is formed,
RNA polymerase dissociates from the template DNA strand and the structural genes of the operon are
not transcribed. This 3-4 pairing occurs when the level of intracellular tryptophan is high. When
domains 2 and 3 pair, the stem and loop structure does not form and the enzymes required for
tryptophan biosynthesis are produced. This takes place when tryptophan levels within the cell are low.
The leader sequence codes for two adjacent tryptophan residues. When tryptophan levels are
low, the ribosome stalls while awaiting the delivery of a rare tryptophan-charged tRNA molecule.
While it is stalled, the ribosome obstructs domain 1 of the transcript, preventing the formation of a
1-2 secondary structure. Domain 2 then is free to hybridize with domain 3. Formation of this 2-3
secondary structure prevents formation of the 3-4 termination structure, as domain 3 is no longer
available to pair with domain 4. This mechanism of transcriptional attenuation is shown in Figure
35.7.
Figure 35.7: Mechanism of transcriptional attenuation of the trp operon. (A) Formation of
the stem-loop termination structure occurs when the intracellular tryptophan
concentration is high. (B) Formation of an alternate stem-loop structure, which occurs
when the intracellular tryptophan concentration is low, permits the continuation of
transcription.
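The decision made at the leader sequence can be summarized in a few lines of Python. This is a simplified sketch of the mechanism as described in this section; the function name and its boolean input are hypothetical, and the model treats the outcome as all-or-nothing, whereas in a real cell it is a matter of probability.

```python
def attenuation_outcome(charged_trna_trp_abundant):
    """Return (leader pairing, whether the structural genes are transcribed)."""
    if charged_trna_trp_abundant:
        # The ribosome does not stall at the tryptophan codons, so
        # domains 3 and 4 pair and form the termination stem-loop.
        pairing = "3-4"
    else:
        # The ribosome stalls over domain 1, leaving domain 2 free
        # to pair with domain 3; the 3-4 terminator cannot form.
        pairing = "2-3"
    transcription_continues = pairing != "3-4"
    return pairing, transcription_continues

print(attenuation_outcome(True))   # ('3-4', False)
print(attenuation_outcome(False))  # ('2-3', True)
```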
D. POSITIVE CONTROL OF TRANSCRIPTION
The central principle of gene regulation exemplified by the lactose operon is that control of
transcription is mediated by the interaction of regulatory proteins with DNA sequences. This general
mode of regulation is applicable to both prokaryotic and eukaryotic cells. Regulatory elements like the
operator are called cis-acting control elements because they affect the expression of only linked genes
on the same DNA molecule. Conversely, proteins like the repressor are known as trans-acting factors
because they can affect the expression of genes located on other chromosomes within the cell. The
lactose repressor is an example of a trans-acting factor involved in negative control. Not all
trans-acting factors are inhibitors, however; many are activators of
transcription.
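The cis/trans distinction can be made concrete with a toy model of a cell that holds two copies of an operon on separate DNA molecules. The `OperonCopy` class, its field names, and the idea of a defective operator or repressor gene are assumptions introduced for this sketch.

```python
from dataclasses import dataclass

@dataclass
class OperonCopy:
    repressor_gene_ok: bool  # this DNA molecule encodes a functional repressor
    operator_ok: bool        # this operator can be bound by repressor

def expressed_without_inducer(copies):
    """For each copy, report whether its structural genes are on with no inducer."""
    # Trans-acting: repressor made from any copy diffuses through the cell.
    repressor_in_cell = any(c.repressor_gene_ok for c in copies)
    # Cis-acting: each operator controls only the genes on its own DNA molecule.
    return [not (repressor_in_cell and c.operator_ok) for c in copies]

# One functional repressor gene is enough to silence both copies.
print(expressed_without_inducer([OperonCopy(True, True), OperonCopy(False, True)]))
# A defective operator releases only the genes linked to it.
print(expressed_without_inducer([OperonCopy(True, False), OperonCopy(True, True)]))
```

In the first case both copies stay off, because the repressor acts in trans; in the second, only the copy carrying the defective operator is expressed, because the operator acts in cis.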
The best-studied mechanism of positive control in E. coli involves the effect of glucose on the
expression of genes that encode enzymes involved in the catabolism of sugars, including lactose, that
provide alternative sources of carbon or energy for the bacterial cell. When glucose is present, it is
utilized preferentially by E. coli, and as long as glucose is available, enzymes involved in the catabolism
of other sugars are not expressed. As an example, when E. coli are grown in a medium containing both
glucose and lactose, the lac operon is not induced and only glucose is metabolized by the cell. This is
because glucose acts to repress the lac operon, even in the presence of its normal inducer, lactose.
Repression by glucose, an example of what is called catabolite repression, is now known to be
mediated by a positive control system coupled to the levels of cyclic AMP (cAMP). In bacteria, the
enzyme adenylyl cyclase, which converts ATP to cAMP, is regulated in such a way that when glucose
levels decline, cAMP concentration increases. cAMP acts to bind a transcriptional regulatory protein
called catabolite activator protein (CAP). The binding of cAMP to CAP permits binding of the cAMP-CAP
complex to its target DNA sequences, which in the lac operon are approximately 60 base pairs
upstream of the transcription start site. CAP then interacts with the α subunit of RNA polymerase,
facilitating the binding of polymerase to the promoter and activating transcription.
I. Dual Regulation of Carbohydrate Metabolism in E. coli
The combined regulation of lactose metabolism, by both the negative inducible and positive inducible
systems, causes the enzymes of lactose metabolism to be made in small quantities when both glucose
and lactose are present. The presence of glucose keeps cAMP levels low, which prevents activation of
transcription by CAP; when this occurs, expression of the lac operon is due solely to the binding and
inhibition of the lactose repressor protein by allolactose. Thus, when both lactose and the preferred
carbon source, glucose, are present, there is little
expression of the enzymes of lactose catabolism. What expression does occur is known as “leaky
expression”; this provides for a basal level of catabolic enzymes to help process lactose when cellular
glucose is expended, but before lac operon expression is fully activated. Figure 35.8 summarizes the
expression of the lac operon under different environmental conditions.
Figure 35.8: Structure of the lac operon (above), and lac operon gene expression under
different environmental conditions (below). P is the promoter region; O is the operator.
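The combined effect of the two control systems can be tabulated with a short Python sketch. The function name and the three output labels ("off", "low", "high") are simplifications chosen for this example; they stand in for the relative expression levels described in the text and are not measured values.

```python
def lac_expression(glucose_present, lactose_present):
    """Qualitative lac operon expression under the dual-control model in the text."""
    # Negative control: without lactose there is no allolactose,
    # so the repressor stays bound to the operator.
    repressor_bound = not lactose_present
    # Positive control: glucose keeps cAMP low, so the cAMP-CAP
    # complex does not form and cannot assist RNA polymerase.
    cap_active = not glucose_present
    if repressor_bound:
        return "off"
    return "high" if cap_active else "low"

for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> "
              f"{lac_expression(glucose, lactose)}")
```

Full expression requires both conditions at once: lactose present to remove the repressor, and glucose absent so that cAMP-CAP can activate transcription.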
Chapter 35 Problems
Passage 35.1 (Questions 1-6)
Mutant strains of E. coli that are deficient in regulation of the genes involved in lactose metabolism
were studied. These mutants were of two types. Constitutive mutants expressed all genes normally
involved in lactose metabolism, even when lactose was unavailable, while noninducible mutants failed
to express these genes in the presence or absence of lactose. Genetic mapping localized these
regulatory mutants to two distinct loci, called o and i, with o located immediately upstream of the
structural gene for β-galactosidase, z. Mutations affecting o resulted in constitutive expression;
mutations of i were either constitutive or noninducible.
The function of these regulatory genes was investigated by experiments in which two strains of
bacteria were mated, resulting in diploid cells containing genes from both parents. The inducibility of
z upon addition of allolactose to the medium containing the diploid offspring was observed. The
matings, as well as the inducibility of z in the offspring, are shown in Figure 1.
Figure 1: Regulation of β-galactosidase expression in diploid E. coli mutants. (Note: i+ and o+ are
normal; i- and oc are mutant alleles; z1 and z2 are structural gene mutations with normal function.)
From these experiments, the lac operon model was proposed in which i encodes a repressor protein
and o functions as an operator.
1. Which of the following conclusions is LEAST supported by the results of the experiments in which
mutations in o and i were combined with different mutations in the structural genes?
A. In an oc/o+ cell, only structural genes that are physically linked to both oc and o+ are constitutively
expressed.
B. In an oc/o+ cell, only structural genes that are physically linked to oc are constitutively expressed.
C. In an i+/i- cell, structural genes located on the same chromosome as i+ are normally expressed.
D. In an i+/i- cell, structural genes located on the same chromosome as i- are normally expressed.
2. In terms of the lac operon model, which of the following is true of the i gene identified in the
passage?
A. Allolactose is unable to bind to some of the repressor protein synthesized in i+/i- cells.
B. i+/i- cells synthesize sufficient functional repressor protein to display normal inducibility.
C. The i+ gene encodes a nonfunctional repressor protein.
D. The i allele encodes the operator region of the operon.
3. Which of the following is most likely to describe the oc mutant in cells containing only a single,
normal z allele?
A. It is dominant to o+ and is a cis-acting control element.
B. It is dominant to o+ and is a trans-acting control element.
C. It is recessive to o+ and is a cis-acting control element.
D. It is recessive to o+ and is a trans-acting control element.
4. Molecular analysis has identified several changes in the nucleotide sequence of oc when compared
to the sequence of o+. This sequence change is most likely to affect the binding of oc by:
A. bacterial RNA polymerase.
B. a repressor.
C. DNA polymerase.
D. an inducer.
5. Experimenters conducting the studies described in the passage concluded that the mutant i allele is
recessive to the wild type. Which of the following explanations is most consistent with this conclusion?
A. The copy of the lac operon adjacent to the defective i gene is activated by the protein product of
the wild-type i gene.
B. The genotype of a cell carrying one mutant and one wild-type operator site permits synthesis of lac
structural genes.
C. Haploid cells containing the wild-type i gene are inducible.
D. The repressor is a small protein that can diffuse within the cell and inactivate expression of
structural genes linked to i-.
6. Researchers developed a method to select for regulatory mutants by mating haploid strains
carrying two complete wild-type copies of the operon and the operon’s regulatory elements. Each
strain contained one defect, a single copy of either i- or oc. Which of the following correctly describes
the diploid offspring?
A. Repressor mutants display normal inducibility.
B. Repressor mutants display the same phenotype as diploids homozygous for mutant operator
alleles.
C. A mutation in one copy of the operator gene results in constitutive expression of an operon.
D. A mutation in one copy of the operator gene confers the same inducibility phenotype as a
mutation in one copy of the repressor gene.
The following questions are NOT based on a descriptive passage.
7. The cya gene in E. coli encodes the enzyme adenylate cyclase, which produces cAMP. In a cya
mutant where intracellular cAMP concentration is decreased, which of the following is correct?
I. Expression of lacY is reduced.
II. Less catabolite activator protein will be found as a complex bound to the lac promoter.
III. Activation of the lac operon will be enhanced.
A. I only
B. II only
C. I and II only
D. I, II and III
8. The repressor in the trp operon:
A. is part of an inducible feedback mechanism.
B. is produced downstream of the trpR gene.
C. is constitutively expressed at a low level.
D. is in its active state unless bound by tryptophan.
9. Rifampin is a bactericidal antibiotic used in the treatment of tuberculosis. Its mechanism of action
involves the binding and inactivation of the β subunit of bacterial DNA-dependent RNA polymerase.
Which of the following actions in bacterial transcription is LEAST likely to be affected by disrupting β
subunit function?
A. formation of the channel in the RNA polymerase holoenzyme through which the DNA template
strand passes
B. catalysis of NTP addition to the RNA transcript
C. Rho-independent termination
D. identification of the transcription start site
10. Which statement correctly relates the regulatory responses of E. coli to changes in intracellular
tryptophan levels?
A. Attenuation regulates initiation of trp operon structural gene expression.
B. Formation of the stem-loop transcription termination signal occurs when tryptophan
concentration is low.
C. Attenuation is responsive to changes in the intracellular concentration of tRNATrp.
D. Association between domains 2 and 3 of the trp leader sequence forms the stem-loop termination
structure.
Chapter 35 Solutions
1. A.
Figure 1 shows that of the two diploid offspring, structural genes in the oc/o+ cell are constitutively
expressed if and only if the gene is physically linked to oc. This is seen in the constitutive expression of
z2, which is located on the same chromosome as, and structurally linked to, only oc. The constitutive or
inducible expression of oc and o+ parent bacteria, respectively, further supports this observation. This
contradicts the statement in choice A, the correct answer, and is consistent with choice B. Choices C
and D are also true statements. In i+/i- diploids, structural genes physically linked to either i+ or i- are
inducible.
2. B.
The gene i is responsible for the synthesis of a repressor protein. According to the passage, mutations
affecting i gave rise to either constitutive or noninducible expression of the operon. In cases where a
mutation in i gives rise to a repressor protein that is unable to bind the operator, structural genes of the
operon will be constitutively expressed. Conversely, if a mutation in i gives rise to a repressor protein
that can bind the operator, but can’t be bound by its effector, allolactose, expression of structural genes
of the operon cannot be induced. The constitutive expression of structural genes in the i- mutant
shown in Figure 1 shows that the mutation must be one that prevents binding of the repressor protein
to the operator. Since structural genes in i+/i- diploid offspring demonstrate an inducible phenotype, a
single wild-type copy of the gene must provide sufficient functional repressor to normally bind the
operator sequence of both genes, giving rise to normal operon inducibility in the offspring. This is
choice B. Choice A is contradicted by the previous discussion—the mutation i- appears to affect
binding of the repressor to operator, not of allolactose to repressor. Choices C and D are also
incorrect. The i+ gene encodes a functional repressor protein.
3. A.
Figure 1 shows that expression of z2 does constitutively occur in oc/o+ diploids (albeit at a lower level
compared to wild-type homozygotes). This indicates that oc is dominant to o+. Furthermore, regulatory
elements like the operator are called cis-acting control elements, because they affect the expression of
only linked genes on the same DNA molecule. Choice A, then, is the correct choice. Conversely,
proteins like the repressor are known as trans-acting factors, because they can affect the expression of
genes located on other chromosomes within the cell.
4. B.
Figure 1 shows that the expression of structural genes in the i- mutant is constitutive. Thus, the
mutation must be one that prevents binding of the repressor protein to the operator. This is choice B.
5. D.
The inducible phenotype of structural genes in i+/i- diploid offspring demonstrates that a single wild-type copy of the gene provides sufficient functional repressor to diffuse within the cell and bind the
operator sequence of both genes. This gives rise to normal operon inducibility in the offspring and is
consistent with choice D. While choice A is nearly a correct statement, it incorrectly indicates that lac
operon expression is activated, rather than inactivated, by the protein product of the wild-type i gene.
Choice B, while a true statement, doesn’t draw a distinction between inducible and constitutive
expression of the lac operon, both of which can occur depending upon the genotype of the cell—
expression of structural genes will occur in both cases. Choice C, while also a true statement, concerns
expression in haploid cells and doesn’t relate expression patterns when both wild type and i- are
present.
6. C.
The discussion of previous questions explained that in diploid mutants, all structural genes linked to a
wild-type operator sequence are expressed when a single wild-type copy of i+ is available. Strains
carrying two copies of the operon’s regulatory elements will express normal inducibility if their single
defect is one copy of i-. This eliminates choices A and B. Regardless of the presence of functional
repressor protein, carrying a copy of oc will result in the constitutive expression of the structural genes
linked to that operator sequence. This is choice C. Choice D is false; a mutation in one copy of the
operator gene will result in the inducible expression of one operon and the constitutive expression of
the other. If only a single copy of the repressor allele is mutated, both operons will show normal
inducibility.
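The cis/trans logic used in solutions 1 through 6 can be summarized in a short Python sketch. The function names and allele labels (`operon_phenotype`, `"i+"`, `"oc"`, and so on) are illustrative assumptions rather than notation from the passage, and the model covers only the two mutations discussed here: an i- repressor that cannot bind the operator and an oc operator that cannot be bound by repressor.

```python
def operon_phenotype(operator, repressor_alleles):
    """Return 'constitutive' or 'inducible' for the structural genes
    linked to one operator, in a cell carrying the given i alleles."""
    # cis-acting: an oc operator cannot be bound by repressor, so only
    # the genes on the same DNA molecule escape repression.
    if operator == "oc":
        return "constitutive"
    # trans-acting: repressor made from any i+ copy diffuses through the
    # cell and can bind every wild-type operator.
    if "i+" in repressor_alleles:
        return "inducible"
    # No functional repressor anywhere in the cell.
    return "constitutive"


def diploid_phenotypes(copy1, copy2):
    """Each copy is an (i allele, operator allele) pair."""
    i_alleles = [copy1[0], copy2[0]]
    return [operon_phenotype(copy[1], i_alleles) for copy in (copy1, copy2)]


# i+/i- diploid: one wild-type repressor gene rescues both operons.
print(diploid_phenotypes(("i+", "o+"), ("i-", "o+")))
# oc/o+ diploid: only the operon linked to oc is constitutive.
print(diploid_phenotypes(("i+", "oc"), ("i+", "o+")))
```

The first call models the i+/i- diploid of solution 5 and the second the oc/o+ diploid of solutions 3 and 6.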
7. C.
In bacteria, the enzyme adenylyl cyclase, encoded by cya, converts ATP to cAMP. This process is
regulated in such a way that when glucose levels decline, cAMP concentration increases. cAMP acts to
bind a transcriptional regulatory protein called catabolite activator protein (CAP). The binding of
cAMP to CAP permits binding of the cAMP-CAP complex to its target DNA sequences. CAP then
interacts with the α subunit of RNA polymerase, facilitating the binding of polymerase to the promoter
and activating transcription of the operon structural genes. In the absence of adenylate cyclase,
encoded by cya, cAMP levels will not rise; CAP will be unable to bind and promote transcription of lac
operon structural genes, including that of lacY. Roman numeral I is then true, as is Roman numeral II.
However, activation of the lac operon will be reduced rather than enhanced as a consequence of these
changes. Roman numeral III is thus false. This is choice C.
8. C.
The repressor in the trp operon is constitutively expressed at a low level by a gene upstream of the
promoter. Choice C, then, is correct and choice B is incorrect. The trp operon is a classic example of a
repressible (i.e. default on) system (eliminating choice A). In a repressible system, binding by an
effector molecule to the repressor protein gives rise to a conformational change that greatly increases
the repressor’s affinity for the operator, thereby causing transcription of the operon to cease. For this
reason, the repressor is in its inactive state unless bound by its co-repressor, tryptophan. Choice D is
therefore incorrect.
9. D.
The core RNA polymerase enzyme does not bind specifically to the DNA sequences that direct the
incoming polymerase enzyme to its correct position relative to the start site of transcription. The σ
subunit, which is weakly bound to the core enzyme, serves this purpose and is needed for the correct
identification of the transcriptional start site. This process is independent of the function of the core
enzyme, which includes the β subunit. Choice D is the correct answer. During elongation, the β and β’
subunits form a structure that maintains the association between RNA polymerase and the DNA
template, while also forming a channel through which the template DNA strand passes and in which
the polymerase active site can be found. This eliminates choices A and B. During Rho-independent
termination, interaction of the hairpin terminator with the transcription factor NusA is mediated by
the polymerase β subunit; this interaction disrupts the association between RNA polymerase and the
DNA template at the β subunit, causing the termination of transcription. Choice C can be eliminated.
10. C.
Attenuation is responsive to changes in the concentration of charged tRNA, specifically that of
tRNATrp. This differs from the action of the repressor protein in the mechanism of repressible
expression, which responds to changes in intracellular tryptophan concentration. Choice C is correct.
Attenuation does not alter the initiation of operon transcription; it affects the process of transcription
once it has begun. Choice A is incorrect. Formation of the stem-loop transcription termination signal
by the association of domains 3 and 4 inhibits transcription of the trp operon, and occurs when
tryptophan concentration is high and the enzymes required for tryptophan synthesis aren’t needed by
the cell. Choices B and D are incorrect.
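The attenuation decision described in this solution can be written as a two-branch sketch. The function name and the returned dictionary keys are assumptions made for illustration. The domain 2-3 pairing is the alternative, non-terminating structure that forms when charged tRNATrp is scarce and the ribosome stalls in the leader.

```python
def trp_attenuation(charged_trna_trp_abundant):
    """Sketch of which leader stem-loop forms during attenuation."""
    if charged_trna_trp_abundant:
        # Ribosome reads through the Trp codons of the leader, so
        # domains 3 and 4 pair and form the terminator stem-loop.
        return {"stem_loop": "3-4", "transcription_continues": False}
    # Ribosome stalls at the Trp codons; domains 2 and 3 pair instead,
    # so the 3-4 terminator cannot form.
    return {"stem_loop": "2-3", "transcription_continues": True}


print(trp_attenuation(True))
print(trp_attenuation(False))
```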
CHAPTER 36
Eukaryotic Chromosome Organization and Control of Gene
Expression
A. INTRODUCTION
The basic mechanism of transcription is common to prokaryotes and eukaryotes. However, the
specific processes in eukaryotes are considerably more complex than those in bacteria and other
prokaryotes. This can be seen in three important differences between the prokaryotic and eukaryotic
transcriptional machinery:
• In bacteria, all genes are transcribed by a single RNA polymerase; in eukaryotes, more than one
type of RNA polymerase is responsible for the transcription of genes.
• Eukaryotic RNA polymerases require additional proteins to initiate and regulate transcription.
• Transcription in prokaryotes occurs on free DNA. In eukaryotes, transcription occurs on
chromatin, and regulation of chromatin structure strongly influences the transcriptional activity
carried out on eukaryotic genes.
The increased sophistication of regulatory options available in the eukaryotic cell likely evolved to
facilitate the complexity of directing the activity of the many different cell types of multicellular
organisms.
B. EUKARYOTIC RNA POLYMERASES
Eukaryotic cells contain three distinct nuclear RNA polymerases that transcribe different types of
genes. RNA polymerase I transcribes only the three largest species of ribosomal RNA (rRNA), which
are designated 28S, 18S and 5.8S, according to their rates of sedimentation during centrifugation.
RNA polymerase II transcribes protein-coding genes, producing messenger RNA (mRNA), as well as
microRNAs, which are regulators of both transcription and
translation in eukaryotic cells. Some of the small RNAs involved in splicing and protein transport,
abbreviated as snRNA and scRNA, respectively, are transcribed by RNA polymerase II, while others
are transcribed by RNA polymerase III. Additionally, RNA polymerase III transcribes genes encoding
tRNA, as well as the smallest species of rRNA, 5S rRNA. Separate RNA polymerases, which structurally
resemble bacterial RNA polymerases, transcribe the DNA of chloroplasts and
mitochondria. The classes of genes transcribed by eukaryotic RNA polymerases are summarized in
Table 36.1.
RNA synthesized            RNA polymerase
mRNA                       II
tRNA                       III
5.8S, 18S, 28S rRNA        I
5S rRNA                    III
snRNA, scRNA               II and III
Mitochondrial genes        Mitochondrial
Chloroplast genes          Chloroplast
Table 36.1: Genes transcribed by RNA polymerases
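For review purposes, the nuclear rows of Table 36.1 can be held in a simple lookup. This is a minimal sketch: the dictionary and helper names are hypothetical, and the organellar polymerases are left out.

```python
# Nuclear rows of Table 36.1: RNA class -> polymerase(s) that transcribe it.
POLYMERASE_FOR = {
    "mRNA": ["II"],
    "tRNA": ["III"],
    "5.8S rRNA": ["I"],
    "18S rRNA": ["I"],
    "28S rRNA": ["I"],
    "5S rRNA": ["III"],
    "snRNA": ["II", "III"],
    "scRNA": ["II", "III"],
}


def polymerases_for(rna_class):
    """Return the polymerase(s) that transcribe the given RNA class."""
    return POLYMERASE_FOR[rna_class]


def classes_transcribed_by(polymerase):
    """Return every RNA class transcribed by the given polymerase."""
    return sorted(rna for rna, pols in POLYMERASE_FOR.items() if polymerase in pols)


print(polymerases_for("5S rRNA"))
print(classes_transcribed_by("I"))
```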
All three eukaryotic RNA polymerases contain nine highly conserved subunits, five of which are
related to the core subunits of bacterial RNA polymerase. The structural homology of the core
catalytic subunits of eukaryotic and bacterial RNA polymerases suggests that all RNA polymerases
utilize similarly conserved mechanisms of transcription.
C. DNA BINDING PROTEINS AND GENERAL TRANSCRIPTION FACTORS
Specific proteins, called transcription factors, are required for RNA polymerase II to initiate
transcription. General transcription factors are involved in transcription from all polymerase II
promoters. For this reason, they represent a basic component of eukaryotic transcription. Additional
gene-specific transcription factors, discussed later in this chapter, bind to DNA sequences that control
expression of individual genes and are thus responsible for the regulation of gene expression. The
promoters of many genes transcribed by polymerase II contain a sequence similar to TATAA located
25 to 30 nucleotides upstream of the transcription start site. This sequence, referred to as the TATA
box, resembles the -10 sequence of bacterial promoters. The promoters of many genes transcribed by
RNA polymerase II also contain a second sequence element, called the initiator (Inr) sequence, which
spans the transcription start site. While many promoters bound by RNA polymerase II contain both of
these elements, some contain only a TATA box and others contain only an Inr element. A large
number of those promoters that lack a TATA box but contain an Inr element also contain an
additional downstream promoter element (DPE) approximately 30 base pairs downstream of the
transcription start site; this functions cooperatively with the Inr sequence.
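The positional rule for the TATA box can be illustrated with a small scan. This sketch makes simplifying assumptions: the function name is hypothetical, the sequence is the coding strand read 5' to 3', and only an exact TATAA match is accepted, whereas real promoters contain sequences merely similar to TATAA.

```python
def find_tata_box(seq, tss):
    """Return the position of a TATAA match relative to the start site
    (e.g. -28), or None if none begins 25 to 30 nt upstream.

    seq: DNA string; tss: 0-based index of the transcription start site.
    """
    seq = seq.upper()
    for offset in range(25, 31):
        start = tss - offset
        if start >= 0 and seq[start:start + 5] == "TATAA":
            return -offset
    return None


# Made-up promoter: TATAA begins 28 nt upstream of the start site at index 48.
promoter = "G" * 20 + "TATAA" + "G" * 23 + "ACGT"
print(find_tata_box(promoter, 48))
```

A promoter for which this returns None is not necessarily inactive, since some promoters rely on an Inr element, often together with a DPE, instead of a TATA box.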
I. General Transcription Factors and Transcriptional Initiation
The first step in the formation of a transcription complex is the binding of a general transcription
factor called TFIID to the promoter. TFIID is itself composed of multiple subunits, including the
TATA-binding protein (TBP) and at least 14 other polypeptides, called TBP-associated factors (TAFs).
TBP binds specifically to the TATA box while other TAF subunits of TFIID appear to bind the Inr
and DPE sequences. The binding of TFIID is followed by the binding of a second general
transcription factor, TFIIB, which binds both to TBP and to a DNA sequence upstream of
the TATA box, known as the B recognition element (BRE). TFIIB in turn serves as a bridge to RNA
polymerase II, which binds to the TBP-TFIIB complex in association with a third factor, TFIIF.
Following recruitment of RNA polymerase II to the promoter, the binding of two additional factors
(TFIIE and TFIIH) completes formation of the initiation complex. TFIIH is a multisubunit factor that
plays at least two roles. First, XPB and XPD (two subunits of TFIIH which are also required for
nucleotide excision repair) act as helicases, unwinding DNA around the initiation site. Another subunit
of TFIIH is a protein kinase that phosphorylates tandem repeated sequences present in the C-terminal
domain (CTD) of the largest subunit of RNA polymerase II. Phosphorylation of these amino acids
releases the polymerase from its association with the preinitiation complex. Phosphorylation is further
responsible for recruitment of other proteins that allow the polymerase to initiate synthesis. The
sequential recruitment of RNA polymerase II and these five general transcription factors (TFIID, TFIIB, TFIIF, TFIIE, and
TFIIH), shown assembled as the pre-initiation complex in Figure 36.1,
represents the minimal requirement for transcription to begin in vitro.
Figure 36.1: The transcriptional pre-initiation complex (PIC) assembled along a
template DNA strand
Within the cell, additional factors are required, including a large, multi-subunit protein complex called
Mediator, which stimulates basal transcription and plays a role in linking the general transcription
factors to the gene-specific transcription factors that regulate transcription.
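The stepwise assembly described in this section can be sketched as an ordered list with a helper that reports the next component to bind. The names below are illustrative assumptions. The text states only that TFIIE and TFIIH bind after the polymerase, so placing TFIIE before TFIIH follows the order in which the factors are listed and should be read as an assumption.

```python
# Order of recruitment to a polymerase II promoter, as described above.
ASSEMBLY_ORDER = ["TFIID", "TFIIB", "Pol II + TFIIF", "TFIIE", "TFIIH"]


def next_component(bound):
    """Return the next component to bind, or None when assembly is complete.

    bound: list of components already at the promoter, in binding order.
    """
    if bound != ASSEMBLY_ORDER[:len(bound)]:
        raise ValueError("components are not bound in the expected order")
    if len(bound) == len(ASSEMBLY_ORDER):
        return None
    return ASSEMBLY_ORDER[len(bound)]


print(next_component([]))
print(next_component(["TFIID", "TFIIB"]))
```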
II. Transcription by RNA Polymerases I and III
RNA polymerase I transcribes only the genes that encode rRNA. Transcription of these genes yields a
large 45S pre-rRNA, which is then processed to yield the 28S, 18S, and 5.8S rRNAs. The promoters of
ribosomal RNA genes are recognized by two transcription factors, upstream binding factor (UBF) and
selectivity factor 1 (SL1). The SL1 transcription factor is composed of four subunits, one of which is
TBP. Since the promoters of rRNA genes do not contain a TATA box, TBP does not bind specific promoter
sequences. Instead, the association of TBP with rRNA genes is mediated by the binding of other proteins in
the SL1 complex to the promoter. This is roughly analogous to the association of TBP with the Inr sequences of
polymerase II genes that lack TATA boxes. Assembly of the RNA polymerase I pre-initiation complex
is shown in Figure 36.2.
Figure 36.2: Assembly of the RNA polymerase I (Pol I) pre-initiation complex
(PIC) involves the synergistic action of upstream binding factor (UBF) and
promoter selectivity factor SL1, which consists of the TATA-binding protein
(TBP) and three TBP-associated factors (TAFs). The interaction of transcription
initiation factor (TIFIA) and SL1 is essential for recruitment of Pol I.
The genes for tRNAs, 5S rRNA, and some other small RNAs involved in splicing and protein transport
are transcribed by RNA polymerase III. These genes are transcribed from three distinct classes of
promoters, two of which lie within, rather than upstream of, the transcribed sequence. TFIIIA initiates
assembly of a transcription complex by binding to specific DNA sequences in the 5S rRNA promoter.
This binding is followed by the sequential binding of TFIIIC, TFIIIB, and polymerase III. This is
shown in Figure 36.3.
Figure 36.3: Most RNA polymerase III-transcribed genes have internal promoters
within the transcribed region, which are recognized by the large, five-subunit
factor TFIIIC. In turn, TFIIIC recruits TFIIIB, which is composed of TBP and the
TAFs BRF1 and BDP1. TFIIIB recruits Pol III and assists it in initiating
transcription.
The promoters of the tRNA genes differ from those of 5S rRNA in that they do not contain the DNA
sequences recognized by TFIIIA. Instead, TFIIIC binds directly to the promoter of tRNA genes,
serving to recruit TFIIIB and polymerase to form a transcription complex. Promoters of the third class
of genes transcribed by polymerase III, including genes encoding some of the snRNAs involved in
splicing, are located upstream of the transcription start site. These promoters contain a TATA box
(like those of polymerase II genes) as well as a binding site for another factor called SNAP. SNAP and
TFIIIB bind cooperatively to these promoters, with TFIIIB binding directly to the TATA box. This is
mediated by TBP, a subunit of TFIIIB, which then recruits the polymerase to the transcription
complex, as shown in Figure 36.4.
Figure 36.4: The promoter of the U6 snRNA gene is located upstream of the
transcription start site. It contains a TATA box, which is recognized by the
TATA-binding protein (TBP) subunit of TFIIIB in cooperation with another factor called
SNAP.
D. TRANSCRIPTIONAL REGULATION
The expression of eukaryotic genes is controlled primarily at the level of transcriptional initiation,
although it can also be regulated during elongation. As in bacteria, transcription in eukaryotes is
controlled by proteins that bind to specific regulatory sequences and modulate the activity of RNA
polymerase. An important difference between transcriptional regulation in eukaryotes and prokaryotes
results from the packaging of eukaryotic DNA into chromatin, limiting its availability to the
transcriptional machinery of the cell. As a result, modification of chromatin structure plays a central
role in the control of transcription in eukaryotic cells. Furthermore, a large and evolving body of
research indicates that noncoding RNAs, as well as proteins, regulate transcription in eukaryotic cells
via modifications in chromatin structure.
I. cis-Acting Regulatory Elements: Promoters and Enhancers
As discussed in Chapter 35, transcription in bacteria is regulated by the binding of proteins to
cis-acting sequences, such as the lac operator, that control the transcription of adjacent genes. Similar
cis-acting sequences regulate expression in eukaryotes. Genes transcribed by RNA polymerase II have
core promoter elements, including the TATA box and the Inr sequence, that serve as specific binding
sites for general transcription factors. Other cis-acting sequences serve as binding sites for a wide
variety of regulatory factors that control the expression of individual genes. These cis-acting sequences
are frequently (although not always) located upstream of the TATA box. For example, two regulatory
elements that are found in many eukaryotic genes were first identified by studies of the promoter
region of the herpes simplex virus gene that encodes thymidine kinase. Both of these sequences are
located within 100 base pairs upstream of the TATA box. Their consensus sequences are CCAAT and
GGGCGG (called a GC box). Specific proteins that bind to these sequences and stimulate
transcription have since been identified.
In contrast to the relatively simple organization of CCAAT and GC boxes, many genes in
mammalian cells are controlled by regulatory sequences located farther from the transcription start
site. Called enhancers, these sequences were identified during studies of the promoter of another virus,
SV40. In addition to a TATA box and a set of GC boxes, two 72-base-pair repeats located farther
upstream are required for efficient transcription from this promoter. Enhancers, like promoters,
function by binding transcription factors that then regulate RNA polymerase. This is possible due to
DNA looping, which allows a transcription factor bound to a distant enhancer to interact with proteins
associated with the RNA polymerase/Mediator complex at the promoter. This is depicted in Figure
36.5.
Figure 36.5: Transcription factors bound at distant enhancers are able to interact with the
RNA polymerase II/Mediator complex at the promoter. Because the intervening DNA
can form loops, no fundamental difference exists between the action of transcription
factors bound to DNA just upstream of the gene in the promoter and that of distant
enhancers.
Transcription factors bound to distant enhancers can thus function by the same mechanism as those
bound adjacent to promoters; there is no fundamental difference between the actions of
enhancers and those of cis-acting regulatory sequences adjacent to transcription start sites.
Enhancers can function not only over long distances, but sometimes even from different chromosomes.
This process, termed transvection, is most likely to show up on the MCAT in the context of a passage
regarding the model system in which it was first elucidated, Drosophila. Transvection is so named
because it involves trans-acting enhancers from one gene regulating the expression of the gene’s
homolog on a separate chromosome.
The binding of specific transcriptional regulatory proteins to enhancers is responsible for the
control of gene expression during development and differentiation. An example of a well-studied
enhancer is that which controls the transcription of immunoglobulins in B lymphocytes. Gene transfer
experiments have demonstrated that the immunoglobulin enhancer is active in lymphocytes, but not
in other cell types. Thus, this regulatory sequence is at least partly responsible for the tissue-specific
expression of the immunoglobulin genes in the appropriate cell type.
Importantly, enhancers usually contain multiple functional sequence elements that bind
different transcriptional regulatory proteins. These proteins work together to regulate gene expression.
Returning to the previous example, the immunoglobulin heavy-chain enhancer spans more than 200
base pairs and contains at least nine distinct sequence elements that serve as protein-binding sites. The
mutation of any of these sequences reduces, but does not abolish, enhancer activity, indicating that the
functions of individual proteins that bind to the enhancer are at least partially redundant. In non-lymphoid cells, many of the individual sequence elements of the immunoglobulin enhancer are able to
stimulate transcription by themselves. The restricted activity of the intact enhancer in B lymphocytes
therefore does not arise from tissue-specific functions of each of its components. Instead, tissue-specific
expression results from the combination of the individual sequence elements that make up the
complete enhancer. These elements include some cis-acting regulatory sequences that bind
transcriptional activators that are expressed specifically in B lymphocytes, as well as other regulatory
sequences that bind repressors in non-lymphoid cells. Accordingly, the immunoglobulin enhancer
contains negative regulatory elements that inhibit transcription in inappropriate cell types, as well as
positive regulatory elements that activate transcription in B lymphocytes. The overall activity of the
enhancer is greater than the sum of its parts, reflecting the combined action of the proteins associated
with each of the individual sequence elements.
Although DNA looping allows enhancers to act over a considerable distance from the
promoters, the activity of any given enhancer is specific for the promoter of its appropriate target gene.
This specificity is maintained in part by insulators or barrier elements, which divide chromosomes into
independent domains and prevent enhancers from acting on promoters located in an adjacent
domain. Insulators also prevent the chromatin structure of one domain from expanding into the
region occupied by a neighboring chromatin structure, thereby maintaining independently regulated
regions of the genome. It is thought that insulators function by organizing independent
domains of chromatin within the nucleus, but their mechanism of action remains a subject of ongoing
research. One potential application of insulator elements relates to gene therapy, where a major
hurdle is preventing the aberrant regulation or inactivation of an introduced gene by the nearby
chromatin structure. The addition of an insulator element is a potential solution to this problem.
II. Transcription Factor Binding Sites
The binding sites of transcriptional regulatory proteins in promoter or enhancer sequences have
commonly been identified by two types of experiments. The first, footprinting, was originally
developed to characterize the binding of RNA polymerase to prokaryotic promoters. In experiments of
this type, shown in Figure 36.6, a DNA fragment is radiolabeled at one end. The labeled DNA is
incubated with the protein of interest (e.g., RNA polymerase) and then subjected to partial digestion
with DNase.
Figure 36.6: DNA footprinting. 1. A sample containing fragments of DNA is radiolabeled
at one end. 2. The sample is then divided in two. Half is incubated with a protein that
binds to specific sequences of DNA within the fragment, while the remainder is
unchanged. 3. Both samples are then digested with DNase, under conditions such that
the DNase introduces an average of one cut per molecule. The region of DNA bound to
the protein is protected from DNase digestion. The DNA-protein complexes are then
denatured, and the sizes of the radiolabeled DNA fragments produced by DNase
digestion are analyzed by electrophoresis. Fragments of DNA resulting from DNase
cleavage within the region protected by protein binding are missing from the sample of
DNA that was incubated with protein.
The principle of the method is that the regions of DNA to which the protein binds are protected from
DNase digestion. These regions can therefore be identified by comparison of the digestion products of
the protein-bound DNA with those resulting from identical DNase treatment of a parallel sample of
DNA that was not incubated with proteins. Variations of this method, which employ chemical
reagents that modify and cleave DNA at particular nucleotides, can be used to identify the specific
DNA bases that are in contact with protein.
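The logic of a footprinting experiment can be reproduced with a toy calculation. The sketch below assumes an idealized digest in which every unprotected position is cut in some molecule and each molecule is cut once. The function name and the 20-nucleotide fragment are made up for illustration.

```python
def footprint(length, protected):
    """Return labeled-fragment lengths expected without and with protein.

    length: size of the end-labeled DNA fragment in nucleotides.
    protected: (start, end) positions, 1-based inclusive, covered by protein.
    A single DNase cut after position n leaves a labeled fragment of length n.
    """
    start, end = protected
    free = list(range(1, length))                        # all cut sites available
    bound = [n for n in free if not start <= n <= end]   # protected sites are not cut
    return free, bound


free, bound = footprint(20, (8, 12))
# Bands present in the protein-free lane but missing from the protein lane
# mark the binding site (the "footprint").
missing = sorted(set(free) - set(bound))
print(missing)
```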
A second approach involves performing an electrophoretic-mobility shift assay in which a
radiolabeled DNA fragment is incubated with a protein preparation and then subjected to
electrophoresis through a nondenaturing gel. Protein binding is indicated by a decrease in the
electrophoretic mobility of the DNA fragment, since the bound protein slows its migration through the
gel. The combined use of footprinting and electrophoretic mobility shifts has led to the correlation of
protein-binding sites with the regulatory elements of enhancers and promoters, indicating that these
sequences generally constitute the recognition sites of specific DNA-binding proteins.
The binding sites of most transcription factors consist of short, degenerate DNA sequences,
meaning that the transcription factor will bind not only to the consensus sequence but also to
sequences that differ from the consensus at one or more positions. Because of their short, degenerate
nature, sequences matching transcription factor binding sites occur frequently in genomic DNA, so physiologically
significant regulatory sequences cannot be identified from DNA sequence alone. Such identification
remains one of the primary challenges in molecular biology. One experimental approach is that of
chromatin immunoprecipitation (ChIP). Cells are first treated with formaldehyde, which cross-links
proteins to DNA. As a result, transcription factors are covalently linked to the DNA sequences to
which they were bound within the living cell. Chromatin is then extracted and sheared to fragments of
about 500 base pairs. Fragments of DNA linked to a transcription factor of interest can then be
isolated by immunoprecipitation with an antibody against the transcription factor. The formaldehyde
crosslinks are then reversed, and the immunoprecipitated DNA is isolated and analyzed to determine
the sites to which the specific transcription factor was bound within the cell. A ChIP sequencing
protocol is shown in Figure 36.7.
Figure 36.7: Chromatin immunoprecipitation (ChIP). 1. Sample cells are treated with a
reversible chromatin cross-linking agent. 2. DNA strands are sheared by sonication. 3.
Bead-attached antibodies are added to immunoprecipitate target transcription factor
protein. 4. DNA is unlinked, yielding purified DNA. 5. Purified DNA extract is
sequenced.
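The difficulty of finding functional binding sites from sequence alone, which motivates ChIP, can be made concrete with a short sketch. It counts matches to a consensus while tolerating mismatches and estimates how often an exact match would arise by chance. The function names are hypothetical, the GC box consensus GGGCGG is taken from the earlier discussion, and the chance estimate assumes equal base frequencies on a single strand.

```python
def count_sites(seq, consensus, max_mismatches=1):
    """Count windows of seq that match consensus with few enough mismatches."""
    seq = seq.upper()
    k = len(consensus)
    hits = 0
    for i in range(len(seq) - k + 1):
        mismatches = sum(a != b for a, b in zip(seq[i:i + k], consensus))
        if mismatches <= max_mismatches:
            hits += 1
    return hits


def expected_exact_matches(site_length, genome_length):
    """Chance matches to one exact site, assuming equal base frequencies."""
    return genome_length / 4 ** site_length


print(count_sites("AAGGGCGGAA", "GGGCGG", max_mismatches=0))
print(count_sites("GGGAGG", "GGGCGG", max_mismatches=1))
# Assumed genome length of 3 billion base pairs, for illustration only.
print(expected_exact_matches(6, 3_000_000_000))
```

Because a 6-base-pair site is expected once every 4096 positions by chance, and degenerate matching only increases that frequency, most sequence matches in a large genome cannot be assumed to be functional.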
III. Transcriptional Regulatory Proteins
A wide range of transcriptional regulatory proteins have been isolated and identified based on their
specific binding affinity for select DNA sequences. Because transcription factors are central to the
regulation of gene expression in eukaryotes, they remain a major area of ongoing research in cellular
and molecular biology, and a frequently tested molecular genetics topic on the MCAT. Of these
proteins, the best understood are transcriptional activators, which bind to regulatory DNA sequences
and stimulate transcription. In general, these factors consist of two independent domains; one region
specifically binds to DNA, while the other stimulates transcription by interacting with other proteins,
including Mediator or other components of a cell’s transcriptional machinery. The basic function of
the DNA-binding domain is to anchor the transcription factor to the proper DNA site. The activation
domain then independently stimulates transcription through protein-protein interaction.
IV. Transcription Factors
More than 2500 transcription factors encoded by the human genome have thus far been recognized.
They contain a diversity of distinct DNA-binding domains, the most common of which is the zinc
finger domain. This type of domain consists of repeating cysteine and histidine residues that bind zinc
ions and fold into DNA-binding loop structures referred to as “fingers,” the structure of which is
shown in Figure 36.8.
Figure 36.8: Structural representation of the Cys2His2 zinc finger motif, consisting
of an α helix and an antiparallel β sheet. The zinc ion (center) is coordinated by
two histidine residues and two cysteine residues.
These domains were initially identified in the polymerase III transcription factor TFIIIA, but they are
also common among transcription factors that regulate polymerase II promoters, including Sp1, one
of the earliest identified and most common eukaryotic transcription factors. Other examples of
transcription factors that contain zinc finger domains are the steroid hormone receptors, which
regulate gene transcription in response to hormones such as estrogen and testosterone.
Artificial transcription factors with zinc-finger domains designed to bind specific sequences
within the genome have been developed. By ligating different effector domains to the DNA binding
domain, the target gene can be either activated or repressed. This technology can be applied to gene
therapy and the development of transgenic plants and animals of commercial interest.
The helix-turn-helix motif (Figure 36.9) was first recognized in prokaryotic DNA-binding
proteins, including the E. coli catabolite activator protein (CAP).
Figure 36.9: Basic helix-turn-helix structural motif. Two α-helices are connected by a
short loop.
In these proteins, one of the helices makes most of the contact with DNA, while the other lies across
the complex in order to stabilize the interaction. In eukaryotes, helix-turn-helix proteins include the
homeodomain proteins, which play essential roles in the regulation of gene expression during
embryonic development. Genes encoding these proteins were first identified in developmental mutants
of Drosophila. Some of the earliest recognized Drosophila mutants resulted in the development of flies in
which one body part was transformed into another. For example, in one homeotic mutant, legs rather
than antennae grow from the head of the fly. Genetic analysis of these mutants has shown that Drosophila
contains nine homeotic genes, each of which specifies a different body segment. Molecular cloning and analysis
of these genes indicate that they contain conserved sequences of 180 base pairs, called homeoboxes,
which encode DNA-binding domains (homeodomains) of transcription factors. A wide variety of
similar homeodomain proteins have since been identified in fungi, plants and other animals, including
humans.
Two other families of DNA-binding proteins – the leucine zipper and helix-loop-helix
proteins – contain DNA-binding domains formed by the dimerization of two polypeptide chains. The
leucine zipper contains four or five leucine residues spaced at intervals of seven amino acids, resulting
in their hydrophobic side chains being exposed at one side of a helical region. This region serves as the
dimerization domain for the two protein subunits, which are held together by hydrophobic
interactions between the leucine side chains. Immediately following the leucine zipper is a region rich
in positively charged lysine and arginine residues that bind DNA. This interaction is depicted in Figure
36.10.
Figure 36.10: Leucine zipper dimer bound to DNA fragment
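The heptad spacing described above can be checked mechanically. The following is a minimal sketch, not a real motif predictor: it only looks for leucines (one-letter code L) at every seventh position, using the "four or five" repeats from the text as a threshold. The function name and the toy sequences are hypothetical, and the sketch ignores whether the region is actually helical.

```python
def has_leucine_zipper(seq, min_repeats=4):
    """Return True if seq contains leucines (L) spaced exactly 7 residues
    apart at least min_repeats times in a row."""
    for start in range(len(seq)):
        count = 0
        pos = start
        # Walk along the sequence in steps of seven while we keep hitting leucine.
        while pos < len(seq) and seq[pos] == "L":
            count += 1
            pos += 7
        if count >= min_repeats:
            return True
    return False


# Toy sequences (hypothetical): leucine followed by six alanines, repeated.
four_repeats = ("L" + "A" * 6) * 3 + "L"   # leucines at positions 0, 7, 14, 21
three_repeats = ("L" + "A" * 6) * 3        # leucines at positions 0, 7, 14 only

print(has_leucine_zipper(four_repeats))    # True
print(has_leucine_zipper(three_repeats))   # False
```

Because an α helix has about 3.6 residues per turn, residues seven positions apart fall on roughly the same face of the helix, which is why this spacing lines the leucine side chains up along one side.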
The helix-loop-helix proteins are similar in structure, but differ in the exact structure of their
dimerization domains, formed by two helical regions separated by a loop. An important feature of
both leucine zipper and helix-loop-helix transcription factors is that different members of each family
can dimerize with one another. Thus, the combination of distinct protein subunits can form an
expanded array of factors that can differ in both DNA sequence recognition and transcription-
stimulating activity. This formation of dimers between different family members is a critical aspect of
their self-regulation. Both families of DNA-binding proteins play a central role in regulating tissue-specific and inducible gene expression.
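The way dimerization expands the repertoire of factors can be illustrated with a simple count. This sketch assumes, purely for illustration, that every member of a family can pair with every other member and with itself; real families are more restricted. The subunit names are placeholders.

```python
from itertools import combinations_with_replacement


def possible_dimers(subunits):
    """Return every unordered pair of subunits, homodimers included."""
    return list(combinations_with_replacement(subunits, 2))


# Hypothetical family of four subunits.
family = ["A", "B", "C", "D"]
dimers = possible_dimers(family)
print(len(dimers))  # 10: four homodimers plus six heterodimers
```

Under this assumption a family of n subunits yields n(n + 1)/2 distinct dimers, so a handful of genes can specify a much larger set of regulatory factors.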
The activation domains of transcription factors are not as well characterized as their DNA-binding domains. Some, called acidic activation domains, are rich in negatively charged residues such
as aspartate and glutamate; others are rich in proline or glutamine. The activation domains appear to
stimulate transcription by two distinct mechanisms. First, they interact with Mediator proteins and
general transcription factors, such as TFIIB or TFIID, to recruit RNA polymerase; in doing so, they
facilitate the assembly of a transcription complex on the promoter, much as transcriptional
activators do in bacteria. In addition, eukaryotic transcription factors interact with a
variety of coactivators that stimulate transcription by modifying chromatin structure, as discussed later
in this chapter.
Figure 36.11: Eukaryotic activators stimulate transcription by exerting two downstream
effects: 1) they interact with Mediator proteins and general transcription factors to
facilitate the assembly of a transcription complex and stimulate transcription, and 2) they
interact with coactivators that facilitate transcription by modifying chromatin structure.
V. Eukaryotic Repressors
Eukaryotic gene expression is regulated not just by transcriptional activators, but also by
repressors. Like their prokaryotic counterparts, eukaryotic repressors bind to specific DNA sequences
and inhibit transcription. In some cases, eukaryotic repressors act by interfering with the binding of
other transcription factors to DNA. For example, the binding of a repressor near the transcription start
site can block the interaction of RNA polymerase or general transcription factors with the promoter,
an action similar to that of bacterial repressors. Other repressors compete with activators for binding
to specific regulatory sequences. Some of these repressors contain the same DNA-binding domain as
the activator, but lack its activation domain. As a result, their binding to a promoter or enhancer
blocks the binding of the activator, inhibiting transcription. This is shown in Figure 36.12.
Figure 36.12: Some repressors block the binding site of activators to regulatory
sequences.
In contrast to repressors that simply interfere with activator binding, “active” repressors
contain specific functional domains that inhibit transcription via protein-protein interaction. Many
active repressors have been shown to play key roles in the regulation of transcription related to
proteins that control cellular growth and differentiation. As with transcriptional activators, several
distinct types of repressor domains have been identified. For example, the repressor domain of one of
the first eukaryotic repressor proteins to be identified, Krüppel, is rich in alanine residues, while other
repression domains are rich in proline or acidic residues. The functional targets of repressors are
equally diverse. Repressors can inhibit transcription by interacting with specific activator proteins,
with Mediator proteins, with general transcription factors, or with corepressors that act by modifying
chromatin structure. An example of one such mode of repression is seen in Figure 36.13.
Figure 36.13: Some repressors have active repression domains that inhibit transcription
by interactions with Mediator proteins or general transcription factors, as well as with corepressors that act to modify chromatin structure.
The regulation of transcription by both repressors and activators considerably extends the
range of mechanisms that control the expression of eukaryotic genes. One important role of repressors
may be to inhibit the expression of tissue-specific genes in inappropriate cell types. For example, as
noted earlier, a repressor binding site in the immunoglobulin enhancer is thought to contribute to its
tissue-specific expression by suppressing transcription in non-lymphoid cell types. Other repressors
play important roles in the control of cell proliferation and differentiation in response to signaling by
hormones and growth factors.
E. REGULATION OF CHROMATIN STRUCTURE
As referenced in the preceding discussion, both activators and repressors regulate transcription in
eukaryotes not only by interacting with Mediator and other components of the transcriptional
machinery, but also by inducing changes in the structure of chromatin. Rather than being present
within the nucleus as naked genetic material, the DNA of all eukaryotic cells is tightly bound to
histones. The basic structural unit of chromatin is the nucleosome, each of which consists of 147 base
pairs of DNA wrapped around the core histones H2A, H2B, H3, and H4. The core histones are
present in the nucleosome as an octamer containing four dimers, with one molecule of histone H1
bound to the DNA as it enters the nucleosome core particle. The organization of the nucleosome is
shown in Figure 36.14 and the formation of a histone core octamer is shown in Figure 36.15.
Figure 36.14: Nucleosome organization
Figure 36.15: Formation of the histone octamer
The chromatin is further condensed by being coiled into higher-order structures organized into large
loops of DNA. This packing of eukaryotic DNA in chromatin has significant consequences in terms of
the packaged DNA’s availability as a template for transcription; control over the state in which cellular
DNA is maintained represents an important regulatory tool in eukaryotic cells.
Actively transcribed genes are found in relatively de-condensed chromatin, roughly
corresponding to 30-nm fibers. Nonetheless, actively transcribed genes remain bound to histones and
packaged in nucleosomes, a relatively inaccessible state that presents transcription factors and RNA
polymerase with the problem of interacting with DNA in a nucleosome structure. The tight winding of
DNA around the nucleosome core particle is a major obstacle to transcription, affecting both the
ability of transcription factors to bind DNA and the ability of RNA polymerase to transcribe through
the complex spatial arrangement of a chromatin template.
I. Histone Modification
Several modifications are characteristic of transcriptionally active chromatin, including histone
modifications, nucleosome rearrangements, and the association of two non-histone chromosomal
proteins, called HMGN proteins, with the nucleosomes of actively transcribed genes. The binding sites
of the HMGN proteins on nucleosomes overlap the binding site of histone H1, and it appears that
these proteins stimulate transcription by affecting modifications of histone H1 to maintain a
decondensed chromatin structure.
Figure 36.16: The binding of epigenetic factors to histone “tails” alters the extent to
which DNA is wrapped around histones and the availability of genes in the DNA to be
activated.
Histone acetylation has been correlated with transcriptionally active chromatin in a wide
variety of cell types. The core histones have two domains: a histone fold domain, which is involved in
the interaction with other histones and in wrapping DNA around the nucleosome core particle, and an
amino-terminal domain, which extends outside the nucleosome. The amino-terminal domains are rich
in lysine and can be modified by acetylation at specific lysine residues. Histone acetyltransferases have
been associated with a number of mammalian transcriptional coactivators, as well as with general
transcription factor TFIID. Conversely, transcriptional corepressors in both yeast and mammalian
cells function as histone deacetylases, which remove acetyl groups from histone tails. Histone
acetylation is thus targeted directly by both transcriptional activators and repressors, indicating that it
plays a role in regulation of eukaryotic gene expression.
Histones are modified not only by acetylation, but also by phosphorylation of serine residues,
methylation of lysine and arginine residues, and addition of ubiquitin to lysine residues. Like
acetylation, these modifications occur at specific amino acid residues in the histone tails that are
associated with changes in transcriptional activity. Changes in gene expression brought about by
histone modification are likely a result of the creation of binding sites for other regulatory proteins.
According to this hypothesis, combinations of histone modifications constitute a “histone code” that
regulates gene expression by recruiting other regulatory proteins to the chromatin template. For
example, transcriptionally active chromatin is associated with several specific modifications of histone
H3; these include methylation of lysine-4, phosphorylation of serine-10, acetylation of lysine-9, -14, -18, and -23, and methylation of arginine-17 and -26. In contrast, methylation of H3 lysine-9 leads to
the suppression of target genes by the recruitment of corepressors. The methylated H3 lysine-9
residues have further been shown to serve as binding sites for proteins that induce chromatin
condensation, directly linking the histone modification to transcriptional repression and the formation
of heterochromatin. Additionally, these modifications of histone tails may also regulate one another,
leading to the establishment of distinct patterns of histone modification that correlate with stable
modification in transcriptional activity.
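The histone H3 marks listed above can be organized as a lookup table, which makes the idea of a "code" read by other proteins easier to see. This is only a study aid built from the modifications named in this section. The table name, the function, and the simple tally are illustrative assumptions, and real chromatin states depend on combinations and context rather than a count.

```python
# H3 marks named in the text, keyed by (residue, modification).
H3_MARKS = {
    ("K4", "methylation"): "active",
    ("S10", "phosphorylation"): "active",
    ("K9", "acetylation"): "active",
    ("K14", "acetylation"): "active",
    ("K18", "acetylation"): "active",
    ("K23", "acetylation"): "active",
    ("R17", "methylation"): "active",
    ("R26", "methylation"): "active",
    ("K9", "methylation"): "repressed",
}


def summarize(marks):
    """Tally marks by the transcriptional state they are associated with.
    Marks not listed in the table are counted as 'unknown'."""
    tally = {"active": 0, "repressed": 0, "unknown": 0}
    for mark in marks:
        tally[H3_MARKS.get(mark, "unknown")] += 1
    return tally


print(summarize([("K4", "methylation"), ("K9", "acetylation")]))
print(summarize([("K9", "methylation")]))
```

Note that lysine-9 appears twice with opposite associations: acetylated lysine-9 accompanies active chromatin, whereas methylated lysine-9 recruits corepressors.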
Figure 36.17: Histone modifications play fundamental roles in most biological processes
that are involved in the manipulation and expression of DNA. In multicellular organisms,
facultative heterochromatin regions contain genes that are differentially expressed
through development and/or differentiation and which then become silenced while
constitutive heterochromatin contains permanently silenced genes in genomic regions
such as the centromeres and telomeres. Euchromatin is a far more relaxed environment
containing active genes.
II. Nucleosome Remodeling Factors
In contrast to the enzymes that regulate chromatin structure by modifying histones, nucleosome
remodeling factors are protein complexes that alter the arrangement or structure of nucleosomes. One
of their mechanisms of action involves catalyzing the sliding movement of histone octamers along the
DNA molecules, thereby repositioning the nucleosomes to change the transcription factor accessibility
of specific DNA sequences. Alternatively, nucleosome remodeling factors may act by inducing changes
in the conformation of nucleosomes, again affecting the ability of specific DNA sequences to interact
with transcriptional regulatory proteins. Like histone modifying enzymes, nucleosome remodeling
factors can be recruited to DNA in association with either transcriptional activators or repressors, and
can alter the arrangement of nucleosomes to either stimulate or inhibit transcription.
III. Regulation of Transcriptional Elongation
The recruitment of histone modifying enzymes and nucleosome remodeling factors by transcriptional
activators stimulates the initiation of transcription by altering the chromatin structure of enhancer and
promoter regions. However, following initiation, RNA polymerase must still elongate the nascent
mRNA transcript through the structurally complex chromatin template. This is facilitated by
elongation factors that become associated with the phosphorylated C-terminal domain of RNA
polymerase II at the initiation of transcription. These elongation factors include histone modifying
enzymes (acetyltransferases and methyltransferases) as well as proteins that transiently disrupt the
structure of nucleosomes during transcription. Although it has been less thoroughly studied than
transcriptional initiation (and is therefore a less likely MCAT passage topic), transcriptional elongation
does present an additional level at which gene expression can be regulated in eukaryotic cells.
IV. DNA Methylation
The methylation of DNA is another general mechanism that controls transcription in eukaryotes.
Cytosine residues in the DNA of fungi, plants, and animals can be modified by the addition of methyl
groups at the 5-carbon position. DNA is methylated specifically at the cytosine residues that precede
guanines in the DNA chain, known as CpG dinucleotides. This methylation is correlated with
transcriptional repression. Methylation commonly occurs within transposable elements, where it
appears that methylation suppresses the movement of transposons throughout the genome. In
addition, DNA methylation is associated with transcriptional repression of some genes along with
alterations in chromatin structure. In plants, miRNAs direct DNA methylation as well as chromatin
modifications of repressed genes; it is unclear whether this also occurs in animals. However, it is
known that genes on the inactive X chromosomes in mammals become methylated following
transcriptional repression by Xist RNA. It appears true, then, that DNA methylation, as well as histone
modification, plays a role in X chromosome inactivation.
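Because methylation targets cytosines that immediately precede guanines, candidate sites can be located by scanning one strand for the dinucleotide CG read 5' to 3'. The sketch below does only that; the function name and the example sequence are hypothetical, and it says nothing about whether a given site is actually methylated.

```python
def cpg_sites(dna):
    """Return the 0-based position of every C that is immediately
    followed by G (a CpG dinucleotide) in a 5'-to-3' DNA string."""
    dna = dna.upper()
    return [i for i in range(len(dna) - 1) if dna[i:i + 2] == "CG"]


# Hypothetical sequence: CpG at positions 1 and 4; the "GC" at 6-7 does not count.
print(cpg_sites("ACGTCGGCA"))  # [1, 4]
```

The order matters: a G followed by a C (GpC) is not a methylation target in this scheme, which is why the scan checks for "CG" specifically.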
One important regulatory role of DNA methylation has been established in the phenomenon
known as genomic imprinting, which controls the expression of some genes involved in mammalian
embryonic development. In most cases, both the maternal and paternal alleles are expressed in diploid
cells. However, there are some imprinted genes which show varied expression depending on whether
they are inherited from the mother or from the father. In some cases, only the paternal allele of an
imprinted gene is expressed, and the maternal allele is transcriptionally inactive. For other imprinted
genes, the maternal allele is expressed and the paternal allele is inactive.
DNA methylation appears to play an important role in distinguishing between the paternal
and maternal alleles of imprinted genes. A good example is the gene H19, which is transcribed only
from the maternal copy. The H19 gene is specifically methylated during the development of male, but
not female, germ cells. The union of sperm and egg at fertilization therefore yields an embryo
containing a methylated paternal allele and an unmethylated maternal allele of the gene. Following
DNA replication, these differences are maintained by an enzyme that specifically methylates CpG
sequences of a daughter strand that is hydrogen-bonded to a methylated parental strand. As a result,
the paternal H19 allele remains methylated, staying transcriptionally inactive in both embryonic and
somatic tissues. The paternal H19 allele does become demethylated in the germ line, however,
allowing a new pattern of methylation to be established for transmission to the next generation.
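The logic of how a methylation pattern survives replication can be sketched as a toy model. Each strand is represented here by the set of CpG site numbers that are methylated on it; this representation, the function names, and the site numbers are all assumptions made for illustration.

```python
def replicate(duplex):
    """Semiconservative replication: each daughter duplex keeps one parental
    strand (with its methyl groups) and gains a new, unmethylated strand."""
    top, bottom = duplex
    return (top, frozenset()), (frozenset(), bottom)


def maintain(duplex):
    """Maintenance methylation: methylate a CpG on one strand wherever the
    strand paired with it is already methylated at that site."""
    top, bottom = duplex
    restored = top | bottom
    return restored, restored


# Hypothetical alleles: paternal H19 methylated at CpG sites 0 and 2, maternal unmethylated.
paternal = (frozenset({0, 2}), frozenset({0, 2}))
maternal = (frozenset(), frozenset())

for allele in (paternal, maternal):
    daughters = [maintain(d) for d in replicate(allele)]
    print(all(d == allele for d in daughters))  # True for both alleles
```

The maternal allele stays unmethylated because the maintenance step only copies marks that already exist on a parental strand; it never creates new ones.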
F. ROLE OF NONCODING RNA
Recent advances indicate that gene expression can be regulated not only by the transcriptional
regulatory proteins already discussed, but also by noncoding RNA molecules. One mode of action of
noncoding regulatory RNAs is to inhibit translation by RNA interference, a phenomenon in which
short double-stranded RNAs induce degradation of a homologous mRNA. In addition, noncoding
RNAs can repress transcription by inducing histone modifications that lead to chromatin condensation
and the formation of heterochromatin. MicroRNAs (miRNAs) are naturally-occurring short
noncoding RNAs that function as normal regulators of gene expression. Hundreds of genes encode
miRNAs in both plants and animals, so it appears that gene regulation by these noncoding RNAs is a
widespread phenomenon, even though the functions of most miRNAs have yet to be determined.
miRNAs are transcribed as precursors containing inverted stem-loop structures. These
precursors are then cleaved by an enzyme known as Dicer to form mature miRNAs, which are short
double-stranded RNAs of approximately 20-25 nucleotides. In RNA interference, miRNAs associate
with the RNA-induced silencing complex (RISC), within which the two strands of miRNA separate
and target homologous mRNAs for cleavage. This is shown in Figure 36.18.
Figure 36.18: RNA interference. 1. Stem-loop structure forms via hydrogen bonding. 2.
Dicer-catalyzed cleavage yields short, mature miRNA segments. 3. One strand of each
segment is degraded; the other strand associates with the protein complex, RISC. 4. The
miRNA-bound complex can base-pair with any target mRNA that contains a sufficiently
complementary sequence. 5. The miRNA-protein complex prevents gene expression
either by degrading target mRNA or blocking its translation.
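Target recognition by base pairing can be illustrated with a short search. This sketch assumes perfect complementarity across the whole guide strand, which is a simplification: as the caption notes, pairing only needs to be sufficiently complementary. The guide and mRNA sequences are short hypothetical strings, far shorter than a real miRNA.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that pairs antiparallel with rna, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def target_sites(guide, mrna):
    """Return 0-based start positions in mrna that pair perfectly with guide."""
    site = reverse_complement(guide)
    width = len(site)
    return [i for i in range(len(mrna) - width + 1) if mrna[i:i + width] == site]


# Hypothetical guide strand and mRNA.
print(reverse_complement("UAGC"))        # GCUA
print(target_sites("UAGC", "AAGCUAAA"))  # [2]
```

The reverse complement is needed because the two strands of a duplex run antiparallel, so the guide read 5' to 3' pairs with the target read 3' to 5'.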
In transcriptional repression, the miRNAs associate with a different protein complex called the RITS
(RNA-induced transcriptional silencing) complex. The separated miRNA strands then guide the RITS
complex to the homologous gene, most likely by base pairing with the mRNA transcript in association
with RNA polymerase II. RITS then represses transcription by recruiting a histone methyltransferase
that methylates histone H3 lysine-9, leading to the formation of transcriptionally inactive
heterochromatin.
I. X Chromosome Inactivation
The phenomenon of X chromosome inactivation provides another example of the role of noncoding
RNA in regulating gene expression in mammals. In many animals, including humans, females have
two X chromosomes while males have one X and one Y chromosome. The X chromosome contains
hundreds of genes that are not present on the much smaller Y chromosome. Thus, females have twice
as many copies of most X chromosome genes as males do. Despite this difference, female and male
cells contain equal amounts of the proteins encoded by the majority of X chromosome genes. This
results from a dosage compensation mechanism in which the large majority of genes on one of the two
X chromosomes in female cells are inactivated by being converted to heterochromatin early in
development. Consequently, only one copy of most genes located on the X chromosome is available
for transcription in the cells of either females or males.
While the mechanism of X chromosome inactivation has not yet been fully elucidated, the key
element appears to be a noncoding RNA transcribed from a regulatory gene, called Xist, on the
inactive X chromosome. Xist RNA remains localized to the inactive X, binding to and coating the
chromosome. This leads to the recruitment of a protein complex that induces methylation of histone
H3 lysine-27 and lysine-9, leading to chromatin condensation and conversion of most of the inactive
X chromosome to heterochromatin.
G. POST-TRANSCRIPTIONAL CONTROL AND RNA PROCESSING
Although transcription is the first and most highly regulated step in gene expression, it is usually only
the beginning of the series of events required to produce functional RNA. Most newly synthesized
RNAs must be modified in various ways to be converted to their functional forms. Bacterial mRNAs
are an exception; they are used immediately as templates for protein synthesis while still being
transcribed. However, the primary transcripts of both rRNAs and tRNAs must undergo a series of
processing steps in prokaryotic as well as eukaryotic cells. Primary transcripts of eukaryotic mRNAs
similarly undergo extensive modifications, including the removal of introns by splicing, before they are
transported from the nucleus to the cytoplasm to serve as templates for protein synthesis. Regulation of
these processing steps and regulation of the rate of mRNA degradation in the cell provide another
level of control of gene expression.
I. Processing of Ribosomal and Transfer RNA
The basic processing of rRNA and tRNA in prokaryotes and eukaryotes is similar, as might be
expected given the common roles of these molecules in protein synthesis. As discussed already,
eukaryotes have four types (5S, 5.8S, 18S, 28S) of ribosomal RNAs, three of which (28S, 18S, and 5.8S
rRNAs) are derived by cleavage of a single long precursor transcript, called pre-rRNA. Prokaryotes
have three rRNAs (23S, 16S, and 5S), which are equivalent to the 28S, 18S, and 5S rRNAs of
eukaryotic cells and are also formed by the processing of a single pre-rRNA transcript. The only
rRNA that is not processed extensively is the 5S rRNA in eukaryotes, which is transcribed from a separate gene.
In eukaryotic cells, processing of rRNA takes place within the nucleolus. During
processing, pre-rRNA is first cleaved at a site adjacent to the 5.8S rRNA on its 5’ side, yielding two
separate precursors that contain the 18S and 28S + 5.8S rRNAs, respectively. Further cleavage then
converts these to their final products, with the 5.8S rRNA becoming hydrogen-bonded to the 28S
molecule. In addition to these cleavages, rRNA processing involves the addition of methyl groups to
bases and sugar moieties of specific nucleotides and the conversion of some uridine residues to
pseudouridine.
Like rRNAs, tRNAs in both bacteria and eukaryotes are synthesized as longer precursor
molecules known as pre-tRNAs, some of which contain several individual tRNA sequences. The
processing of the 5’ end of pre-tRNA involves cleavage by the enzyme RNase P; this reaction is of
particular interest, as it is the prototypical model of catalysis by an RNA enzyme. RNase P consists of
RNA and protein components, both of which are required for maximal activity; however, the catalytic
activity of RNase P is due to its RNA component. For this reason, RNase P is categorized as a
ribozyme.
The 3’ end of tRNA is generated by the action of a protein RNase, but the processing of this
end of the tRNA molecule also involves an unusual activity: the addition of a CCA terminus. All
tRNAs have the sequence CCA at their 3’ ends. This sequence is the site of amino acid attachment, so
it is required for tRNA function during protein synthesis. The CCA terminus is encoded in the DNA
of some tRNA genes; in others, it is instead added as an RNA processing step by an enzyme that
recognizes and adds CCA to the 3’ end of all tRNAs that lack this sequence.
Another unusual characteristic of tRNA processing is the extensive post-transcriptional
modification of bases in tRNA molecules. Approximately 10% of all bases in tRNAs are altered to
yield a variety of modified nucleotides at specific positions. The functions of most of these modified
bases are unknown, but some play important roles in protein synthesis by altering the base-pairing
properties of tRNA molecules.
Some pre-tRNAs, as well as pre-rRNA in a few organisms, contain introns that are removed
by splicing. In contrast to other splicing reactions, which involve the activities of catalytic RNAs,
tRNA splicing is mediated by conventional protein enzymes. An endonuclease cleaves the pre-tRNA
at the splice sites to excise the intron, followed by joining of the exons to form a mature tRNA
molecule.
II. Processing of mRNA in Eukaryotes
In contrast to the processing of rRNA and tRNA, mRNA processing differs substantially between eukaryotes
and prokaryotes. In bacteria, ribosomes have immediate access to mRNA,
allowing translation to begin on the nascent mRNA chain while transcription is still underway. In
eukaryotes, mRNA synthesized in the nucleus must first be transported to the cytoplasm before it can
be used as a template for protein synthesis. Moreover, the initial products of transcription in
eukaryotic cells, called pre-mRNAs, are extensively modified before export from the nucleus. The
processing of mRNA includes modification of both ends of the initial transcript, as well as the removal
of introns. Rather than this occurring as a series of independent events following synthesis of pre-
mRNA, these processing reactions are closely coordinated steps in gene expression. The C-terminal
domain (CTD) of RNA polymerase II plays a key role in coordinating these processes by serving as a
binding site for the enzyme complexes involved in mRNA processing. The association of these
processing enzymes with the CTD of RNA polymerase II accounts for their specificity in processing
mRNAs; RNA polymerases I and III lack a CTD, so their transcripts are not processed by the same
enzyme complexes.
The first step in mRNA processing is the modification of the 5’ end of the transcript by the
addition of a structure called a 7-methylguanosine cap. The enzymes responsible for capping are
recruited to the phosphorylated CTD following initiation of transcription, and the cap is added after
transcription of the first 20-30 nucleotides of the RNA. Capping is initiated by the addition of a GTP
in reverse orientation to the 5’ terminal nucleotide of the RNA. Methyl groups are then added to this
guanosine residue and to the ribose moieties of one or two 5’ nucleotides of the RNA chain. The 5’
cap stabilizes the RNA and aligns eukaryotic mRNAs on the ribosome during translation.
The 3’ end of most eukaryotic mRNAs undergoes modification as well. This end of the
molecule is defined not by termination of transcription, but by cleavage of the primary transcript and
addition of a poly-A tail via a processing reaction called polyadenylation. The signals for
polyadenylation include a highly conserved hexanucleotide located 10 to 30 nucleotides upstream of
the site of polyadenylation; its sequence is AAUAAA in mammalian cells. A G-U rich downstream
sequence element acts as another signal. In addition, some genes have a U-rich sequence element
upstream of the AAUAAA. These sequences are recognized by a complex of proteins, including an
endonuclease that cleaves the RNA chain and a separate poly-A polymerase that adds a poly-A tail of
about 200 nucleotides to the transcript. These processing enzymes are associated with the
phosphorylated CTD of RNA polymerase II, and may travel with the polymerase, beginning at the
transcription initiation site. Cleavage and polyadenylation are followed by degradation of the RNA that
has been synthesized downstream of the site of poly-A addition, resulting in the termination of
transcription.
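The positional rule for the AAUAAA signal (10 to 30 nucleotides upstream of the polyadenylation site) can be turned into a simple scan. The sketch below is illustrative only. It assumes the distance is measured from the 3' end of the hexanucleotide, which the text does not specify, and it ignores the G-U rich and U-rich elements. The function name and the test sequence are hypothetical.

```python
def polya_windows(rna, signal="AAUAAA", min_gap=10, max_gap=30):
    """For each occurrence of the signal, return a tuple
    (signal_start, window_start, window_end) giving the 0-based region,
    end exclusive, in which cleavage would be expected."""
    windows = []
    start = rna.find(signal)
    while start != -1:
        end = start + len(signal)
        # Clamp the window to the end of the transcript.
        lo = min(end + min_gap, len(rna))
        hi = min(end + max_gap, len(rna))
        windows.append((start, lo, hi))
        start = rna.find(signal, start + 1)
    return windows


# Hypothetical transcript: signal at position 2, followed by 40 filler bases.
transcript = "GG" + "AAUAAA" + "C" * 40
print(polya_windows(transcript))  # [(2, 18, 38)]
```

A scan like this over-predicts, since the hexanucleotide alone is not sufficient; the downstream and upstream elements described above help the processing complex choose the real site.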
Almost all mRNAs in eukaryotes are polyadenylated, and poly-A tails have been shown to
regulate both translation and mRNA stability. In addition, polyadenylation plays an important
regulatory role in early development, where changes in the length of poly-A tails control mRNA
translation. For example, many mRNAs are stored in unfertilized eggs in an untranslated form with
short poly-A tails. Fertilization stimulates the lengthening of the poly-A tails of these stored mRNAs,
which in turn activates their translation and the synthesis of proteins required for early embryonic
development.
The most structurally dramatic modification of pre-mRNA is the removal of introns by
splicing. The coding sequences of most eukaryotic genes are interrupted by noncoding sequences
(introns) that are precisely excised from the mature mRNA. In mammals, most genes contain multiple
introns, which typically account for about ten times more pre-mRNA sequences than do the exons.
Studies of the mechanism of splicing have illuminated new mechanisms of gene regulation and have
revealed novel catalytic activities of RNA molecules.
Figure 36.19: Formation of the primary transcript and its processing in a eukaryotic cell
in the nucleus. The 5’ cap is added before synthesis of the primary transcript is complete.
A noncoding sequence (intron) following the last exon is shown. Splicing can occur
either before or after cleavage and polyadenylation.
H. SPLICING MECHANISMS
The key to understanding pre-mRNA splicing was the development of in vitro systems that efficiently
carried out the splicing reaction. Pre-mRNAs were synthesized in vitro by the cloning of structural
genes, including their introns, adjacent to promoters for bacteriophage RNA polymerases, which
could readily be isolated in large quantities. Transcription of these plasmids could then be used to
prepare large amounts of pre-mRNAs that, when added to nuclear extracts of mammalian cells, were
found to be correctly spliced. As with transcription, the use of such in vitro systems has allowed splicing
to be analyzed in much greater detail than would have been possible in intact cells.
Analysis of the reaction products and intermediates formed revealed that pre-mRNA splicing
proceeds in two steps. First, pre-mRNA is cleaved at the 5’ splice site, and the 5’ end of the intron is
joined to an adenine nucleotide within the intron (near its 3’ end). In this step, an unusual bond is
formed between the 5’ end of the intron and the 2’ hydroxyl group of the adenine nucleotide. The
resulting intermediate is a lariat-like structure in which the intron forms a loop. The second step in
splicing then proceeds with simultaneous cleavage at the 3’ splice site and ligation of the two exons.
The intron is thus excised as this lariat-like structure, which is then linearized and degraded within the
nucleus of the intact cells.
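The two steps of splicing can be traced on a toy sequence. In this sketch the splice-site and branch-point positions are supplied by the caller rather than found from consensus sequences, and the lariat is represented simply as a loop segment plus a tail. The function name, coordinates, and sequence are hypothetical; the intron in the example begins with GU and ends with AG, following the usual consensus.

```python
def splice(pre_mrna, five_ss, branch, three_ss):
    """Model the two steps of splicing.
    five_ss  : index of the first intron base (5' splice site)
    branch   : index of the branch-point adenine inside the intron
    three_ss : index of the first base of the downstream exon
    Returns (mature_mrna, (lariat_loop, lariat_tail))."""
    if not five_ss < branch < three_ss:
        raise ValueError("branch point must lie inside the intron")
    if pre_mrna[branch] != "A":
        raise ValueError("branch point must be an adenine")

    # Step 1: cleave at the 5' splice site; the intron's 5' end joins the
    # 2' hydroxyl of the branch-point A, closing a loop (the lariat).
    exon1 = pre_mrna[:five_ss]
    lariat_loop = pre_mrna[five_ss:branch + 1]
    lariat_tail = pre_mrna[branch + 1:three_ss]

    # Step 2: cleave at the 3' splice site and ligate the two exons.
    exon2 = pre_mrna[three_ss:]
    return exon1 + exon2, (lariat_loop, lariat_tail)


# Hypothetical pre-mRNA: exon "AUGGC", intron "GUCCUAACCAG", exon "UUUGA".
pre = "AUGGC" + "GUCCUAACCAG" + "UUUGA"
mature, lariat = splice(pre, five_ss=5, branch=11, three_ss=16)
print(mature)  # AUGGCUUUGA
print(lariat)  # ('GUCCUAA', 'CCAG')
```

The branch-point adenine sits near the 3' end of the intron, so most of the intron ends up in the loop and only a short tail remains.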
These reactions define three critical sequence elements of pre-mRNAs: sequences at the 5’
splice site, sequences at the 3’ splice site, and sequences within the intron at the branch point (the point
at which the 5’ end of the intron becomes ligated to form the lariat-like structure). Pre-mRNAs contain
similar consensus sequences at each of these positions, allowing the splicing apparatus to recognize
pre-mRNAs and carry out the cleavage and ligation reactions involved in the splicing process.
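The two-step reaction and the three sequence elements described above can be traced on a toy sequence. The following Python sketch is illustrative only: the sequence, the coordinates, and the function name `splice` are hypothetical, and it assumes the common GU...AG intron boundaries and a UACUAAC-type branch-point sequence, neither of which is spelled out in the text.

```python
def splice(pre_mrna, five_ss, branch, three_ss):
    """Toy model of the two-step splicing reaction.

    five_ss  : index of the first intron nucleotide (5' splice site)
    branch   : index of the branch-point adenine inside the intron
    three_ss : index of the first nucleotide of the downstream exon
    """
    intron = pre_mrna[five_ss:three_ss]
    # Check the three sequence elements (GU...A...AG is an assumption).
    assert intron.startswith("GU") and intron.endswith("AG")
    assert pre_mrna[branch] == "A" and five_ss < branch < three_ss

    # Step 1: cleavage at the 5' splice site frees exon 1; the 5' end of the
    # intron is joined to the 2' OH of the branch-point A, forming a lariat.
    exon1 = pre_mrna[:five_ss]
    loop = pre_mrna[five_ss:branch + 1]      # closed loop of the lariat
    tail = pre_mrna[branch + 1:three_ss]     # linear tail of the lariat

    # Step 2: cleavage at the 3' splice site and ligation of the two exons.
    exon2 = pre_mrna[three_ss:]
    return exon1 + exon2, loop, tail


exon1 = "AUGGCC"
intron = "GUAAGUCCUACUAACUUUCCCAG"
exon2 = "GGAUAA"
pre = exon1 + intron + exon2

five_ss = len(exon1)
branch = pre.index("UACUAAC") + 5            # last A of the branch-point motif
three_ss = five_ss + len(intron)

mrna, loop, tail = splice(pre, five_ss, branch, three_ss)
print("mRNA:  ", mrna)
print("lariat:", loop, "(loop) +", tail, "(tail)")
```

The loop and tail together make up the excised intron, which corresponds to the lariat-like structure that is later linearized and degraded.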
Biochemical analysis of nuclear extracts has revealed that splicing takes place in large
complexes, called spliceosomes, composed of proteins and RNAs. The RNA components of the
spliceosome are five types of small nuclear RNAs (snRNAs) called U1, U2, U4, U5, and U6. These
snRNAs, which range in size from approximately 50 to nearly 200 nucleotides, are complexed with six
to ten protein molecules to form small nuclear ribonucleoprotein particles (snRNPs), which play
central roles in the splicing process. The U1, U2, and U5 snRNPs each contain a single snRNA
molecule, whereas U4 and U6 snRNAs are complexed to each other in a single snRNP.
The first step in spliceosome assembly involves the binding of U1 snRNP to the 5’ splice site
of pre-mRNA. This recognition of 5’ splice sites involves base pairing between the 5’ splice site
consensus sequence and a complementary sequence at the 5’ end of U1 snRNA. U2 snRNP then
binds to the branch point using similar complementary base pairing between U2 snRNA and branch
point sequences. A preformed complex consisting of U4/U6 and U5 snRNPs is then incorporated into
the spliceosome, with U5 binding to sequences upstream of the 5’ splice site. The splicing reaction is
then accompanied by rearrangements of the snRNAs. Prior to the first reaction steps (leading to the
formation of the lariat-like intermediate), U6 dissociates from U4 and displaces U1 at the 5’ splice site.
U5 then binds to sequences at the 3’ splice site, followed by excision of the intron and ligation of the
exons.
Not only do the snRNAs recognize consensus sequences at the branch points and splice sites
of pre-mRNAs, but they also directly catalyze the splicing reaction. The catalytic role of RNAs in
splicing was demonstrated by the discovery that some RNAs are capable of self-splicing; that is, they
can catalyze the removal of their own introns in the absence of other protein or RNA factors. Self-splicing
was first described during studies of 28S rRNA from protozoa. Further studies have shown
that splicing is catalyzed by the intron, which acts as a ribozyme to direct its own excision from the
pre-rRNA molecule. Additional studies have revealed self-splicing RNAs in mitochondria,
chloroplasts, and bacteria. These self-splicing RNAs are divided into two classes on the basis of their
reaction mechanisms: group I and group II introns. The first step in the splicing of group I introns is
cleavage at the 5’ splice site mediated by a guanosine cofactor. The 3’ end of the free exon then reacts
with the 3’ splice site to excise the intron as a linear RNA molecule. The splicing mechanism of group
I introns is illustrated in Figure 36.20.
Figure 36.20: Splicing mechanism of group I introns. 1. The 3’ OH group of guanosine,
GMP, GDP or GTP attacks the phosphate located at the 5’ splice site. 2. The 3’ OH of the
5’ exon becomes the nucleophile, completing the reaction.
In contrast, the self-splicing reactions of group II introns (as found in some mitochondrial pre-mRNAs)
closely resemble nuclear pre-mRNA splicing in which cleavage of the 5’ splice site results from the
attack of an adenosine nucleotide in the intron. As with pre-mRNA splicing, the result is a lariat-like
intermediate, which is then excised.
The similarity between spliceosome-mediated pre-mRNA splicing and self-splicing of group II
introns strongly suggests that the active catalytic components of the spliceosomes are RNAs rather
than proteins. In particular, these similarities suggest that pre-mRNA splicing is catalyzed by the
snRNAs of the spliceosome. Continuing studies of pre-mRNA splicing have provided clear support of
this view, including the demonstration that U2 and U6 snRNAs, in the absence of proteins, can
catalyze the first step in pre-mRNA splicing. Pre-mRNA splicing is thus considered to be an RNA-based
reaction, catalyzed by spliceosome snRNAs acting analogously to group II self-splicing introns.
Within the cell, protein components of the snRNPs are also required, however, and participate in both
assembly of the spliceosome and the splicing reaction itself.
A number of protein splicing factors that are not snRNP components also play critical roles in
spliceosome assembly, particularly in identification of the correct splice sites in pre-mRNAs.
Mammalian pre-mRNAs typically contain multiple short exons separated by much larger introns.
Introns frequently contain sequences that resemble splice sites, so the splicing machinery must be able
to identify the appropriate 5’ and 3’ sites at intron/exon boundaries to produce a functional mRNA
molecule. Splicing factors serve to direct spliceosomes to the correct splice sites by binding to specific
RNA sequences and then recruiting U1 and U2 snRNPs to the appropriate sites on pre-mRNA by
protein-protein interactions. For example, the SR splicing factors bind to specific sequences within
exons and act to recruit U1 snRNP to the 5’ splice site. SR proteins also interact with another splicing
factor (U2AF), which binds to pyrimidine-rich sequences at 3’ splice sites and recruits U2 snRNP to the
branch point. In addition to recruiting the components of the spliceosome to the pre-mRNA, splicing
factors couple splicing to transcription by associating with the phosphorylated CTD of RNA
polymerase II. This anchoring of the splicing machinery to RNA polymerase is thought to be
important in ensuring that exons are joined in the correct order as the pre-mRNA is synthesized.
I. Alternative Splicing
The central role of splicing in the processing of pre-mRNA opens the possibility of regulation of gene
expression by control of the splicing machinery. Since most pre-mRNAs contain multiple introns,
different mRNAs can be produced from the same gene by different combinations of 5’ and 3’ splice
sites. The possibility of joining exons in varied combinations provides a novel means of controlling
gene expression by generating multiple mRNAs, and therefore multiple proteins, from the same pre-mRNA.
This process, known as alternative splicing, occurs frequently in genes of complex eukaryotes.
For example, it is estimated that about 50% of human genes produce transcripts that are alternatively
spliced, considerably increasing the diversity of proteins that can be encoded by the estimated
20,000-25,000 genes in mammalian genomes. Because patterns of alternative splicing can vary in different
tissues and in response to extracellular signals, alternative splicing provides an important mechanism
for tissue-specific and developmental regulation of gene expression.
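The way a single pre-mRNA can give rise to several mRNAs can be made concrete by enumeration. The sketch below is a simplified model, not a description of any real gene: the exon names are hypothetical, and it considers only one kind of alternative splicing (skipping of optional internal exons), ignoring alternate 5’ and 3’ splice sites within an exon.

```python
from itertools import combinations


def cassette_isoforms(exons, optional):
    """Enumerate mRNAs when each exon listed in `optional` may be skipped."""
    isoforms = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            # Keep exon order; drop only the exons skipped in this isoform.
            isoforms.append(tuple(e for e in exons if e not in skipped))
    return isoforms


exons = ["E1", "E2", "E3", "E4", "E5"]   # hypothetical five-exon gene
isoforms = cassette_isoforms(exons, optional=["E2", "E3", "E4"])

for iso in isoforms:
    print("-".join(iso))
print(len(isoforms), "possible mRNAs from one pre-mRNA")
```

With three optional exons the model already yields 2 x 2 x 2 = 8 mRNAs, which illustrates why alternative splicing can expand protein diversity well beyond the number of genes.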
One well-studied example of tissue-specific alternative splicing relates to sex determination in
Drosophila, where alternative splicing of the same pre-mRNA determines whether a fly is male or
female. Alternative splicing of the pre-mRNA of a gene called transformer is controlled by a protein
(SXL) that is only expressed in female flies. The transformer-derived pre-mRNA has three exons, but
different second exons are incorporated into the mRNA as a result of using alternate 3’ splice sites in the
two different sexes. In males, exon 1 is joined to the most upstream of these 3’ splice sites, which is
selected by the binding of the U2AF splicing factor. In females, the SXL protein binds to this 3’ splice
site, blocking the binding of U2AF. Consequently, the upstream 3’ splice site is skipped in females, and
exon 1 is instead joined to an alternate 3’ splice site that is further downstream. The exon 2 sequences
included in the male transformer mRNA contain a translation termination codon, so no protein is
produced. This termination codon is not included in the female mRNA, so female flies express a
functional transformer protein, which acts as a key regulator of sex determination.
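The logic of this switch can be sketched in a few lines of Python. The exon sequences below are invented solely so that the reading frames line up; they are not the real transformer sequences, and the function names are hypothetical. The only features taken from the text are that the upstream 3’ splice site is used when SXL is absent, and that the extra exon 2 sequence included in that case carries a termination codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def codons_translated(mrna):
    """Count codons read from the first AUG up to, not including, a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return 0
    count = 0
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


# Hypothetical sequences (multiples of three so the frame is preserved).
EXON1 = "AUGGCUGCA"
EXON2_UPSTREAM_PART = "CCUUGAGGC"   # between the two 3' splice sites; in-frame UGA
EXON2_DOWNSTREAM = "GAUCUGAAA"
EXON3 = "GGCUUCUAA"


def transformer_mrna(sxl_present):
    """SXL blocks the upstream 3' splice site, so the downstream site is used."""
    if sxl_present:
        exon2 = EXON2_DOWNSTREAM
    else:
        exon2 = EXON2_UPSTREAM_PART + EXON2_DOWNSTREAM
    return EXON1 + exon2 + EXON3


male = transformer_mrna(sxl_present=False)
female = transformer_mrna(sxl_present=True)
print("male:  ", codons_translated(male), "codons before the first stop")
print("female:", codons_translated(female), "codons before the first stop")
```

In this toy model the male mRNA is longer but is translated for fewer codons, because the premature termination codon truncates the product.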
The alternative splicing of transformer illustrates the action of a repressor (the SXL protein) that
functions by blocking the binding of a splicing factor (U2AF). Similarly, a large group of proteins
regulate alternative splicing by binding to silencer sequences in pre-mRNAs. In other cases, alternative
splicing is controlled by activators that recruit splicing factors to splice sites that would otherwise not
be recognized. The best-studied splicing activators are members of the SR protein family, which bind
to specific splicing enhancer sequences. Multiple mechanisms can thus regulate alternative splicing,
and variations in alternative splicing make a major contribution to the diversity of protein expression
during development and differentiation. One example is found in the mammalian ear, which contains
hair cells that respond to sounds of different frequencies. The responsiveness of hair cells to specific
frequencies is thought to be mediated in part by the alternative splicing of a gene encoding a channel
protein.
I. RNA EDITING
RNA editing refers to RNA processing events other than splicing that alter the protein-coding
sequences of some mRNAs. This unexpected form of RNA processing was first discovered in
mitochondrial mRNAs of trypanosomes in which U residues are added or deleted at multiple sites
along the pre-mRNA in order to generate the mRNA. More recently, editing has been described in
mitochondrial mRNA of other organisms, chloroplast mRNAs of higher plants, and nuclear mRNAs
of some mammalian genes.
Editing in mammalian nuclear mRNAs, as well as in mitochondrial and chloroplast RNAs of
higher plants, involves single base changes as a result of base modification reactions, similar to those
involved in tRNA processing. In mammalian cells, RNA editing reactions include the deamination of
cytosine to uridine and of adenosine to inosine. One of the best-studied examples is the editing of the
mRNA for apolipoprotein B, which transports lipids in the blood. In this case, tissue-specific RNA
editing results in two different forms of apolipoprotein B. In humans, Apo-B100 is synthesized in the
liver by translation of the unedited mRNA. However, a shorter protein (Apo-B48) is synthesized in
the intestine as a result of translation of an edited mRNA in which a C has been changed to a U by
deamination. This alteration changes the codon for glutamine (CAA) in the unedited mRNA to the
termination codon (UAA) in the edited mRNA, resulting in synthesis of the shorter Apo-B protein.
Tissue-specific editing of Apo-B mRNA thus results in the expression of structurally and functionally
different proteins in the liver and intestine. The full-length Apo-B100 produced by the liver transports
lipids in the circulation; Apo-B48 functions in the absorption of dietary lipids by the intestine. This
example is shown in Figure 36.21.
Figure 36.21: The effect of C-U RNA editing on the human ApoB gene
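The effect of this single C-to-U change on the reading frame can be checked with a short Python sketch. The mRNA below is a made-up miniature, not the apolipoprotein B sequence, and the helper names are hypothetical; only the CAA-to-UAA conversion is taken from the text.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_c_to_u(mrna, position):
    """Deaminate the cytosine at `position` (0-based) to uridine."""
    if mrna[position] != "C":
        raise ValueError("C-to-U editing requires a cytosine at the target site")
    return mrna[:position] + "U" + mrna[position + 1:]


def coding_codons(mrna):
    """Split the reading frame into codons, stopping before the first stop codon."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


unedited = "AUGGAUCAAACUGGAUAA"          # toy mRNA: AUG GAU CAA ACU GGA UAA
site = unedited.index("CAA")             # the glutamine codon targeted for editing
edited = edit_c_to_u(unedited, site)

print("unedited (liver-like):    ", coding_codons(unedited))
print("edited (intestine-like):  ", coding_codons(edited))
```

The unedited message is read through to its normal stop codon, whereas the edited message terminates at the new UAA, giving the shorter product.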
RNA editing by the deamination of adenosine to inosine is the most common form of nuclear RNA
editing in mammals. This form of editing plays an important role in the nervous system, where A-to-I
editing results in single amino acid changes in ion channels and receptors on the surface of neurons.
For example, the mRNAs encoding receptors for the neurotransmitter serotonin can be edited at up to
five sites, potentially yielding 24 different versions of the receptor with different signaling activities.
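Five independent sites would allow 2^5 = 32 edited mRNAs, so it is worth seeing why the number of receptor versions can be smaller. The sketch below assumes the layout commonly described for the serotonin 5-HT2C receptor, in which the five editable adenosines fall within three codons (AUA, AAU, and AUU); these codons and positions are not given in the text and should be treated as assumptions. Inosine is read as guanosine during translation.

```python
from itertools import product

# Only the codons needed here; an edited A (inosine) is written as G.
CODON_TABLE = {
    "AUA": "Ile", "GUA": "Val", "AUG": "Met", "GUG": "Val",
    "AAU": "Asn", "GAU": "Asp", "AGU": "Ser", "GGU": "Gly",
    "AUU": "Ile", "GUU": "Val",
}

CODONS = ["AUA", "AAU", "AUU"]           # assumed unedited codons
# (codon index, position within codon) for each of the five editing sites
SITES = [(0, 0), (0, 2), (1, 0), (1, 1), (2, 0)]


def protein_for(pattern):
    """Return the amino acids encoded after applying one editing pattern."""
    codons = [list(c) for c in CODONS]
    for is_edited, (codon_index, pos) in zip(pattern, SITES):
        if is_edited:
            assert codons[codon_index][pos] == "A"
            codons[codon_index][pos] = "G"
    return tuple(CODON_TABLE["".join(c)] for c in codons)


patterns = list(product([False, True], repeat=len(SITES)))
proteins = {protein_for(p) for p in patterns}

print(len(patterns), "edited mRNA variants")
print(len(proteins), "distinct receptor proteins")
```

Under these assumptions the first codon can encode three amino acids, the second four, and the third two, so 3 x 4 x 2 = 24 distinct proteins arise from the 32 mRNA variants, because some editing patterns are synonymous.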
J. CANCER
Cancer arises from a breakdown of the regulatory mechanisms that govern normal cell behaviors. As
discussed in preceding chapters, the proliferation, differentiation, and survival of individual cells in
multicellular organisms are carefully regulated to meet the needs of the organism as a whole. This
regulation is lost in cancer cells, which grow and divide in an uncontrolled manner, either remaining
localized or spreading throughout the body. This growth can seriously interfere with the function of
normal tissues and organs.
Because cancer results from defects in fundamental cell regulatory mechanisms, it is a disease
that ultimately demands study at the molecular level. Indeed, understanding cancer has been a
paramount objective of molecular and cellular biologists for years. Importantly, studies of cancer cells
have also illuminated the mechanisms that regulate normal cell behavior. In fact, many of the proteins
that play a key role in cell signaling, regulation of the cell cycle, and control of programmed cell death
were first identified because abnormalities in their activities lead to the uncontrolled proliferation of
cancer cells, contributing to our understanding of normal cell regulation.
I. Tumor Viruses
Members of several families of animal viruses, called tumor viruses, are capable of directly causing
cancer in either experimental animals or humans. The viruses that cause human cancer include the
hepatitis B and C viruses, which cause liver cancer; papillomaviruses, which cause cervical and other
anogenital cancers; Epstein-Barr virus, which causes Burkitt’s lymphoma and nasopharyngeal
carcinoma; Kaposi’s sarcoma-associated herpesvirus, which causes Kaposi’s sarcoma; and human T-cell
lymphotropic virus, which causes adult T-cell leukemia. In addition, HIV is indirectly responsible
for the cancers that develop in AIDS patients as a result of immunodeficiency.
Tumor viruses, one of the earliest subjects of cancer research, came to serve as models for
cellular and molecular studies of cell transformation, the process by which normal cells are converted
to tumor cells. Their small genomes have made tumor viruses readily amenable to molecular analysis,
leading to the identification of viral genes responsible for cancer induction and paving the way to our
current understanding of cancer at the molecular level.
Studies of tumor viruses demonstrated that specific genes, now known as oncogenes, are
capable of inducing cell transformation; these findings provided the first insights into the molecular
basis of cancer. However, more than 80% of human cancers are not induced by viruses and
apparently arise from other causes, such as radiation and chemical carcinogens. Therefore, in terms of
our overall understanding of cancer, it has been critically important that early studies of viral
oncogenes also led to the identification of cellular oncogenes.
II. Retroviral Oncogenes
Viral oncogenes were first defined in Rous sarcoma virus (RSV), a virus which transforms chicken
embryo fibroblasts in culture and induces large sarcomas within 1 to 2 weeks after inoculation into
chickens. In contrast, the closely related avian leucosis virus (ALV) replicates in the same cells as RSV
without inducing transformation. This difference in transforming potential suggested the possibility
that RSV contains specific genetic information responsible for transformation of infected cells. A direct
comparison of the genomes of RSV and ALV was consistent with this hypothesis: the genomic RNA
of RSV is about 10 kb, whereas that of ALV is smaller, about 8.5 kb.
In the early 1970s, a pair of researchers isolated deletion mutants and temperature-sensitive
mutants of RSV that were unable to induce transformation. Importantly, these mutants still replicated
normally in infected cells, indicating that RSV contains genetic information that is required for
transformation but not for virus replication. Further analysis demonstrated that both the deletion and
the temperature-sensitive RSV mutants define a single gene responsible for the ability of RSV to
induce tumors in birds and transform fibroblasts in culture. Because RSV causes sarcomas, its
oncogene is called src. The src gene is an addition to the genome of RSV; it is not present in ALV. It
encodes a 60-kd protein that was the first protein-tyrosine kinase to be identified.
More than 40 different highly oncogenic retroviruses have been isolated from a variety of
animals, including chickens, turkeys, mice, rats, cats, and monkeys. All of these viruses, like RSV,
contain either one or two oncogenes that are not required for virus replication but are responsible for
cell transformation. In some cases, different viruses contain the same oncogenes, but more than two
dozen distinct oncogenes have been identified among this group of viruses. Like src, many of these
genes, such as ras and raf, encode proteins that are now recognized as key components of signaling
pathways that stimulate cell proliferation.
An unexpected feature of retroviral oncogenes is their lack of involvement in viral replication.
Since most viruses are streamlined to replicate as efficiently as possible – this is, after all, their singular
evolutionary task – the existence of viral oncogenes that are not an integral part of the virus life cycle
appears paradoxical. Scientists were thus led to question where the retroviral oncogenes had
originated and how they had come to be incorporated into viral genomes, a line of investigation that
ultimately led to the identification of cellular oncogenes in human cancers.
Scientists hypothesized that normal cells contain genes that are closely related to retroviral
oncogenes. The normal-cell genes from which the retroviral oncogenes originated are called proto-oncogenes.
These are important cell regulatory genes, in many cases encoding proteins that function
in the signal transduction pathways controlling normal cell proliferation (e.g., src, raf, and ras). The
oncogenes are abnormally expressed or mutated forms of the corresponding proto-oncogenes. As a
consequence of such alterations, the oncogenes induce abnormal cell proliferation and tumor
development.
An oncogene incorporated into a retroviral genome differs in several respects from the
corresponding proto-oncogene. First, the viral oncogene is transcribed under the control of viral
promoter and enhancer sequences, rather than being under the control of normal transcriptional
regulatory sequences. Consequently, oncogenes are usually expressed at a much higher level than the
proto-oncogenes, and are sometimes also expressed in inappropriate cell types. In some cases, such
abnormalities of gene expression are sufficient to convert a normally functioning proto-oncogene into
an oncogene that drives cell transformation.
In addition to such alterations in gene expression, oncogenes frequently encode proteins that
differ in structure and function from those encoded by their normal homologs. Many oncogenes, such
as raf, are expressed as fusion proteins with viral sequences at the amino terminus. Recombination
events leading to the generation of such fusion proteins often occur during and after the capture of
proto-oncogenes by retroviruses, generating oncogene proteins that function in an unregulated
manner. For example, the viral raf oncogene encodes a fusion protein in which amino-terminal
sequences of the normal Raf protein have been deleted, as shown in Figure 36.22.
Figure 36.22: The Raf proto-oncogene protein consists of an amino-terminal
regulatory domain and a carboxy-terminal protein kinase domain. In the viral Raf
oncogene protein, the regulatory domain has been deleted and replaced by partially
deleted viral Gag sequences (Δ Gag). As a result, the Raf kinase domain is
constitutively active, causing cell transformation.
These amino-terminal sequences are critical to the regulation of the normal Raf protein kinase
activity, and their deletion results in unregulated constitutive activity of the oncogene-encoded Raf
protein. This unregulated Raf activity drives cell proliferation, resulting in transformation.
Many other oncogenes differ from the corresponding proto-oncogenes by point mutations,
resulting in single amino acid substitutions in the oncogene products. In some cases, such amino acid
substitutions (like the deletions already discussed) lead to unregulated activity of the oncogene proteins.
An important example of such point mutations is provided by the ras family of oncogenes, the role of
which is discussed in the next section.
III. Oncogenes in Human Cancer
Understanding the origin of retroviral oncogenes raised the question as to whether non-virus-induced
tumors contain cellular oncogenes that are generated from proto-oncogenes by mutations or by DNA
rearrangements during tumor development.
Some of the oncogenes identified in human tumors are cellular homologs of oncogenes that
were previously characterized in retroviruses, whereas others are new oncogenes first discovered in
human cancers. The first human oncogene detected in gene transfer assays was subsequently shown to be the
human homolog of the rasH oncogene of Harvey sarcoma virus. Three closely related members of the
ras gene family (rasH, rasK, and rasN) are the oncogenes most commonly encountered in human
tumors. These genes are involved in approximately 20% of all human malignancies, including more
than half of colon cancers and a quarter of lung carcinomas.
The ras oncogenes are not present in normal cells; rather, they are generated in tumor cells as
a consequence of mutations that occur during tumor development. The ras oncogenes differ from their
proto-oncogenes by point mutations resulting in single amino acid substitutions at critical positions. In
animal models, it has been shown that mutations that convert ras proto-oncogenes to oncogenes are
caused by chemical carcinogens, providing a direct link between the mutagenic action of carcinogens
and cell transformation.
The ras genes encode guanine nucleotide-binding proteins that function in transduction of
mitogenic signals from a variety of growth factor receptors. The activity of the Ras proteins is
controlled by GTP or GDP binding, such that they alternate between active (GTP-bound) and
inactive (GDP-bound) states. The mutations characteristic of ras oncogenes have the effect of
maintaining the Ras proteins constitutively in the GTP-bound conformation. In large part, this effect
is a result of nullifying the response of oncogenic Ras proteins to GAP (GTPase-activating protein),
which stimulates hydrolysis of bound GTP by normal Ras. Because of the resulting decrease in their
intracellular GTPase activity, the oncogenic Ras proteins remain in the active GTP-bound state and
drive unregulated cell proliferation.
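The consequence of losing the GAP response can be illustrated with a minimal two-state model of the Ras cycle. This is a sketch under stated assumptions: the rate constants are hypothetical values in arbitrary units, chosen only to show the direction of the effect, and the model ignores everything except nucleotide exchange and GTP hydrolysis.

```python
def gtp_bound_fraction(k_exchange, k_hydrolysis):
    """Steady-state fraction of Ras in the active, GTP-bound state.

    Two-state cycle: GDP-bound --k_exchange--> GTP-bound --k_hydrolysis--> GDP-bound.
    At steady state, k_exchange * (1 - f) = k_hydrolysis * f.
    """
    return k_exchange / (k_exchange + k_hydrolysis)


K_EXCHANGE = 1.0          # hypothetical rate of GDP-to-GTP exchange
K_INTRINSIC = 0.1         # hypothetical slow intrinsic GTPase rate
GAP_STIMULATION = 100.0   # hypothetical fold-stimulation of hydrolysis by GAP

# Normal Ras responds to GAP; oncogenic Ras is modeled as unresponsive to it.
normal = gtp_bound_fraction(K_EXCHANGE, K_INTRINSIC * GAP_STIMULATION)
oncogenic = gtp_bound_fraction(K_EXCHANGE, K_INTRINSIC)

print(f"normal Ras:    {normal:.2f} GTP-bound")
print(f"oncogenic Ras: {oncogenic:.2f} GTP-bound")
```

With these illustrative numbers, most normal Ras sits in the inactive GDP-bound state, whereas most oncogenic Ras remains GTP-bound, which is the qualitative behavior described above.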
Point mutations are only one of the ways in which proto-oncogenes are converted to
oncogenes in human tumors. Many cancer cells display abnormalities in chromosome structure,
including translocations, duplications, and deletions. The gene rearrangements resulting from
chromosome translocations frequently lead to the generation of oncogenes. In some cases, analysis of
these rearrangements has implicated already-known oncogenes in tumor development. In other cases,
novel oncogenes have been discovered by molecular cloning and analysis of rearranged DNA
sequences.
The first characterized example of oncogene activation by chromosome translocation was the
involvement of the c-myc oncogene in human Burkitt’s lymphoma, a malignancy of antibody-producing
B lymphocytes. The tumor is caused by chromosome translocations involving the genes that
encode immunoglobulins. For example, virtually all Burkitt’s lymphomas have translocations of a
fragment of chromosome 8 to one of the immunoglobulin gene loci, which reside on chromosomes 2 (κ
light chain), 14 (heavy chain), and 22 (λ light chain). One such translocation is shown in Figure 36.23.
Figure 36.23: The c-myc proto-oncogene is translocated from chromosome 8 to
the immunoglobulin heavy-chain locus (IgH) on chromosome 14 in Burkitt’s
lymphomas, resulting in abnormal c-myc expression.
The fact that the immunoglobulin genes are actively expressed in these tumors suggested that the
translocation activates proto-oncogenes from chromosome 8 by inserting them into the
immunoglobulin loci. This possibility was investigated by analysis of tumor DNAs with probes for
known oncogenes, leading to the finding that the c-myc proto-oncogene was located at the chromosome 8
translocation break point in Burkitt’s lymphomas. These translocations inserted c-myc into an
immunoglobulin locus, where it was expressed in an unregulated manner. Such uncontrolled
expression of the c-myc gene, which encodes a transcription factor normally induced only in response
to growth factor stimulation, is sufficient to drive cell proliferation and contribute to tumor
development.
Translocations of other proto-oncogenes frequently result in rearrangements of coding
sequences, leading to the formation of abnormal gene products. The prototype for this process is
translocation of the abl proto-oncogene from chromosome 9 to chromosome 22 in chronic myeloid
leukemia (CML). This translocation leads to fusion of abl with its translocation partner, a gene called
bcr, on chromosome 22. The resulting product is the Bcr/Abl fusion protein in which the normal
amino terminus of the Abl proto-oncogene protein has been replaced by Bcr amino acid sequences.
The fusion of Bcr sequences results in unregulated activity of the Abl protein-tyrosine kinase, leading
to cell transformation. This translocation event is shown in Figure 36.24.
Figure 36.24: The abl proto-oncogene is translocated from chromosome 9 to chromosome 22,
forming the Philadelphia chromosome in chronic myeloid leukemia (CML). The abl
proto-oncogene, which contains two alternative first exons (1A and 1B), is joined to the
middle of the bcr gene on chromosome 22. Exon 1B is deleted as a result of the
translocation. Transcription of the fused gene initiates at the bcr promoter and continues
through abl. Splicing then generates a fused Bcr/Abl mRNA in which abl exon 1A
sequences are removed and bcr sequences are joined to abl exon 2. The Bcr/Abl mRNA is translated to yield a
recombinant Bcr/Abl fusion protein.
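The path from translocation to fusion mRNA can be followed with a small Python sketch. The exon names, exon counts, and break-point positions are hypothetical placeholders. The sketch also assumes, as a simplification not stated in the caption, that an alternative first exon such as abl 1A has no upstream 3’ splice site and therefore cannot be joined to the bcr exons that precede it in the fused gene.

```python
# Each exon is (name, has_3prime_splice_site). First exons lack an upstream
# 3' splice site (assumption), so they cannot be joined to a preceding exon.
bcr_exons = [("bcr-1", False), ("bcr-2", True), ("bcr-3", True), ("bcr-4", True)]
abl_exons = [("abl-1B", False), ("abl-1A", False), ("abl-2", True), ("abl-3", True)]


def translocate(bcr, abl, bcr_break, abl_break):
    """Join bcr exons upstream of its break point to abl exons downstream of its own."""
    return bcr[:bcr_break] + abl[abl_break:]


def splice_exons(gene):
    """Keep the first exon plus every later exon that has a 3' splice site."""
    first, rest = gene[0], gene[1:]
    return [first[0]] + [name for name, has_site in rest if has_site]


# Hypothetical break points: within bcr after exon 2, within abl after exon 1B.
fused_gene = translocate(bcr_exons, abl_exons, bcr_break=2, abl_break=1)
fusion_mrna = splice_exons(fused_gene)

print("fused gene:  ", [name for name, _ in fused_gene])
print("fusion mRNA: ", fusion_mrna)
```

In this model exon 1B is lost in the translocation itself, while exon 1A is present in the fused gene but absent from the spliced Bcr/Abl mRNA.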
IV. Gene Amplification
A distinct mechanism by which oncogenes are activated in human tumors is gene amplification, which
results in elevated gene expression. DNA amplification is common in tumor cells, and amplification of
oncogenes may play a role in the progression of many tumors to more rapid growth and increasing
malignancy. Indeed, novel oncogenes have been identified by molecular cloning and characterization
of DNA sequences that are amplified in tumors.
A prominent example of oncogene amplification is the involvement of the N-myc gene, which
is related to c-myc, in neuroblastoma, a childhood tumor of neuronal cells. Amplified copies of N-myc
are frequently present in rapidly growing, aggressive tumors, indicating that N-myc amplification is
associated with the progression of neuroblastoma to increasing malignancy. Amplification of another
oncogene, erbB-2, which encodes a receptor protein-tyrosine kinase, is similarly related to progression
of breast and ovarian carcinomas.
V. Functions of Oncogene Products
The viral and cellular oncogenes encompass a large group (numbering more than 100 in total) that
can contribute to the abnormal behavior of malignant cells. As already noted, many of the proteins
encoded by proto-oncogenes regulate normal cell proliferation; in these cases, the elevated expression
or activity of the corresponding oncogene proteins drives the uncontrolled proliferation of cancer cells.
Other oncogene products contribute to other aspects of the behavior of cancer cells, such as failure to undergo
programmed cell death or defective differentiation.
The function of oncogene proteins in regulation of cell proliferation is illustrated by their
activities in growth factor-stimulated pathways of signal transduction, such as the activation of ERK
signaling downstream of receptor protein-tyrosine kinases. The oncogene proteins within this pathway
include polypeptide growth factors, growth factor receptors, intracellular signaling proteins,
transcription factors, and the cell cycle regulatory cyclin D1.
The action of growth factors as oncogene proteins results from their abnormal expression,
leading to a situation where a tumor cell produces a growth factor to which it also responds. The result
is autocrine stimulation of the growth factor-producing cell, which drives abnormal cell proliferation
and contributes to the development of a wide variety of human tumors.
A large group of oncogenes encode growth factor receptors, most of which are protein-tyrosine
kinases. These receptors can be converted to oncogene proteins by alterations of their amino-terminal
domains, which would normally bind extracellular growth factors. For example, the receptor
for platelet-derived growth factor (PDGF) is converted to an oncogene in some human leukemias by a
chromosome translocation in which the normal amino terminus of the PDGF receptor (PDGFR) is
replaced by the amino-terminal sequence of a transcription factor called Tel. Alternatively, genes that
encode receptor protein-tyrosine kinases can be activated by gene
amplification or by point mutations that result in unregulated kinase activity. Other oncogenes
(including src and abl) encode nonreceptor protein-tyrosine kinases that are constitutively activated by
deletions or mutations of regulatory sequences.
The Ras proteins play a key role in mitogenic signaling by coupling growth factor receptors to
activation of the Raf protein-serine/threonine kinase, which initiates a protein kinase cascade leading
to activation of ERK MAP kinase. As discussed already, the mutations that convert ras proto-oncogenes
to oncogenes result in constitutive Ras activity, thus causing activation of the ERK and
PI3K/Akt pathways. The raf gene can similarly be converted to an oncogene by deletions that result in
loss of the amino-terminal regulatory domain of the Raf protein. The consequence of these deletions is
unregulated activity of the Raf protein kinase, which also leads to constitutive ERK activation.
Alternatively, raf proto-oncogenes can be converted to oncogenes by point mutations that result in
elevated Raf kinase activity. A schematic of this effector pathway, as well as its potential points of
dysregulation, is shown in Figure 36.25.
Figure 36.25: Canonical Raf effector pathway. Points of mutations implicated in the
development and progression of human cancers, involving Raf-MEK-ERK or PI3K-Akt
pathway dysregulation, are shown.
The ERK pathway ultimately leads to the phosphorylation of transcription factors and
alterations in gene expression. As might be expected, many oncogenes encode transcriptional
regulatory proteins that are normally induced in response to growth factor stimulation. For example,
transcription of the fos proto-oncogene is induced as a result of phosphorylation of Elk-1 by ERK. Fos
and the product of another proto-oncogene, Jun, are components of the AP-1 transcription factor,
which activates transcription of a number of target genes, including that which gives rise to cyclin D1,
in growth factor-stimulated cells. Constitutive activity of AP-1, resulting from unregulated expression
of either the Fos or Jun oncogene proteins, is sufficient to drive abnormal cell proliferation, leading to
cell transformation. The Myc proteins similarly function as transcription factors regulated by
mitogenic stimuli, and abnormal expression of myc oncogenes contributes to the development of a
variety of human tumors. Other transcription factors are activated as oncogenes by chromosome
translocations in human leukemias and lymphomas.
The signaling pathways activated by growth factor stimulation ultimately regulate
components of the cell cycle machinery that promote progression through the restriction point in G1.
The D-type cyclins are induced in response to growth factor stimulation, at least in part via activation
of the AP-1 transcription factor. These proteins play a key role in coupling growth factor signaling to
cell cycle progression. Perhaps not surprisingly, the gene encoding cyclin D1 is a proto-oncogene,
which can be activated as the oncogene CCND1 by chromosome translocation or gene amplification.
These alterations lead to constitutive expression of cyclin D1, which then drives cell proliferation in
the absence of normal growth factor stimulation. The catalytic partner of cyclin D1, Cdk4, is also
activated as an oncogene, by point mutations in melanomas.
Components of other signaling pathways, including the G protein-coupled signaling
pathways, the NF-κB pathway, and the Hedgehog, Wnt, and Notch pathways, can also act as
oncogenes. For example, activating mutations frequently convert the downstream target of Wnt
signaling, β-catenin, to an oncogene (CTNNB1) in human colon cancers. These activating mutations
stabilize β-catenin, which then forms a complex with Tcf and stimulates transcription of target genes.
The targets of β-catenin/Tcf include the genes encoding c-Myc and cyclin D1, leading to unregulated
cell proliferation. Interestingly, Wnt signaling normally promotes the proliferation of stem cells and
their progeny during the continual epithelial cell renewal that occurs in the colon, indicating that
colon cancer results from abnormal activity of the same pathway that signals physiologically normal
proliferation of colonic epithelial cells.
Although many oncogenes stimulate cell proliferation, the oncogenic activity of some
transcription factors instead results from inhibition of cell differentiation. As noted elsewhere, thyroid
hormone and retinoic acid induce differentiation of a variety of cell types. These hormones diffuse
through the plasma membrane and bind to intracellular receptors that act as transcriptional regulatory
proteins. A mutated form of the retinoic acid receptor (PML/RARα) acts as an oncogene protein
in human acute promyelocytic leukemia. The mutated receptor appears to interfere with the
action of its normal homologs, thereby blocking cell differentiation and maintaining the leukemic cells
in an actively proliferating state. In the case of acute promyelocytic leukemia, high doses of retinoic
acid can overcome the effects of the PML/RARα oncogene protein and induce differentiation
of the leukemic cells. This biological observation has a direct clinical correlation: patients with acute
promyelocytic leukemia can be treated effectively by administration of retinoic acid, which induces
differentiation and blocks continued cell proliferation.
As already emphasized, the failure to undergo programmed cell death
(apoptosis) is a hallmark of cancer cells. Several oncogenes encode proteins that act to promote cell
survival, which, in most animal cells, is dependent on growth factor stimulation. Accordingly, those
oncogenes that encode growth factors, growth factor receptors, and signaling proteins such as Ras act
not only to promote cell proliferation, but also to prevent cell death. The PI3-kinase/Akt signaling
pathway plays an anti-apoptotic role in many growth factor-dependent cells, and the genes encoding
PI3-kinase and Akt act as oncogenes in both retroviruses and human tumors. The downstream targets
of PI3-kinase/Akt signaling include Bad, a proapoptotic member of the Bcl-2 family that is
inactivated as a result of phosphorylation by Akt, as well as the FOXO transcription factor, which
regulates expression of the proapoptotic Bcl-2 family member, Bim. In addition, it is worth mentioning
that Bcl-2 itself was first discovered as the product of an oncogene in human lymphomas. The bcl-2
oncogene is generated by a chromosome translocation that results in elevated expression of Bcl-2,
which blocks apoptosis and maintains cell survival under conditions that normally induce cell death,
thereby contributing to the development of cancer.
II. Tumor Suppressor Genes
The activation of cellular oncogenes represents only one of two distinct types of genetic alterations involved
in tumor development; the other is inactivation of tumor suppressor genes. Oncogenes drive abnormal
cell proliferation as a consequence of genetic alterations that either increase gene expression or lead to
uncontrolled activity of the oncogene-encoded proteins. Tumor suppressor genes represent the
opposite effect—they control growth, normally acting to inhibit cell proliferation and tumor
development. In many tumors, these genes are lost or inactivated, thereby removing negative
regulators of cell proliferation and contributing to the abnormal proliferation of tumor cells. The
functions of tumor suppressor proteins encoded by tumor suppressor genes can be broadly delineated
into one of several categories:
♦ Repression of genes that are essential for the continuance of the cell cycle. If
these genes are not expressed, the cell cycle does not continue, effectively
bringing the cell cycle, and cell division, to a halt.
♦ Coupling the cell cycle to DNA damage. As long as there is damaged DNA in the
cell, it should not divide; the cell cycle should continue only if the detected
damage is repaired.
♦ Initiation of apoptosis. If the damage cannot be repaired, the cell should undergo
apoptosis (programmed cell death).
♦ Some proteins involved in cell adhesion prevent tumor cells from dispersing,
block loss of contact inhibition, and inhibit metastasis. These proteins are known
as metastasis suppressors.
♦ DNA repair proteins are usually classified as tumor suppressors as well, as
mutations in their genes increase the risk of cancer. Examples include the genes
mutated in HNPCC, as well as MEN1 and BRCA. Furthermore, increased mutation rate from
decreased DNA repair leads to increased inactivation of other tumor suppressors
and activation of oncogenes.
The first tumor suppressor gene was identified by studies of retinoblastoma, a rare childhood
ophthalmic tumor. Provided that the disease is detected early, retinoblastoma can be successfully
treated and many patients survive to have families. For this reason, it was recognized that some cases
of retinoblastoma appear to be inherited. In these cases, approximately 50% of the children of an
affected parent develop retinoblastoma, consistent with Mendelian transmission of a single dominant
gene that confers susceptibility to tumor development. Although susceptibility to retinoblastoma is
transmitted as a dominant trait, inheritance of the susceptibility gene is not sufficient to transform a
normal retinal cell into a tumor cell.
Cell transformation associated with the development of retinoblastoma requires two
mutations, which are now known to correspond to the loss of both functional copies of the tumor
susceptibility gene (the Rb tumor suppressor gene) that would be present on homologous chromosomes
of a normal diploid cell. In inherited retinoblastoma, one defective copy of Rb is genetically
transmitted. The loss of this single Rb copy is not by itself sufficient to trigger tumor development, but
retinoblastoma almost always develops in these individuals as a result of a second mutation leading to
the loss of the remaining normal Rb allele. Noninherited retinoblastoma, in contrast, is rare, since its
development requires two independent somatic mutations to inactivate both normal copies of Rb in
the same cell.
The observation that both alleles that code for Rb must be affected before an effect is manifested is
more generally referred to as the “two-hit hypothesis.” This is because if only one allele for a tumor
suppressor gene is damaged, the second can still produce the correct protein. In other words, mutant
tumor suppressor alleles are usually recessive whereas mutant oncogene alleles are typically
dominant.
The two-hit hypothesis was first proposed by A.G. Knudson for cases of retinoblastoma, when
Knudson observed that the age of onset of retinoblastoma followed second order kinetics. This pattern
implied that two independent genetic events were necessary. Knudson recognized that this was
consistent with a recessive mutation involving a single gene, but requiring bi-allelic mutation.
Oncogene mutations, in contrast, generally involve a single allele because they are gain-of-function
mutations.
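The kinetic argument can be illustrated numerically. The following is a minimal sketch, not a reconstruction of Knudson's analysis: it assumes that each allele is inactivated independently at a constant rate, and the rate value and function names are hypothetical choices made only for illustration. When events are rare, the chance of one required event grows roughly in proportion to time, whereas the chance of two required events grows roughly with the square of time.

```python
import math

def p_one_hit(rate, t):
    """Probability that a single allele has been inactivated by time t."""
    return -math.expm1(-rate * t)  # equals 1 - exp(-rate * t)

def p_two_hits(rate, t):
    """Probability that both alleles have been independently inactivated by time t."""
    return p_one_hit(rate, t) ** 2

RATE = 1e-6  # hypothetical per-allele inactivation rate per cell per year

# Doubling the elapsed time roughly doubles the one-hit probability
# but roughly quadruples the two-hit probability.
ratio_one = p_one_hit(RATE, 2) / p_one_hit(RATE, 1)
ratio_two = p_two_hits(RATE, 2) / p_two_hits(RATE, 1)
print(ratio_one, ratio_two)
```

In this toy model, a cell that inherits one defective copy of Rb needs only the single-event curve, which is consistent with the much higher tumor incidence seen in the inherited form.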
The functional nature of the Rb gene as a negative regulator of tumorigenesis is not isolated to
retinoblastoma; it is also involved in more common adult tumors. In particular, studies of the cloned
gene have established that Rb is lost or inactivated in many bladder, breast, and lung carcinomas. The
significance of the Rb tumor suppressor gene thus extends beyond retinoblastoma, and mutations of the Rb gene
contribute to a substantial fraction of human cancers. The Rb protein is a key target for the oncogene
proteins of several DNA tumor viruses, including SV40, adenoviruses, and human papillomaviruses,
which bind to Rb and inhibit its activity. Transformation by these viruses thus results, at least in part,
from inactivation of Rb at the protein level, rather than from mutational inactivation of the Rb gene.
Characterization of Rb as a tumor suppressor gene served as the conceptual catalyst for
research that identified many additional tumor suppressor genes that contribute to the development of
a host of human malignancies. Some of these genes were identified as the cause of rare inherited
cancers, playing a role similar to that of Rb in hereditary retinoblastoma. Other tumor suppressor
genes have been identified as genes that are frequently deleted or mutated in common noninherited
cancers of adults, such as colon carcinoma. In either case, evidence strongly supports the proposition
that tumor suppressor genes are involved in the development of both inherited and noninherited forms
of cancer. In fact, mutations of some tumor suppressor genes appear to be the most common
molecular alterations leading to human tumor development.
The second tumor suppressor gene to have been identified is p53, which is frequently
inactivated in a wide variety of human cancers, including leukemias, lymphomas, sarcomas, brain
tumors, and carcinomas of many tissues, including breast, colon, and lung. Certain mutations in the
p53 gene represent an exception to the “two-hit” rule for tumor suppressors. p53 mutations
can function as a ‘dominant negative,’ meaning that a mutated p53 protein can prevent the function of
normal protein from the un-mutated allele.
In total, mutations of p53 play at least some role in more than half of all cancers, making it the most
common target of genetic alterations in human malignancies. It is also of interest that inherited
mutations of p53 are responsible for genetic transmission of a rare hereditary autosomal dominant
cancer syndrome, Li-Fraumeni syndrome, in which affected individuals develop any of several
different types of cancers. In addition, the p53 protein (like Rb) is a target for the oncogene proteins of
SV40, adenoviruses, and human papillomaviruses. In one such example, human papillomavirus
(HPV) encodes a protein, E6, which binds to the p53 protein and inactivates it. This mechanism, in
synergy with the inactivation of the cell cycle regulator Rb by the HPV protein E7, allows for repeated
cell division manifested clinically as warts. Certain HPV types, in particular types 16 and 18, can also
lead to progression from a benign wart to low-grade or high-grade cervical dysplasia, which are reversible
forms of precancerous lesions. Persistent infection of the cervix can cause irreversible changes leading
to carcinoma in situ and eventually invasive cervical cancer. This results from the effects of HPV genes,
particularly those encoding E6 and E7—two viral oncoproteins that are preferentially retained and
expressed in cervical cancers by integration of the viral DNA into the host genome.
The p53 protein is continually produced and degraded in cells of healthy people. The
degradation of the p53 protein is associated with binding of MDM2. In a negative feedback loop,
MDM2 itself is induced by the p53 protein. Mutant p53 proteins often fail to induce MDM2, causing
p53 to accumulate at very high levels. Moreover, the mutant p53 protein itself can inhibit normal p53
protein levels. In some cases, single missense mutations in p53 have been shown to disrupt p53 stability
and function. The function of MDM2 as part of the p53 pathway in a normal cell is shown in Figure
36.26.
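The negative feedback loop described above can be sketched as a toy simulation. This is an illustrative model only: the two rate equations, the simple Euler integration, and every rate constant are assumptions chosen for demonstration, not measured values. The function name `simulate_p53` and its parameter `mdm2_induction` are hypothetical.

```python
def simulate_p53(mdm2_induction, steps=20000, dt=0.01):
    """Euler integration of a toy p53/MDM2 negative feedback loop.

    mdm2_induction is the strength with which p53 induces MDM2;
    setting it to 0 mimics a mutant p53 that fails to induce MDM2.
    """
    production = 1.0    # constant p53 synthesis (arbitrary units)
    basal_decay = 0.1   # MDM2-independent p53 turnover
    mdm2_effect = 1.0   # MDM2-dependent p53 degradation
    mdm2_decay = 0.5    # MDM2 turnover
    p53, mdm2 = 0.0, 0.0
    for _ in range(steps):
        d_p53 = production - (basal_decay + mdm2_effect * mdm2) * p53
        d_mdm2 = mdm2_induction * p53 - mdm2_decay * mdm2
        p53 += d_p53 * dt
        mdm2 += d_mdm2 * dt
    return p53, mdm2

wild_type_p53, wild_type_mdm2 = simulate_p53(mdm2_induction=1.0)
mutant_p53, mutant_mdm2 = simulate_p53(mdm2_induction=0.0)
print(wild_type_p53, mutant_p53)
```

With these assumed constants, the p53 level settles far lower when the feedback is intact than when MDM2 induction is abolished, which mirrors the accumulation of mutant p53 described in the text.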
Figure 36.26: In a normal cell, p53 is inactivated by its negative regulator, mdm2. Upon
DNA damage or other stresses, various pathways will lead to the dissociation of the p53-mdm2 complex. Once activated, free p53 will induce either cell cycle arrest to repair the
cell or apoptosis to discard the damaged cell. The mechanism by which one of these
pathways is selected by p53 is an area of ongoing research.
Like p53, the INK4 and PTEN tumor suppressor genes are very frequently mutated in several
common cancers, including lung and prostate cancers and melanoma. Other tumor suppressor genes
(including APC, TβRII, Smad2, and Smad4) are frequently inactivated in colon cancers. In addition to
being involved in non-inherited cases of this common adult cancer, inherited mutations of the APC
gene are responsible for a rare hereditary form of colon cancer, called familial adenomatous polyposis
(FAP). Individuals with this condition develop hundreds of benign colon adenomas (polyps), some of
which inevitably progress to malignancy. Inherited mutations of two other tumor suppressor genes,
BRCA1 and BRCA2, are responsible for hereditary cases of breast cancer, which account for about 5%
of total breast cancer incidence. Additional tumor suppressor genes have been implicated in the
development of brain tumors, pancreatic cancer, and basal cell carcinomas of the skin, as well as several rare
inherited cancer syndromes, such as Wilms’ tumor.
III. Products of Tumor Suppressor Genes
In contrast to proto-oncogene and oncogene proteins, the proteins encoded by most tumor suppressor
genes inhibit cell proliferation or survival. Inactivation of tumor suppressor genes therefore leads to
tumor development by eliminating negative regulatory proteins. In many cases, tumor suppressor
proteins inhibit the same regulatory pathways that are stimulated by the products of oncogenes.
The protein encoded by the PTEN tumor suppressor gene is an interesting example of
antagonism between oncogene and tumor suppressor gene products. The PTEN protein is a lipid
phosphatase that dephosphorylates the 3 position of phosphatidylinositides, such as
phosphatidylinositol 3,4,5-trisphosphate (PIP3). By dephosphorylating PIP3, PTEN antagonizes the
activities of PI3-kinase and Akt, both of which can act as oncogenes by promoting cell survival.
Conversely, inactivation or loss of the PTEN tumor suppressor protein can contribute to tumor
development as a result of increased levels of PIP3, activation of Akt, and inhibition of programmed cell death.
Proteins encoded by both oncogenes and tumor suppressor genes also function in the
Hedgehog signaling pathway. The receptor Smoothened is encoded by an oncogene in basal cell carcinomas,
whereas Patched (the negative regulator of Smoothened) is encoded by a tumor suppressor gene.
Several tumor suppressor genes encode transcriptional regulatory proteins. A good example is
the product of WT1, which is frequently inactivated in Wilms’ tumor (a childhood renal tumor). The
WT1 protein is a repressor that appears to suppress transcription of a number of growth-factor
inducible genes. One of the targets of WT1 is thought to be the gene that encodes an insulin-like
growth factor, which is overexpressed in Wilms’ tumor and may contribute to tumorigenesis by acting
as an autocrine growth factor. Inactivation of WT1 may thus lead to abnormal growth factor
expression, which in turn drives cell proliferation. Two other tumor suppressor genes, Smad2 and
Smad4, encode transcription factors that are activated by TGF-β signaling and lead to inhibition of cell
proliferation. Consistent with the activity of TGF-β in inhibiting cell proliferation, the TGF-β receptor
is also encoded by a tumor suppressor gene (TβRII).
The products of the Rb and INK4 tumor suppressor genes regulate cell cycle progression at the
same point as that affected by cyclin D1 and Cdk4, both of which can act as oncogenes. Rb inhibits
progression through the restriction point in G1 by repressing transcription of a number of genes
involved in cell cycle progression and DNA synthesis. In normal cells, passage through the restriction
point is regulated by Cdk4,6/cyclin D complexes, which phosphorylate and inactivate Rb. Mutational
inactivation of Rb in tumors thus removes a key negative regulator of cell cycle progression. The INK4
tumor suppressor gene, which encodes the Cdk inhibitor p16, also regulates passage through the
restriction point; in normal cells, p16 inhibits Cdk4,6/cyclin D activity. Inactivation of INK4 therefore
leads to elevated activity of Cdk4,6/cyclin D complexes, resulting in uncontrolled phosphorylation of
Rb.
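The regulatory logic of the restriction point can be summarized in a small toy model. This is a sketch under stated assumptions: the activity formula, the threshold, and all numerical values are hypothetical and chosen only so that the scenarios described in the text can be compared side by side. The names `kinase_activity` and `passes_restriction_point` are illustrative.

```python
# Toy model of the restriction point; every number here is an arbitrary,
# illustrative assumption rather than a measured value.
KINASE_THRESHOLD = 0.15  # hypothetical activity needed to inactivate Rb

def kinase_activity(cyclin_d, p16):
    """Cdk4,6/cyclin D activity rises with cyclin D and falls with p16."""
    return cyclin_d / (1.0 + p16)

def passes_restriction_point(rb_functional, cyclin_d, p16):
    """Return True if the cell proceeds past the restriction point in G1."""
    if not rb_functional:
        return True  # no Rb left to repress cell cycle genes
    # Functional Rb blocks progression until it is phosphorylated.
    return kinase_activity(cyclin_d, p16) > KINASE_THRESHOLD

scenarios = {
    "normal, no growth factor": passes_restriction_point(True, 0.2, 1.0),
    "normal, growth factor": passes_restriction_point(True, 2.0, 1.0),
    "INK4 lost, no growth factor": passes_restriction_point(True, 0.2, 0.0),
    "cyclin D1 overexpressed, no growth factor": passes_restriction_point(True, 2.0, 1.0),
    "Rb lost, no growth factor": passes_restriction_point(False, 0.2, 1.0),
}
for name, result in scenarios.items():
    print(f"{name}: {result}")
```

In this sketch, loss of either tumor suppressor (Rb or INK4) and overexpression of the oncogene product cyclin D1 all produce the same outcome: passage through the restriction point without growth factor stimulation.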
The p53 gene product regulates both cell cycle progression and apoptosis. DNA damage leads
to rapid induction of p53, which activates transcription of both proapoptotic and cell cycle inhibitory
genes. The effects of p53 on apoptosis are mediated in part by activating the transcription of
proapoptotic members of the Bcl-2 family—PUMA and Noxa—that induce programmed cell death.
Unrepaired DNA damage normally induces apoptosis of mammalian cells; this response is presumably
advantageous to the organism because it eliminates cells carrying potentially deleterious mutations,
including cells that may develop into cancerous cells. Cells lacking p53 fail to undergo apoptosis in
response to agents that damage DNA, including radiation and many of the drugs used in
chemotherapy treatment. This failure to undergo apoptosis in response to DNA damage contributes to
the resistance of many tumor cells to chemotherapeutic agents. In addition, loss of p53 appears to
interfere with apoptosis induced by other stimuli, such as growth factor deprivation and oxygen
deprivation. These effects of p53 on cell survival are thought to account for the high frequency of p53
mutations in human cancers.
In addition to inducing apoptosis, p53 blocks cell cycle progression in response to DNA
damage by inducing the Cdk inhibitor p21. The p21 protein blocks cell cycle progression by acting as
a general inhibitor of Cdk/cyclin complexes, and the resulting cell cycle arrest presumably allows time
for damaged DNA to be repaired before it is replicated. Loss of p53 prevents this damage-induced cell
cycle arrest, leading to increasing mutation frequencies and a general instability of the cellular
genome. Such genetic instability is a common property of cancer cells, and it may further contribute to
alterations in oncogenes and tumor suppressor genes during tumor progression. These regulatory
mechanisms, as well as other pathways previously discussed, are shown in Figure 36.27.
Figure 36.27: RAS induces the transcriptional upregulation of growth factors and
interferes with transforming growth factor-β (TGFβ) signaling through inhibition of
TGFβ receptor expression or downstream signaling by downregulating the expression of
SMAD3, as well as the nuclear accumulation of SMAD2 and SMAD3. RAS also
upregulates the levels of cyclin D1 and suppresses the cyclin-dependent kinase inhibitor
(CDKI) p27. The newly synthesized cyclin D1 associates with and activates the cyclin-dependent kinases CDK4 and CDK6, leading to the phosphorylation of RB and the
subsequent dissolution of the RB–E2F transcription factor complexes. Once released,
E2F transcription factors transactivate several genes that are required for cell cycle
progression, including cyclin E (CCNE) and cyclin A (CCNA) that induce transition
through the G1/S checkpoint (not shown). Hyperproliferative cues from activation of the
RAS oncogene can result in replicative stress leading to DNA damage. In response to
DNA damage, cells can activate the DNA damage checkpoints to transiently arrest and
restore the integrity of the genome, enter a state of irreversible arrest (senescence) or
undergo apoptosis. Inaccurate repair of DNA damage can lead to mutations and
chromosome aberrations, thereby contributing to tumorigenesis. The asterisk represents
the mutational activation of RAS; P represents phosphorylation.
Although their function remains to be fully understood, the products of the BRCA1 and
BRCA2 genes (which are responsible for some inherited breast and ovarian cancers) also appear to be
involved in checkpoint control of cell cycle progression and repair of double-stranded breaks in DNA.
BRCA1 and BRCA2 thus function as stability genes, acting to maintain the integrity of the genome.
Mutations in genes of this type lead to the development of cancer not as a result of direct effects on cell
proliferation or survival, but because their inactivation leads to a high frequency of mutations in
oncogenes or tumor suppressor genes. Other stability genes whose loss contributes to the development
of human cancers include the ATM gene, which acts in the DNA damage checkpoint, the mismatch
repair genes that are defective in some inherited colorectal cancers, and the nucleotide excision repair
genes that are mutated in the dermatological condition known as xeroderma pigmentosum.
Chapter 36 Problems
Passage 36.1 (Questions 1-4)
In investigating the avian retrovirus RSV, researchers noted that infection of chicken fibroblast cells by
RSV led to neoplastic transformation of the infected host cell. Their previous research indicated that a
single viral gene, src, was responsible. Because many highly oncogenic retroviruses were isolated from
the tumors of infected animals, they hypothesized that retroviral oncogenes are derived from related
genes of host cells. Consistent with this suggestion, normal cells of several species were found to
contain retrovirus-related DNA sequences that could be detected by nucleic acid hybridization, but
that could not alone lead to cell transformation. However, it was unclear whether these sequences
were related to the retroviral oncogenes or to the genes required for virus replication.
To address this question, the researchers isolated transformation-defective mutants of RSV that
sustained a 1.5 kb deletion corresponding to most or all of the src gene. They synthesized a radioactive
DNA probe composed of short single-stranded DNA (ssDNA) fragments complementary to the entire
genomic RNA of normal RSV. This probe was then hybridized to an excess of RNA isolated from
transformation-defective mutants. Fragments of cDNA that were complementary to the viral
replication genes hybridized to the transformation-defective RSV RNA. In contrast, cDNA fragments
that were complementary to src were unable to hybridize and remained single-stranded.
The radioactive src cDNA was then used as a hybridization probe to attempt to detect related DNA
sequences in normal avian fibroblast cells not infected by RSV. The extent of hybridization of src
cDNA to normal chicken, quail and duck DNA is shown in Figure 1. When introduced into non-avian
fibroblast cells, little cDNA hybridization occurred, but transformation was still noted in a large
percentage of observed cells.
Figure 1: Hybridization of src-specific cDNA to normal chicken, quail, and duck DNA. (Note: final
cDNA hybridization > 50% was considered reflective of significant sequence homology.)
1. Given the size of the deletion observed in the transformation-defective mutants isolated by the
researchers, what is the best prediction regarding a property of the unhybridized region of the src-specific probe?
A. It is homologous with approximately 1.5 kb of RSV RNA taken from a normal RSV virus.
B. It is homologous with approximately 1.5 kb of RSV RNA in the transformation-defective RSV
mutant.
C. It is homologous with approximately 1.5 kb of cDNA derived from the RNA taken from a
transformation-defective RSV mutant.
D. Its nucleotide composition is identical to that of a 1.5 kb sequence of RNA taken from a normal
RSV virus.
2. Which result most supports the hypothesis that retroviral oncogenes are derived from related genes
present in host cells?
A. cDNA fragments that were complementary to the viral replication genes hybridized to the
transformation-defective RSV RNA.
B. cDNA fragments that were complementary to src were unable to hybridize to any segment of RSV
RNA.
C. src cDNA probes hybridized extensively to normal avian DNA.
D. cDNA formed RNA-DNA duplexes when hybridized to transformation-defective RSV RNA.
3. Researchers postulated that the structural similarity between the src gene and homologous
sequences found in normal avian fibroblasts, known as c-src, exists because of the incorporation of host
cell DNA into an ancestral RSV virus. If the researchers are correct, which of the following is most
likely to be true of c-src?
A. Host cell DNA transferred to the RSV ancestor possessed the independent ability to transform
avian fibroblast cells prior to the transfer.
B. RSV causes host cell transformation only when c-src is present.
C. Mutations leading to the oncogenic potential of the src gene occurred in RSV.
D. The structural similarity observed is due to transfection of the host cell by RSV with mutated host
cell DNA.
4. Following mutation, the gene to which the cDNA probe hybridized in avian cells promotes
angiogenesis, cell proliferation and cell migration. Before modification, such a gene is best described as
a(n):
A. oncogene.
B. tumor suppressor gene.
C. reporter gene.
D. proto-oncogene.
The following questions are NOT based on a descriptive passage.
5. Which of the following is NOT true regarding DNA footprinting?
A. It is used to identify protein binding sites on DNA.
B. The DNA sequence is labelled at both ends of the tested strands.
C. The site at which the protein binds will be protected from digestion by DNase.
D. Fragments are subject to electrophoresis following digestion.
6. Which of the following statements correctly describes the functionally and structurally different
forms of apolipoprotein B synthesized in the human liver and intestine?
A. Apo-B100 is synthesized in the liver by translation of the unedited mRNA transcript.
B. Apo-B48 is synthesized in the intestine by translation of an edited mRNA transcript in which the
editing reaction has eliminated a stop codon.
C. Apo-B100 is synthesized in the liver by translation of an edited mRNA transcript in which the
editing reaction has generated a stop codon.
D. Apo-B48 is synthesized in the intestine by translation of the unedited mRNA transcript.
7. Which of the following is true of X chromosome inactivation in human females?
I. One of the two X chromosomes is inactivated early in female development.
II. Xist is produced from a gene on one of the two X chromosomes.
III. Xist recruits proteins that induce condensation of chromatin.
A. I only
B. II only
C. II and III only
D. I, II and III
8. What is the principal function of splicing factors that are not components of snRNPs?
A. They introduce double-stranded breaks into unedited mRNA transcripts.
B. They mediate the process by which mRNAs that lack open reading frames are degraded.
C. They direct snRNPs to the correct splice site.
D. They recognize GC-rich inverted repeat segments of mRNA molecules.
9. Which of the following regulatory elements is responsible for the division of individual domains of
chromatin into fixed spatial regions and prevents the action of trans-acting elements in one domain
from interacting with regulatory elements in an adjacent domain?
A. Repressor
B. Promoter
C. Enhancer
D. Insulator
10. Can a proto-oncogene be converted to an oncogene without a change or mutation in its coding
sequence?
A. No, a proto-oncogene can only be converted to an oncogene via an activating sequence change.
B. Yes, a proto-oncogene can be activated by a translocation that puts it under the control of a
mutated, inactive promoter sequence.
C. Yes, a proto-oncogene can be activated by a mutation that silences a tumor suppressor gene.
D. Yes, a proto-oncogene may be expressed in abnormal cell types.
Chapter 36 Solutions
1. A.
According to the passage, the transformation-defective mutants of RSV sustained a 1.5 kb deletion
corresponding to most or all of the src gene. The researchers synthesized a radiolabeled DNA probe
composed of short ssDNA fragments complementary to the genome of normal RSV, and hybridized it
to RNA isolated from the transformation-defective mutant lacking a 1.5 kb segment corresponding to
the RSV src gene. For this reason, it is expected that the segment of the probe which does not
hybridize with RSV RNA lacking src must be complementary to the region of normal RSV RNA containing src and correspond to the 1.5 kb deletion in the transformation-defective mutant. This is
consistent with choice A, and eliminates choices B and C. In the case of choice D, the unhybridized
region will be derived from and complementary to a 1.5 kb sequence of RNA in the normal RSV
virus, but, as it is composed of DNA, its nucleotide composition will not be identical. Choice D is thus
incorrect.
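The reasoning in this solution can be checked with a short sketch. The sequences used here are invented stand-ins (real RSV sequences are far longer), and the helper names `reverse_transcribe` and `hybridizes` are hypothetical. The sketch treats hybridization as perfect base-pairing only, which is a simplifying assumption.

```python
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna):
    """Return the complementary DNA strand (written 5'->3') for an RNA written 5'->3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna))

def hybridizes(cdna, rna):
    """True if the cDNA is fully complementary to some stretch of the RNA."""
    return any(
        reverse_transcribe(rna[i:i + len(cdna)]) == cdna
        for i in range(len(rna) - len(cdna) + 1)
    )

replication_region = "AUGGCUAAC"  # hypothetical stand-in for the replication genes
src_region = "GGAUCCGUA"          # hypothetical stand-in for src
normal_rsv = replication_region + src_region
mutant_rsv = replication_region   # transformation-defective mutant: src deleted

src_probe = reverse_transcribe(src_region)
replication_probe = reverse_transcribe(replication_region)

print(hybridizes(replication_probe, mutant_rsv))  # replication fragments pair with mutant RNA
print(hybridizes(src_probe, mutant_rsv))          # src fragments find no partner
print(hybridizes(src_probe, normal_rsv))          # but they pair with normal RSV RNA
```

The src-specific fragment is complementary to the src region of normal RSV RNA, yet it is made of DNA and contains thymine rather than uracil, so its nucleotide composition cannot be identical to that of the RNA.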
2. C.
The observation stated in choice C is reflected in the text of the passage and in Figure 1. The src
cDNA probes derived from RSV hybridized extensively to normal avian DNA, indicating strong
sequence similarity between the oncogenic src gene of RSV and sequences in normal avian DNA. This
further supports the hypothesis that src in RSV was derived from a similar gene in normal avian DNA.
This conclusion is most similar to the idea referred to in the question. Choices A, B and D are
incorrect, as they only reflect the sequence similarities or differences, respectively, between normal and
transformation-defective RSV. They show no relationship between RSV sequences and
complementary sequences found in normal avian DNA.
3. C.
It is possible that at one point an ancestral virus mistakenly incorporated the c-src gene of its cellular
host. Eventually this normal gene mutated into an abnormally functioning oncogene within RSV.
Once the virus carrying the mutated oncogene (known as v-src) infects a chicken, it can lead to
cancer. Choice C indicates a step in this sequence of events. The passage indicates that sequences
complementary to src in normal avian cells do not alone lead to cell transformation. This contradicts
choice A. Choice B is also unlikely to be true. The passage states that when introduced into non-avian
fibroblast cells, little cDNA hybridization occurred, but transformation was still noted in a large
percentage of observed cells. This indicates that transformation by the RSV virus did not depend on
the presence of c-src or another structurally similar sequence. Finally, choice D is also false. The
significant hybridization between the cDNA probe and normal avian fibroblast cells indicates that
structural similarity between src and c-src exists in cells not infected by, and therefore not previously
transfected with, RSV-related genetic material.
4. D.
Promotion of angiogenesis, cell migration and cellular proliferation are consistent with events seen in
cell transformation and cell reproduction. An oncogene is a gene that normally directs cell growth
which may be associated with cancer. A proto-oncogene is a normal gene that can become an
oncogene due to mutations or increased expression. Proto-oncogenes code for proteins that help to
regulate cell growth and differentiation and as such are often involved in signal transduction and
execution of mitogenic signals, usually through their protein products. Upon mutagenic activation, a
proto-oncogene (or its gene product) becomes a tumor-inducing agent, an oncogene. Prior to
modification, the gene was not affecting cell growth processes but after mutation, it did, making choice
A incorrect and choice D, a proto-oncogene, the correct designation. A tumor suppressor gene is a
gene that protects a cell from transformation. The mutation of tumor suppressor genes, however,
would not directly lead to the promotion of transformative characteristics in cells, including those
mentioned in the question, making choice B incorrect. Choice C is incorrect; a reporter is a gene that
a scientist will attach to a regulatory sequence of another gene of interest in the course of an
experiment. Certain genes are chosen as reporters because the characteristics they confer on
organisms expressing them (via protein production) are easily identified and measured, or because they
are selectable markers. Reporter genes are used as an indication of whether a certain gene has been
taken up by, or expressed in, the cell or organism studied.
5. B.
As discussed in the chapter, during DNA footprinting the DNA sequence is radiolabelled at one end
only. The statements in choices A, C and D are all true of the normal footprinting process, and are
incorrect.
6. A.
Apo-B100 is synthesized in the liver by translation of the unedited mRNA transcript. Apo-B48 is
synthesized in the intestine by translation of an edited mRNA transcript in which the editing reaction
has generated a premature stop codon. Thus choice A is correct and choices B, C and D are incorrect.
7. D.
Roman numerals I, II and III are all true. One of the two X chromosomes is inactivated early in
female development (or in that of XXY males). An RNA called Xist is produced by the Xist gene on
one of the two X chromosomes, and binds to most of the genes located on the chromosome. Xist RNA
recruits proteins that induce chromatin condensation and conversion of most of the inactive X
material to heterochromatin.
8. C.
Choice C is correct. Splicing factors that are not components of snRNPs direct snRNPs to the correct
splice sites by binding to specific sequences in the pre-mRNA.
9. D.
Insulators are barrier regulatory elements that divide chromosomes into individual domains of
chromatin structure that can be either euchromatin or heterochromatin, but that cannot spread beyond
the insulator. They can also prevent an enhancer in one domain from acting on a promoter in an
adjacent domain. This is choice D. Choice A is incorrect. A repressor is a DNA- or RNA-binding
protein that inhibits the expression of one or more genes by binding to the operator or associated
silencers. Choices B and C are also incorrect. A promoter is a region of DNA that initiates
transcription of a particular gene. Promoters are located near the transcription start sites of genes, on
the same strand and upstream on the DNA (towards the 5' region of the sense strand). An enhancer is
a short region of DNA that can be bound with proteins (activators) to activate transcription of a gene
or genes. These proteins are usually referred to as transcription factors.
10. D.
If expressed in an abnormal cell type, a proto-oncogene or its protein product(s) may function
oncogenically in the cell, even if it would not do so when expressed in a proper cell type. This is choice
D. Choice A is false for the reason given in support of choice D. Choices B and C are also false; a
proto-oncogene can be activated without a change or mutation in its coding sequence by gene
amplification, or if a translocation event puts it under the control of an activator. However, the loss of
a functional tumor suppressor gene alone does not confer oncogenic character on a proto-oncogene.
Some further mutation converting the proto-oncogene to an oncogene is ordinarily still required.
Chapter 37
Recombinant DNA and Biotechnology
A. INTRODUCTION
Classical experiments in molecular biology were strikingly successful in developing our
fundamental concepts of the nature and expression of genes. Since these studies were based primarily
on genetic analysis, their success depended largely on the choice of simple, rapidly replicating
organisms; bacteria and viruses served as frequent model systems. It was not clear, however, how these
fundamental principles could be extended to the complexities inherent in eukaryotic cells, since the
genomes of most eukaryotes are thousands of times larger than those of bacteria. These obstacles were
overcome by the development of recombinant DNA technology, which provided scientists with a
means of isolating, sequencing and manipulating genes derived from any type of cell. The application
of recombinant DNA technology has thus enabled detailed molecular studies of the structure and
function of eukaryotic genes and genomes, thereby revolutionizing our understanding of molecular
and cell biology.
Recombinant DNA technology, also called gene cloning or molecular cloning, is a general
term that encompasses a number of experimental protocols leading to the transfer of DNA from one
organism to another. There is no single method that can be used to satisfy this objective. However, a
recombinant DNA experiment often follows a similar sequence:
♦ The DNA from a donor organism (variously called cloned, insert, target, or foreign DNA)
is extracted, enzymatically cleaved, and joined to another DNA entity (a cloning
vector). This forms a new, recombinant DNA molecule, often called a DNA
construct.
♦ The DNA construct is transferred into and maintained within a host cell. The
introduction of DNA into a bacterial host cell is called transformation.
♦ Those host cells that take up the DNA construct (transformed cells) are identified
and selected by separation or isolation from untransformed cells.
♦ If required, a DNA construct can be created so that the protein product encoded
by the cloned DNA sequence is produced in the host cell.
B. RESTRICTION ENZYMES
The first step in the development of recombinant DNA technology was the characterization of
restriction endonucleases, which are enzymes that cleave DNA at specific sequences. These enzymes
were identified in bacteria, where they apparently provide a defense against the entry of foreign DNA
(e.g., from a virus) into the cell. Bacteria have a variety of endonucleases that cleave DNA at hundreds
of distinct recognition sites, each of which consists of a specific sequence of four to eight base pairs.
Many of these nucleotide sequences are palindromic, meaning the base sequence reads the same
backward and forward. An example of such a sequence is shown in Figure 37.1.
Figure 37.1: The sequence of nucleotides in a palindromic recognition site is the same in
the forward and reverse strands when both are read in the same 5’ to 3’ or 3’ to 5’
orientation.
Two types of palindromic sequences exist in DNA. A mirror-like palindrome is similar to that
which could be found in ordinary text, in which a sequence is the same when read in the forward and
backward directions on a single strand of DNA, as in GTAATG. The inverted repeat palindrome is
also a sequence that reads the same in both directions, but the forward and backward sequences are
found in complementary DNA strands (i.e., of double-stranded DNA), as in GTATAC (GTATAC
being complementary to CATATG). The sequence shown in Figure 37.1 is an inverted repeat
palindrome. Inverted repeat palindromes are more common and have greater biological importance
than mirror-like palindromes.
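The distinction between the two palindrome types can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the EcoRI site GAATTC is included only as an extra test case alongside the two sequences given above.

```python
# Watson-Crick pairing used to build the complementary strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the complementary strand, written in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_mirror_palindrome(seq):
    """True if one strand reads the same forward and backward."""
    s = seq.upper()
    return s == s[::-1]

def is_inverted_repeat(seq):
    """True if the strand equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)

for site in ("GTAATG", "GTATAC", "GAATTC"):
    print(site, is_mirror_palindrome(site), is_inverted_repeat(site))
```

Under these definitions GTAATG is a mirror-like palindrome only, while GTATAC and GAATTC are inverted repeat palindromes only.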
Restriction enzymes also differ in terms of the structure of the ends of the double-stranded
DNA (dsDNA) fragments which they produce. Certain restriction enzymes form so-called “sticky
ends”; others form “blunt ends.” The simplest end of a double stranded DNA molecule is called a
blunt end. In a blunt-ended molecule, both strands terminate in a base pair.
Non-blunt ends are created by various overhangs. An overhang is a stretch of unpaired
nucleotides in the end of a DNA molecule. These unpaired nucleotides can be in either strand,
creating either 3' or 5' overhangs. In most cases, these overhangs are palindromic. An example is given
in Figure 37.2.
Figure 37.2: Single nucleotide overhang
The simplest case of an overhang is a single nucleotide. This is most often adenosine, which is
established as a 3' overhang by some DNA polymerases. Most commonly, this is used in cloning PCR
products created by such an enzyme. The product is joined with a linear DNA molecule with 3'
thymine overhangs. Since adenine and thymine form a base pair, this facilitates the joining of the two
molecules by a ligase, yielding a circular molecule.
Longer overhangs are called cohesive ends or sticky ends. They are most often created by
restriction endonucleases when they cut DNA. Often, they cut the two DNA strands four base pairs
from each other, creating a four-base 5' overhang in one molecule and a complementary 5' overhang
in the other. These ends are called cohesive since they are easily rejoined by a ligase. Also, since
different restriction endonucleases usually create different overhangs, it is possible to cut a piece of
DNA with two different enzymes before ligating it to a DNA molecule with ends formed by the same
enzymes. Since the overhangs have to be complementary in order for the ligase to function properly,
the two molecules can only join in one orientation. Sticky-end restriction digests are shown in Figure
37.3.
Figure 37.3: Sticky ends formed by restriction digestion can base pair in
complementary overhang regions.
Naturally occurring restriction endonucleases are categorized into four groups (types I, II, III,
and IV) based on their composition and enzyme cofactor requirements, the nature of their target
sequence, and the position of their DNA cleavage sites relative to the target sequence. The
differentiating characteristics of these endonucleases are detailed below.
♦ Type I enzymes cleave at sites remote from a recognition site. They require both
ATP and S-adenosyl-L-methionine to function. They are multifunctional proteins
with both restriction and methylase activities.
♦ Type II enzymes cleave within or at short specific distances from recognition
sites; most require magnesium. They are single function restriction enzymes
lacking methylase activity.
♦ Type III enzymes cleave at sites a short distance from a recognition site. They
require ATP, but do not hydrolyze it. S-adenosyl-L-methionine stimulates the
reactions which they catalyze, but is not required for the enzymes to function. The
enzymes exist as part of a complex with a modification methylase that methylates DNA at the
recognition site.
♦ Type IV enzymes target modified DNA, specifically methylated,
hydroxymethylated and glucosyl-hydroxymethylated DNA.
Since restriction endonucleases digest DNA at specific sequences, they can be used to cleave a
DNA molecule at unique sites, forming fragments of variable lengths. These digested fragments can be
separated according to size by gel electrophoresis, as shown in Figure 37.4.
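As a rough illustration of digestion followed by size separation, the sketch below cuts a made-up linear sequence at every EcoRI site (GAATTC, cut after the first G) and then orders the fragments as they would resolve on a gel. The function name, the example sequence, and the `cut_offset` parameter are all assumptions made for illustration, and only the top strand is modeled.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments

# Hypothetical 27-base sequence containing two EcoRI sites.
dna = "ATGCGAATTCTTAGGCCGAATTCAAAT"
fragments = digest(dna)

# Smaller fragments migrate farther, so sort by length to mimic the gel.
for frag in sorted(fragments, key=len):
    print(len(frag), frag)
```

Two cut sites in a linear molecule give three fragments; a circular molecule with the same two sites would give only two.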
Figure 37.4: Restriction enzyme digestion results in cleavage at specific sequence sites
(shown as grey arrows) in DNA. These fragments are then separated by electrophoresis in
an agarose gel. The DNA fragments migrate toward the positive electrode (anode), with
smaller fragments moving more rapidly through the gel. Following electrophoresis, the
DNA is stained with a fluorescent dye and photographed. The sizes of DNA fragments
are indicated.
The location of cleavage sites for multiple different restriction endonucleases can be used to
generate detailed restriction maps of DNA molecules, such as viral genomes. In addition, individual
DNA fragments, produced by restriction endonuclease digestion, can be isolated following
electrophoresis for further study, including determination of their DNA sequence. The DNA of many
viruses has been characterized by this approach.
Restriction endonuclease digestion alone, however, does not provide sufficient resolution for
the analysis of larger DNA molecules, such as cellular genomes. A restriction endonuclease with a
six-base-pair recognition site (such as the enzyme EcoRI) cleaves DNA with a statistical frequency of
once every 4096 base pairs (4^6). Digestion of the human genome, which is more than three billion base
pairs long, would yield more than 500,000 EcoRI fragments. Because it is impossible to isolate single
restriction fragments from such a large pool of digests, restriction endonuclease digestion alone does
not yield a homogeneous source of DNA suitable for further analysis. Such a source can, however, be
obtained through molecular cloning.
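The arithmetic behind these figures can be reproduced in a few lines. This is a back-of-the-envelope sketch that assumes the four bases occur with equal frequency and independently of one another, and it takes the human genome as a round 3 × 10^9 base pairs; the function names are hypothetical.

```python
def expected_site_spacing(site_length):
    """Average distance (bp) between occurrences of one specific site,
    assuming each of the 4 bases is equally likely at every position."""
    return 4 ** site_length

def expected_fragments(genome_bp, site_length):
    """Approximate number of fragments produced by a complete digest."""
    return genome_bp / expected_site_spacing(site_length)

HUMAN_GENOME_BP = 3_000_000_000  # assumed round figure

print(expected_site_spacing(6))                      # 4096
print(round(expected_fragments(HUMAN_GENOME_BP, 6)))
```

The estimate comes out at roughly 730,000 fragments, consistent with "more than 500,000." Real genomes are not random sequences, so actual fragment counts differ.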
C. MOLECULAR CLONING
As introduced earlier, the basic strategy in molecular cloning is to insert a DNA fragment of
interest (e.g. a segment of human DNA) into a DNA molecule that serves as a vector. Such a vector
must be capable of independent replication within a host cell. The result is a recombinant molecule or
molecular clone, composed of the DNA insert linked to vector DNA sequences. Large quantities of the
inserted DNA can be obtained if the recombinant molecule is allowed to replicate in an appropriate
host. For example, fragments of human DNA can be cloned in plasmid vectors. This is shown in
Figure 37.5. Plasmids are small circular DNA molecules that can replicate independently in bacteria;
in other words, they can do so without being associated with chromosomal DNA. Recombinant
plasmids carrying human DNA inserts can be introduced into E. coli, where they replicate along with
the bacteria to yield millions of copies of plasmid DNA. The DNA of these plasmids can then be
isolated, generating large quantities of recombinant molecules containing a single fragment of human
DNA. The fragment can then be easily isolated from the rest of the vector DNA by restriction
endonuclease digestion and gel electrophoresis, allowing a pure fragment of human DNA to be
analyzed and further manipulated.
The DNA fragments used to create recombinant molecules are usually generated by
digestion with restriction endonucleases. Many of these enzymes cleave their recognition sequences at
staggered sites, leaving overhanging or cohesive single-stranded tails that can associate with each other
by complementary base pairing. The association between such paired complementary ends can be
established permanently by treatment with DNA ligase. Thus, two different fragments of DNA (e.g. a
human DNA insert and a plasmid DNA vector) prepared by digestion with the same restriction
endonuclease can be readily joined to create a recombinant DNA molecule.
The fragments of DNA that can be cloned are not limited to those that terminate in
restriction endonuclease cleavage sites. Synthetic DNA “linkers” containing desired restriction
endonuclease sites can be added to the ends of a DNA fragment, allowing virtually any fragment of
DNA to be ligated to a vector and isolated as a molecular clone.
Figure 37.5: Generation of a recombinant DNA molecule. 1. Small, circular DNA
molecules called plasmids are removed from bacterial cells. These plasmids serve as
vectors carrying genes of interest. This plasmid includes antibiotic resistance genes, a
reporter gene responsible for coloration, LacZ, and within the LacZ gene, a multiple
cloning site (also known as a polylinker) containing various restriction sites. 2. Foreign
DNA containing the gene of interest is extracted from the cell. 3. A restriction enzyme
recognizes its specific restriction site – a short sequence 4-8 base pairs long. 4. The
foreign DNA is cleaved, producing fragments with sticky ends. The restriction enzyme
cuts and opens the circular plasmids. The same enzyme cuts the gene of interest from its
DNA molecule. 5. The sticky ends anneal by forming weak hydrogen bonds. Adding
DNA ligase reattaches the DNA backbones and results in the formation of a recombinant
plasmid. Other plasmids reseal, and are unchanged. 6. Plasmids and bacteria
are mixed. Many of the bacteria do not take any plasmids into their cells, many take
plasmids without the foreign DNA in them, and a few will take up the recombinant
plasmid via transformation. 7. Bacteria carrying plasmids with an uninterrupted LacZ gene turn blue. In the
recombinant plasmids, the inserted gene interrupts the LacZ gene, and the bacteria remain
their original color. Bacteria which do not take up any plasmids also remain uncolored.
Antibiotics are then added and, because the plasmid contains the genes for antibiotic
resistance, only bacteria which have incorporated a plasmid survive the antibiotic. The
surviving bacteria can now be sorted according to color: the uncolored colonies, which took up the
plasmid containing the gene of interest, are isolated and allowed to reproduce.
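The two-step selection described in step 7 (antibiotic survival, then color) amounts to a simple decision rule. The sketch below encodes it under the assumptions stated in the caption; the function names and the boolean inputs are hypothetical simplifications.

```python
def screen_colony(has_plasmid, has_insert):
    """Return (survives_antibiotic, color) for one bacterial clone.

    The resistance gene sits on the plasmid, so only plasmid-bearing cells
    survive. LacZ gives blue color only when it is present and uninterrupted.
    """
    survives = has_plasmid
    lacz_intact = has_plasmid and not has_insert
    color = "blue" if lacz_intact else "uncolored"
    return survives, color

def is_recombinant_colony(has_plasmid, has_insert):
    """A colony worth picking survives the antibiotic and is uncolored."""
    survives, color = screen_colony(has_plasmid, has_insert)
    return survives and color == "uncolored"

for plasmid, insert in ((False, False), (True, False), (True, True)):
    print(plasmid, insert, screen_colony(plasmid, insert))
```

Cells without any plasmid are also uncolored, which is why the antibiotic step is needed before color can identify recombinants.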
I. cDNA
Cloning is not limited to DNA sequences; RNA sequences can be cloned as well. The first
step is to synthesize a DNA copy of the RNA using the enzyme reverse transcriptase. The DNA
product is called cDNA because it is complementary to the template RNA. cDNA can then be ligated
to vector DNA in the manner previously discussed. Since eukaryotic genes are usually interrupted by
noncoding sequences, which are removed from mRNA by splicing, the ability to clone cDNA as well
as genomic DNA has been critical for understanding gene structure and function. Additionally, cDNA
cloning allows the mRNA corresponding to a single gene to be isolated as a molecular clone.
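To make "complementary to the template RNA" concrete, the sketch below derives the first-strand cDNA sequence from a short, made-up mRNA. It models only base pairing; the function name is hypothetical, and priming and second-strand synthesis are not modeled.

```python
# RNA template base -> DNA base incorporated by reverse transcriptase.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def first_strand_cdna(mrna):
    """Return the cDNA strand complementary to `mrna`.

    Both the input mRNA and the returned cDNA are written 5'->3', so the
    cDNA is the reverse complement of the mRNA with T in place of U.
    """
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna.upper()))

print(first_strand_cdna("AUGGCC"))  # GGCCAT
```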
II. Vectors in Recombinant DNA
Depending on the size of the insert DNA and the purpose of the experiment, many different
types of cloning vectors can be used for the generation of recombinant molecules. We will review some
of those here. Other vectors developed for the expression of cloned DNAs and the introduction of
recombinant molecules into eukaryotic cells will be discussed in subsequent sections.
Plasmids are commonly used for cloning genomic or cDNA inserts of up to a few
thousand base pairs. Plasmids usually consist of 2 to 4 kb of DNA, including an origin of replication,
which is the DNA sequence that signals the host cell DNA polymerase to replicate the DNA molecule.
In addition, plasmid vectors carry genes that confer antibiotic resistance, so bacteria carrying the
plasmids can be selected. For example, Figure 37.6 illustrates the isolation of human cDNA clones in a
plasmid vector.
Figure 37.6: The vector is a small circular molecule that contains an origin of replication
(ori), a gene conferring resistance to ampicillin (Ampr), and a restriction site, which can
be used to insert foreign DNA. Insert DNA (in this case, a human cDNA fragment) is
ligated to the vector, and the recombinant plasmids are used to transform bacteria. The
bacteria are plated on medium containing ampicillin, so that only the bacteria that are
ampicillin-resistant because they carry plasmid DNAs are able to form colonies (not
pictured).
A pool of cDNA fragments can be ligated to restriction endonuclease-digested plasmid DNA.
The resulting recombinant DNA molecules are then used to transform E. coli. Antibiotic-resistant
colonies which contain plasmid DNA are selected by exposure to the antibiotic for which the
transformed bacteria possess resistance. Since each recombinant plasmid yields a single
antibiotic-resistant colony, the bacteria present in any given colony will contain a unique cDNA insert.
Plasmid-containing bacteria can then be grown in large quantities and their DNA extracted. The small circular
plasmid DNA molecules, of which there are often hundreds of copies per cell, can be separated from
the bacterial chromosomal DNA; the result is purified plasmid DNA that is suitable for analysis of the
cloned insert.
Bacteriophage λ vectors are also used for the isolation of either genomic or cDNA
clones from eukaryotic cells, and will accommodate larger fragments of insert DNA than plasmids. In λ
cloning vectors, sequences of the bacteriophage genome that are dispensable for virus replication have
been removed and replaced with unique restriction sites for insertion of cloned DNA. These
recombinant molecules can be introduced into E. coli, where they replicate to yield millions of progeny
phages containing a single DNA insert. The DNA of these phages can then be isolated, yielding large
quantities of recombinant molecules containing a single fragment of cloned DNA. The DNA inserts
can be as large as 15 kb and still typically yield a recombinant genome that can be packaged into
bacteriophage λ particles.
For many studies involving analysis of genomic DNA, it is desirable to clone larger
fragments than are accommodated by plasmid or λ vectors. There are five major types of vectors that
are used for this purpose. Cosmid vectors (plasmid vectors that contain cos sites, which allow the
DNA to circularize in the host cytoplasm) accommodate inserts of approximately 45 kb. These
vectors contain bacteriophage λ sequences that allow efficient packaging of the cloned DNA into phage
particles. In addition, cosmids contain origins of replication and the genes for antibiotic resistance
that are characteristic of plasmids, so they are able to replicate as plasmids in bacterial cells. Two other
types of vectors are derived from bacteriophage P1, rather than from bacteriophage λ. Bacteriophage
P1 vectors, which will accommodate DNA fragments of 70-100 kb, contain sequences that allow
recombinant molecules to be packaged in vitro into P1 phage particles and then to be replicated as
plasmids in E. coli. P1 artificial chromosome (PAC) vectors also contain sequences of bacteriophage P1,
but are introduced directly as plasmids into E. coli. They will accommodate larger inserts of up to 150
kb. Bacterial artificial chromosome (BAC) vectors are derived from a naturally occurring plasmid of E.
coli—the F factor. The replication origin and other F factor sequences allow BACs to replicate as
stable plasmids carrying inserts of 120-300 kb. Even larger fragments of DNA (250-400 kb) can be
cloned in yeast artificial chromosome (YAC) vectors. These vectors contain yeast origins of replication
as well as other sequences, including a centromere and telomeres, that allow replication as linear
chromosome-like molecules in yeast cells.
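The insert capacities quoted in this section can be collected into a small lookup for choosing a vector. The sketch below uses only the upper limits stated above, in kb. Plasmids are left out because the text gives no exact figure for them, and the dictionary and function names are hypothetical.

```python
# Largest insert (kb) each vector type is said to accommodate in the text.
MAX_INSERT_KB = {
    "bacteriophage lambda": 15,
    "cosmid": 45,
    "bacteriophage P1": 100,
    "PAC": 150,
    "BAC": 300,
    "YAC": 400,
}

def candidate_vectors(insert_kb):
    """List vector types whose stated capacity covers the insert size."""
    return [name for name, limit in MAX_INSERT_KB.items() if insert_kb <= limit]

print(candidate_vectors(120))  # ['PAC', 'BAC', 'YAC']
```

This compares against upper limits only; in practice a vector is usually chosen so the insert also falls within its typical working range.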
D. DNA SEQUENCING
Molecular cloning allows the isolation of individual fragments of DNA in quantities suitable
for detailed characterization, including the determination of nucleotide sequence. Indeed, determining
the nucleotide sequence of many genes has elucidated not only the structure of their products, but also
the properties of DNA sequences that regulate gene expression. Furthermore, the coding sequences of
novel genes are frequently related to those of previously studied genes, and the functions of newly
isolated genes can often be correctly deduced on the basis of such sequence similarity.
DNA sequencing is usually performed with automated systems that are both rapid
and accurate, so determining the sequence of several kilobases of DNA is a straightforward task. Thus,
it is easier to clone and sequence DNA than it is to determine the amino acid sequence of a protein.
Since the nucleotide sequence of a gene can be readily translated into the amino acid sequence of its
encoded protein, the easiest way of determining protein sequence is the sequencing of a cloned gene or
cDNA.
I. Sanger Chain-Termination Method
The most common method of DNA sequencing is based on premature termination of DNA
synthesis resulting from the inclusion of chain-terminating dideoxynucleotides, which do not contain
the 3’ hydroxyl group, in DNA polymerase reactions. DNA synthesis is initiated at a unique site on the
cloned DNA from a synthetic primer. The DNA synthesis reaction includes each of four
dideoxynucleotides (A, C, G, and T) in addition to their normal counterparts. Each of the four
dideoxynucleotides is labeled with a different fluorescent dye, so their incorporation into DNA can be
monitored. Incorporation of these dideoxynucleotides stops further DNA synthesis because no 3’
hydroxyl group is available as a site for the addition of the next nucleotide. Thus, a series of labeled
DNA molecules is generated, each terminating at the base represented by a specific dideoxynucleotide.
Those fragments of DNA are then separated according to size by gel electrophoresis. As the newly
synthesized DNA strands are electrophoresed through the gel, they pass through a laser that excites
the fluorescent labels. The resulting emitted light is then detected by a photomultiplier, and a
computer collects and analyzes the resultant data. The size of each fragment is determined by its
terminal dideoxynucleotide, marked by a specific color fluorescence, so the DNA sequence can be
read from the order of fluorescent-labeled fragments as they migrate through the gel. High-throughput
automated DNA sequencing of this type has enabled large-scale analysis required for determination of
the sequences of completed genomes, including that of humans. This process is summarized in Figure
37.7.
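The logic of reading a sequence from chain-terminated fragments can be illustrated with a toy simulation. In the sketch below, every possible terminated product is represented by its length and the identity of its terminal (labeled) base; sorting by length and reading the labels recovers the sequence. The function names and the example template are hypothetical, and the primer, dye chemistry, and signal detection are not modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def chain_terminated_fragments(template_3to5):
    """Return all terminated products as (length, terminal_base) pairs.

    `template_3to5` is the template strand written 3'->5', so the newly
    synthesized strand (5'->3') is its base-by-base complement.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    return [(i + 1, base) for i, base in enumerate(new_strand)]

def read_sequence(fragments):
    """Order fragments shortest first, as they pass the detector,
    and read off the terminal-base labels."""
    return "".join(base for _, base in sorted(fragments))

products = chain_terminated_fragments("TACGGTCA")
# The reaction tube holds the products in no particular order.
scrambled = list(reversed(products))
print(read_sequence(scrambled))  # ATGCCAGT
```

The key point is that fragment length encodes position and the dye color encodes base identity, so the two together give the sequence.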
Figure 37.7: Dideoxynucleotides, which lack OH groups at the 3’ as well as the 2’
position of deoxyribose, are used to terminate DNA synthesis at specific bases. These
molecules are incorporated normally into growing DNA strands. Because they lack a 3’
OH, however, the next nucleotide cannot be added, so synthesis of that DNA strand
terminates. DNA synthesis is initiated at a specific site with a primer. The reaction
contains the four dideoxynucleotides. When a dideoxynucleotide is incorporated,
DNA synthesis stops, so the reaction yields a series of products extending from the
primer to the base substituted by a fluorescent dideoxynucleotide. These products are
then separated by gel electrophoresis. As the DNA strands migrate through the gel, they
pass through a laser beam that excites the fluorescent labels on the dideoxynucleotides.
The emitted light is detected by a photomultiplier, which is connected to a computer that
collects and analyzes the data to determine the sequence of DNA.
E. EXPRESSING CLONED GENES
In addition to enabling determination of the nucleotide sequences of genes – and hence the
amino acid sequences of their protein products – molecular cloning has provided new approaches to
obtaining large amounts of proteins for structural and functional characterization. Many proteins of
interest are present at only low levels in eukaryotic cells and therefore cannot be purified in significant
amounts by conventional biochemical techniques. Given a cloned gene, however, this problem can be
rectified by the engineering of vectors that lead to high levels of gene expression in either bacteria or
eukaryotic cells.
To express a eukaryotic gene in E. coli, the cDNA of interest is cloned into a plasmid
or phage vector (called an expression vector) that contains sequences that drive the expression of the
inserted gene in bacterial cells. Inserted genes often can be expressed at levels high enough that the
protein encoded by the cloned gene corresponds to as much as 10% of the total bacterial protein
complement. Purifying the protein encoded by the cloned gene in quantities suitable for detailed
biochemical or structural studies is a straightforward matter.
It is frequently useful to express high levels of a cloned gene in eukaryotic cells rather
than in bacteria. This mode of expression may be important, for example, to ensure that
posttranslational modifications of the protein, such as additions of carbohydrates or lipids, occur
normally. This protein expression in eukaryotic cells can be achieved, as in E. coli, by insertion of the
cloned gene into a (usually virally-derived) vector that directs high-level gene expression. One system
frequently used for protein expression in eukaryotic cells is infection of insect cells with baculovirus
vectors, in which genes inserted in place of a viral structural protein gene are expressed at exceedingly
high levels. Alternatively, high levels of protein expression can be achieved using appropriate vectors in
mammalian cells.
Expression of cloned genes in yeast is particularly useful because simple methods of
yeast genetics can be employed to identify proteins that interact with one another. In this type of
analysis, known as the yeast two-hybrid system, two different cDNAs (for example, from human cells)
are joined to two distinct domains of a protein that stimulates expression of a target gene in yeast.
Figure 37.8 illustrates a yeast two-hybrid system.
Figure 37.8: A yeast two-hybrid system. cDNAs of two human proteins are cloned and
fused with two distinct domains of a yeast protein that stimulates transcription of a target
gene. The two recombinant cDNAs are introduced into a yeast cell. Domain 1 binds DNA
sequences at a site upstream of the target gene, and domain 2 stimulates target gene
transcription. The interaction between the two human proteins can thus be detected by
expression of the target gene in transformed yeast.
Yeast are then transformed with hybrid cDNA clones to test for interactions between the two
proteins. If the human proteins interact with each other, they will bring the two domains of the yeast
protein together, resulting in stimulation of target expression in the transformed yeast. Expression of
the target gene can be easily detected by the growth of yeast in a specific medium or by the production
of an enzyme that produces a blue yeast colony, so the yeast two-hybrid system provides a
straightforward method to evaluate protein-protein interactions. Indeed, high-throughput yeast
two-hybrid screens have been used to construct large-scale interaction maps of thousands of proteins in
eukaryotic cells.
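The readout of a two-hybrid screen reduces to a simple rule: the target gene is expressed only when the two fused proteins interact and so bring the two domains together. The sketch below models a screen of one protein against a small library. The protein names, the set of interacting pairs, and the function names are all invented for illustration.

```python
def reporter_expressed(bait, prey, interactions):
    """True if the two fused proteins interact, reuniting the DNA-binding
    and activation domains and switching on the target gene."""
    return frozenset((bait, prey)) in interactions

def screen(bait, library, interactions):
    """Return the library proteins that give target gene expression with `bait`."""
    return [prey for prey in library if reporter_expressed(bait, prey, interactions)]

# Hypothetical interaction data: P1 binds P2 and P4, nothing else.
known_interactions = {frozenset(("P1", "P2")), frozenset(("P1", "P4"))}

print(screen("P1", ["P2", "P3", "P4"], known_interactions))  # ['P2', 'P4']
```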
F. DETECTION OF NUCLEIC ACIDS
The advent of molecular cloning has enabled the isolation and characterization of individual
genes from eukaryotic cells. Understanding the roles of genes within cells, however, requires analysis of
intracellular organization and expression of individual genes and their encoded proteins. In this
section, the basic procedures used for detection of specific nucleic acids will be discussed. These
approaches are important for a wide variety of studies, including the mapping of genes to
chromosomes and the analysis of gene expression.
I. Amplification of DNA by the Polymerase Chain Reaction
Molecular cloning allows individual DNA fragments to be propagated in bacteria and isolated
in large amounts. An alternative method for isolating large amounts of a single DNA molecule is the
polymerase chain reaction (PCR). Provided that some sequence of the DNA molecule of interest is
known, PCR can achieve a striking amplification of DNA content via reactions carried out entirely in
vitro. Essentially, DNA polymerase is used for repeated replications of a defined segment of DNA. The
number of DNA molecules increases exponentially, doubling with each round of replication, so a
substantial quantity of DNA can be obtained from a relatively small initial sample of template copies.
For example, a single DNA molecule amplified through 30 cycles of replication would theoretically
yield 2^30, or more than a billion, progeny molecules. Single DNA molecules can thus be amplified to
yield readily detectable quantities of DNA that can be isolated by molecular cloning or further
analyzed directly by restriction endonuclease digestion or nucleotide sequencing. The general
procedure for PCR amplification is shown in Figure 37.9.
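The doubling arithmetic above can be checked with a short Python sketch. The function name and the `efficiency` parameter are illustrative assumptions (the text describes only ideal doubling, which corresponds to an efficiency of 1.0); real reactions typically fall somewhat short of this.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate the number of DNA molecules after a given number of PCR cycles.

    efficiency = 1.0 models perfect doubling each cycle (the idealized case
    described in the text); lower values are a hypothetical way to model
    cycles in which not every template is copied.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting molecule, 30 ideal cycles: 2**30 copies, just over a billion.
    print(pcr_copies(1, 30))
    # The same reaction at an assumed 90% per-cycle efficiency yields far fewer.
    print(pcr_copies(1, 30, efficiency=0.9))
```

Because growth is exponential, even a modest drop in per-cycle efficiency reduces the final yield severalfold over 30 cycles.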
Figure 37.9: The target region of DNA to be amplified is flanked by two sequences used to
prime DNA synthesis. In the first step of each cycle, the starting dsDNA is heated to separate its
strands and then cooled to allow the primers, usually oligonucleotides 15-20 bases long, to bind to
each strand of ssDNA. Taq polymerase is used to synthesize new DNA strands from the
primers, resulting in the formation of two new DNA molecules. The process can be
repeated for multiple cycles, each resulting in a twofold amplification of DNA. 1.
Denaturation at 94-96°C. 2. Annealing at approximately 68°C. 3. Elongation at
approximately 72°C.
The starting material in PCR amplification of DNA can be either a cloned fragment
of DNA or a mixture of DNA molecules – for example, total DNA from human cells. A specific region
of DNA can be amplified from such a mixture, provided that the nucleotide sequence surrounding the
region is known so that primers can be designed to initiate DNA synthesis at the desired point. Such
primers are usually chemically synthesized oligonucleotides containing 15-20 bases of DNA. Two
primers are used to initiate DNA synthesis in opposite directions from complementary DNA strands.
The reaction is started by heating the template DNA to a high temperature, typically 95°C, to
separate the two strands. The temperature is then lowered to allow the primers to pair with their
complementary sequences on the template strands. DNA polymerase then uses the primers to
synthesize a new strand complementary to each template. Thus, through a single cycle of
amplification, two new DNA molecules are synthesized from one template molecule. This process can
be repeated multiple times, with a twofold increase in the number of DNA molecules following each
round of replication.
The multiple cycles of heating and cooling involved in PCR are performed by
programmable heating blocks called thermocyclers. The DNA polymerases used in these reactions are
heat-stable enzymes from bacteria such as Thermus aquaticus, which reside in hot springs where
temperatures can exceed 75°C. (DNA polymerase derived from Thermus aquaticus is called Taq
polymerase.) Because these polymerases remain stable even at the high temperatures used to
separate the strands of double-stranded DNA, PCR amplification can be performed
rapidly and automatically. RNA sequences can also be amplified by this method if reverse
transcriptase is used to synthesize a cDNA copy prior to PCR amplification.
If the sequence of a target gene is known sufficiently well, primers specific to it can be
designed. Given this, PCR amplification provides a powerful tool for detecting small amounts of
specific DNA or RNA molecules in a complex mixture of other molecules. In such a situation, the only
DNA molecules that will be amplified by PCR are those containing sequences complementary to the
primers used in the reaction. Therefore, PCR can selectively amplify a specific template from
heterogeneous mixtures, such as total cell DNA or RNA. This extraordinary sensitivity has made PCR
an important method for a variety of applications, including analysis of gene expression in cells where
target DNA is available in only small quantities. The DNA segments amplified by PCR can also be
directly sequenced or ligated to vectors and propagated as molecular clones. PCR thus allows the
amplification and cloning of any segment of DNA for which primers can be designed. Since the
complete genome sequences of many organisms are now known, PCR can be used to amplify and
clone a wide array of desired DNA fragments.
II. Nucleic Acid Hybridization
Another tool in the repertoire of molecular biologists takes advantage of the specific base
pairing between complementary strands of DNA or RNA. At high temperatures (90-100°C), the
complementary strands of DNA denature, yielding single-stranded DNA (ssDNA). If such ssDNA
molecules are then incubated under appropriate conditions, at temperatures close to 65°C, they will
re-nature and reform dsDNA as dictated by the pattern of complementary base pairing; this process is
called nucleic acid hybridization. Nucleic acid hybrids can be formed between two strands of DNA,
two strands of RNA, or one strand of DNA and one of RNA.
As discussed above, hybridization between the primers and the template DNA provides the
specificity of PCR amplification. In addition, a variety of other methods use nucleic acid hybridization
as a means for detecting DNA or RNA sequences that are complementary to any isolated nucleic acid,
such as a cloned DNA sequence. The cloned DNA is labeled with either radioactive nucleotides or
with modified nucleotides that can be detected by fluorescence or chemiluminescence. This labeled
DNA is then used as a probe for hybridization to complementary DNA or RNA sequences, which are
detected by virtue of the radioactivity, fluorescence, or luminescence of the resulting double-stranded
hybrids.
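The logic of probe hybridization can be sketched in a few lines of Python. This is a simplified illustration that assumes perfect complementarity (real hybridization tolerates some mismatches, depending on conditions); the sequences and function names are hypothetical.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridization_sites(target: str, probe: str) -> List[int]:
    """Return every position on a single-stranded target where the probe can pair.

    A probe pairs with the stretch of the target that is its reverse
    complement, so that is the sequence searched for.
    """
    site = reverse_complement(probe)
    positions = []
    i = target.find(site)
    while i != -1:
        positions.append(i)
        i = target.find(site, i + 1)
    return positions


# Hypothetical single-stranded target and labeled probe.
print(hybridization_sites("AATAGGCTTTTAGGCA", "GCCTA"))  # → [2, 10]
```

A probe with no complementary stretch in the target returns an empty list, which corresponds to no detectable signal.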
III. Southern Blotting
Southern blotting is widely used for the detection of specific genes in cellular DNA. The DNA
to be analyzed is digested with a restriction endonuclease, and the digested DNA fragments are
separated by gel electrophoresis. The gel is then overlaid with nitrocellulose filter paper or a nylon
membrane to which the DNA fragments are blotted (transferred) to yield a replica of the gel. The filter
is then incubated with a labeled probe, which hybridizes to the DNA fragments that contain the
complementary sequence, allowing visualization of these specific fragments of DNA. The steps of the
Southern blotting procedure are shown in Figure 37.10.
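The Southern blotting steps described above (digest, separate by size, probe) can be mimicked with a toy Python simulation. It assumes the restriction enzyme EcoRI, which cuts between G and A in the site GAATTC, and exact probe matching; the DNA sequence, probe, and function names are made up for illustration.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Cut the top strand at every recognition site (defaults model EcoRI)."""
    fragments, start = [], 0
    i = dna.find(site)
    while i != -1:
        fragments.append(dna[start:i + cut_offset])
        start = i + cut_offset
        i = dna.find(site, i + 1)
    fragments.append(dna[start:])
    return fragments


def southern_blot(dna: str, probe: str) -> List[int]:
    """Return the sizes of the restriction fragments that the probe detects.

    Both strands of each denatured fragment are on the filter, so a fragment
    is detected if it contains the probe sequence or its reverse complement.
    """
    fragments = sorted(digest(dna), key=len, reverse=True)  # order as on a gel
    rc = reverse_complement(probe)
    return [len(f) for f in fragments if probe in f or rc in f]


# Hypothetical DNA with two EcoRI sites and a probe for the middle fragment.
dna = "AAAA" + "GAATTC" + "CCGGATATCC" + "GAATTC" + "TT"
print([len(f) for f in digest(dna)])   # → [5, 16, 7]
print(southern_blot(dna, "GGATATC"))   # → [16]
```

Only the fragment carrying the probed sequence produces a band, even though all three fragments are present on the filter.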
Figure 37.10: Southern blotting. 1. DNA is digested by restriction endonuclease treatment
and the resultant restriction fragments of different sizes are separated by gel
electrophoresis. 2. The DNA is denatured and transferred to a filter by passage of a salt
solution through the gel. 3. The filter is hybridized with a labeled probe, which binds to
complementary DNA sequences in buffer solution. 4. The probe bound to the filter is
detected by exposure to film, which reveals the DNA fragments to which the probe
hybridized.
The capillary blotting system shown in step 2 of Figure 37.10 is shown in more detail
in Figure 37.11.
Figure 37.11: Capillary blotting system used for the transfer of DNA from an
electrophoresis gel to a blotting membrane.
IV. Northern Blotting
As you may have guessed from its name, Northern blotting is a variation of the Southern
blotting technique. It is used for the detection of RNA, rather than DNA. In this method, total cellular
RNAs are extracted and fractionated according to size by gel electrophoresis. As in Southern blotting,
the RNAs are transferred to a filter and detected by hybridization with a cloned probe. Northern
blotting is frequently used in studies of gene expression – for example, to determine whether specific
mRNAs are present in different types of cells. The general procedure for Northern blotting is shown in
Figure 37.12.
Figure 37.12: Northern blotting. 1. RNA is extracted from sample. 2. RNA is fractionated
by size via gel electrophoresis. 3. RNA is transferred to filter. 4. RNA is fixed to
membrane when exposed to UV radiation or heat. 5. Labeled probes are added. 6.
RNA bound by the labeled probe is visualized on x-ray film.
Nucleic acid hybridization can also be used to identify molecular clones that contain specific
cellular DNA inserts. The first step in isolation of either genomic or cDNA clones is frequently the
preparation of recombinant DNA libraries, collections of clones that contain all the genomic or
mRNA sequences of a particular cell type. For example, a genomic library of human DNA might be
prepared by cloning random DNA fragments of about 15 kb in a λ vector. Since the human genome is
approximately 3 x 10^6 kb, the complete human genome would be represented in a collection of
approximately 500,000 such clones. Any gene for which a probe is available can then be isolated from
such a recombinant library.
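One way to see why roughly 500,000 clones are quoted, rather than the bare minimum of 3 x 10^6 kb / 15 kb = 200,000, is that random fragments overlap, so extra clones are needed for any given sequence to be likely to appear. The sketch below uses the standard Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome in one insert and P is the desired probability of including a given sequence. This formula is not stated in the text and is applied here as an assumption; the function name is hypothetical.

```python
import math


def clones_needed(genome_kb: float, insert_kb: float, probability: float) -> int:
    """Estimate library size with the Clarke-Carbon formula.

    N = ln(1 - P) / ln(1 - f), with f = insert size / genome size and
    P = probability that any given sequence is present in the library.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    f = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


if __name__ == "__main__":
    # Human genome (3 x 10^6 kb) cloned as 15 kb inserts in a lambda vector.
    for p in (0.90, 0.92, 0.95, 0.99):
        print(p, clones_needed(3e6, 15, p))
```

Under these assumptions, a library of about half a million clones corresponds to a probability a little above 90% that any particular sequence is represented.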
The recombinant phages are plated on E. coli, and each phage replicates to produce a plaque
on the lawn of bacteria. The plaques are then blotted onto a filter in a process similar to the transfer of
DNA from a gel to a filter during Southern blotting, and the filters are hybridized with a labeled probe
to identify the phage plaques that contain the gene of interest. A variety of probes can be used for such
experiments. For example, a cDNA clone can be used as a probe to isolate the corresponding genomic
clone, or a gene cloned from one species (e.g., mouse) can be used to isolate a related gene from a
different species (e.g., human). The appropriate plaque can then be isolated from the original plate in
order to propagate the recombinant phage that carries the desired DNA insert. Similar procedures can
be used to screen bacterial colonies carrying plasmid DNA clones, so specific clones can be isolated by
hybridization from either phage or plasmid libraries. Figure 37.13 shows a protocol for screening a
recombinant library by hybridization.
Figure 37.13: Screening of recombinant DNA libraries by the colony hybridization
procedure. 1. Fragments of cell DNA are cloned in a bacteriophage λ vector and
packaged into phage particles, yielding recombinant phage carrying different cell inserts.
2. The phage are used to infect bacteria, forming plaques. 3. The culture is overlaid with
filter paper; some of the phages in each plaque are transferred to the filter. 4. The phage
DNA is then hybridized with a labeled probe to identify the phage plaque containing the
desired gene. The appropriate phage plaque can then be isolated from the original culture
plate.
V. DNA Microarrays
Rather than analyzing one gene at a time, as in Southern or Northern blotting, hybridization
to DNA microarrays allows tens of thousands of genes to be analyzed simultaneously. As the complete
sequences of eukaryotic genomes have become available, hybridization of DNA microarrays has
enabled researchers to undertake global analyses of sequences present in either cellular DNA or RNA
samples. A DNA microarray consists of a glass slide or membrane filter on which oligonucleotides or
fragments of cDNA are printed by a robotic system in small spots at high density. Each spot on an
array consists of a single oligonucleotide or cDNA. More than 10,000 unique DNA sequences can be
printed onto a typical glass microscope slide, so it is readily possible to produce DNA microarrays
containing sequences representing all of the genes in cellular genomes. One widespread application of
DNA microarrays is in the study of gene expression; for example, they can be used to compare the genes
expressed by two different cell types. In an experiment of this type, cDNA probes are synthesized from
the mRNAs expressed in each of the two cell types (e.g. cancer cells and normal cells). The two cDNAs
are labeled with different fluorescent dyes (typically red and green) and a mixture of the cDNAs is
hybridized to a DNA microarray in which 10,000 or more human genes are represented as single
spots. The array is then analyzed using a high-resolution laser scanner, and the relative extent of
transcription of each gene in the cancer cells compared to the normal cells is indicated by the ratio of
red to green fluorescence at a given position on the array. This procedure is shown in Figure
37.13.
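The red-to-green comparison is commonly summarized as a log2 ratio for each spot. The Python sketch below illustrates that calculation; the intensity values, the pseudocount, and the two-fold threshold are all hypothetical choices made for the example and are not taken from the text.

```python
import math


def log2_ratio(red: float, green: float, pseudocount: float = 1.0) -> float:
    """Log2 of red (cancer) over green (normal) fluorescence for one spot.

    The pseudocount is an assumed safeguard against division by zero
    when a spot has no signal in one channel.
    """
    return math.log2((red + pseudocount) / (green + pseudocount))


def classify(red: float, green: float, threshold: float = 1.0) -> str:
    """Label a spot using an assumed two-fold cutoff (|log2 ratio| >= 1)."""
    ratio = log2_ratio(red, green)
    if ratio >= threshold:
        return "higher in cancer"
    if ratio <= -threshold:
        return "higher in normal"
    return "similar"


# Hypothetical (red, green) intensities for three spots on the array.
spots = {"gene_A": (4000.0, 500.0), "gene_B": (500.0, 4000.0), "gene_C": (1000.0, 1000.0)}
for gene, (red, green) in spots.items():
    print(gene, round(log2_ratio(red, green), 2), classify(red, green))
```

Working on a log scale makes a two-fold increase and a two-fold decrease symmetric (+1 and -1), which is why ratios are usually reported this way.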
Figure 37.13: DNA microarrays. An example of comparative analysis of gene expression
in cancerous cells and normal cells is shown. mRNAs extracted from cancer cells and
normal cells are used as templates for synthesis of cDNA probes labeled with different
fluorescent dyes. Here, cDNA derived from cancer cells carries a red fluorescent label
and cDNA from normal cells carries a green label. The two cDNA probes are mixed and
hybridized to a DNA microarray containing spots of oligonucleotides corresponding
collectively to 10,000 or more distinct human genes. The relative expression of each gene
in cancer cells compared to normal cells is indicated by the ratio of red to green
fluorescence at each position on the microarray.
VI. In situ Hybridization
Nucleic acid hybridization can be used to detect homologous DNA or RNA sequences not
only in cell extracts, but also in chromosomes or intact cells; this procedure is called in situ hybridization.
In this case, the hybridization of fluorescent probes to specific cells or subcellular structures is analyzed
by microscopic examination. For example, labeled probes can be hybridized to intact chromosomes in
order to identify the chromosomal regions that contain a gene of interest. In situ hybridization can also
be used to detect specific mRNAs in different types of cells within a tissue.
G. UNDERSTANDING GENE FUNCTION IN EUKARYOTES
The recombinant DNA techniques discussed in the preceding sections provide powerful
approaches to the isolation and detailed characterization of the genes of eukaryotic cells.
Understanding the function of those genes, however, requires analysis of the gene within cells or intact
organisms, not simply as a molecular clone in bacteria. In classical genetics, gene function has
generally been revealed by the altered phenotypes of mutant organisms. The advent of recombinant
DNA has added a new dimension to studies of this function. Namely, it has become possible to
investigate the function of a cloned gene directly by reintroducing the cloned DNA into eukaryotic
cells. In simpler eukaryotes, such as yeasts, this technique has made possible the isolation of molecular
clones corresponding to virtually any mutant gene. In addition, there are several methods by which
cloned genes can be introduced into cultured animal and plant cells, as well as intact organisms, for
functional analysis. These approaches can be coupled with the ability to introduce mutations in cloned
DNA in vitro, extending the power of recombinant DNA to allow functional studies of the genes of
more complex eukaryotes.
I. Genetic Analysis in Yeasts
Yeasts are particularly advantageous for studies of eukaryotic molecular biology. The genome
of Saccharomyces cerevisiae, which consists of approximately 1.2 x 10^7 base pairs, is roughly 250 times
smaller than the human genome. Moreover, yeasts can easily be grown in culture, reproducing with a
division time of about 2 hours. Thus yeasts offer the same basic advantages – a small genome and
rapid reproduction – that are afforded by bacteria.
Furthermore, mutations in yeast can be identified as readily as in E. coli. For example,
yeast mutants that require a particular amino acid or other nutrient for growth can easily be isolated.
In addition, yeasts with defects in genes required for fundamental cell processes (in contrast to
metabolic defects) can be isolated as temperature-sensitive mutants. Such mutants encode proteins that
are functional at one temperature (the permissive temperature) but not another (the non-permissive
temperature); in contrast, normal proteins are functional at both. A yeast with a temperature-sensitive
mutation in an essential gene can be identified by its ability to grow only at the permissive
temperature. The ability to isolate such temperature-sensitive mutants has allowed the identification of
yeast genes controlling many fundamental cell processes, such as RNA synthesis and processing,
progression through the cell cycle, and transport of proteins between cellular compartments.
The relatively simple genetics of yeast also enables a gene corresponding to any yeast
mutation to be cloned on the basis of its functional activity. First, a genomic library of normal yeast
DNA is prepared in vectors that replicate as plasmids in yeasts as well as in E. coli. The small size of the
yeast genome means that a complete library consists of only a few thousand plasmids. A mixture of
such plasmids is then used to transform a temperature-sensitive yeast mutant, and transformants that
are able to grow at the non-permissive temperature are selected. Such transformants have acquired a
normal copy of the gene of interest on plasmid DNA, which can then be easily isolated from the
transformed yeast cells for further characterization.
Yeast genes encoding a wide variety of essential proteins have been identified in this
manner. In many cases, such genes isolated from yeasts have also been useful in identifying and
cloning related genes from mammalian cells. Thus, the simple genetics of yeast has not only provided
an important model for eukaryotic cells, but has also led directly to the cloning of related genes from
more complex eukaryotes.
II. Gene Transfer in Plants and Animals
Although the cells of complex eukaryotes are not amenable to the simple genetic
manipulation possible in yeasts, gene function can still be assayed by the introduction of cloned DNA
into plant and animal cells. Such experiments, generally called gene transfer, have proven critical to
addressing a wide variety of questions, including studies of the mechanisms that regulate gene
expression and subsequent protein processing. In addition, gene transfer has enabled the identification
and characterization of genes that control animal cell growth and differentiation, including a variety of
genes responsible for abnormal growth of human cancer cells.
The method for introducing DNA into animal cells was initially developed for infectious viral
DNA and is therefore called transfection (a portmanteau derived from the words transformation and
infection). DNA can be introduced into animal cells in culture by one of a number of methods,
including direct microinjection into the cell nucleus, coprecipitation of DNA with calcium phosphate
to form small particles that are taken up by the cells, incorporation of DNA into lipid vesicles called
liposomes that fuse with the plasma membrane, and exposure of cells to brief electrical pulses that
transiently open pores in the cellular plasma membrane in a process known as electroporation. The
DNA taken up by most cells is transported to the nucleus, where it can be transcribed for several days;
this is a phenomenon known as transient expression. In a smaller fraction of cells (usually less than
1%), the foreign DNA becomes stably integrated into the cell genome and is transferred to progeny
cells at cell division, just as with any other cellular gene. These stably transformed cells can be isolated
if the transfected DNA contains a selectable marker, such as resistance to a drug that inhibits the
growth of normal cells. Thus, any cloned gene can be introduced into mammalian cells by being
transferred along with a drug resistance marker that can be used to isolate stable transformants. The
effects of such cloned genes on cell behavior – for example, cell growth and differentiation – can then
be analyzed.
Animal viruses can also be used as vectors for more efficient introduction of cloned DNA into
cells. Retroviruses are particularly useful in this respect, since their life cycles involve the stable
integration of DNA into the genome of infected cells. Consequently, retroviral vectors can be used to
efficiently introduce cloned genes into a wide variety of cell types, making them an important vehicle
for a broad array of applications.
Cloned genes can also be introduced into the germ line of multicellular organisms, allowing
them to be studied in the context of the intact animal rather than in cultured cells. This is shown in
Figure 37.14. One method used to produce mice that carry such transgenic, or foreign, genes is the
direct microinjection of cloned DNA into the pronucleus of a fertilized egg. The injected eggs are then
transferred to foster mothers and allowed to develop to term. In a fraction of the progeny (usually less
than 10%), the foreign DNA will have integrated into the genome of the fertilized egg and is therefore
present in all cells of the animal. Since the foreign DNA exists in both the germ and somatic cells of
the animal, it is transferred by breeding to new progeny.
III. Embryonic Stem Cells
Embryonic stem (ES) cells provide an alternative means of introducing cloned genes into
mice. ES cells can be established in culture from early mouse embryos. They can also be reintroduced
into early embryos, where they participate normally in development and can give rise to cells in all
tissues of the mouse, including germ cells. It is thus possible to introduce cloned DNA into ES cells in
culture, select stably transformed cells, and then introduce those cells back into mouse embryos. Such
embryos give rise to chimeric offspring in which some cells are derived from the normal embryo cells
and some from the transfected ES cells. In some such mice, the transfected ES cells are incorporated
into the germ line. Breeding these mice therefore leads to the direct inheritance of the transfected gene
by their progeny. This is shown, along with the pronucleus method of producing transgenic animals,
in Figure 37.14.
Figure 37.14: Introduction of genes to produce transgenic mice. Microinjection method
(bottom): DNA is microinjected into one of the two pronuclei of a fertilized mouse egg
(fertilized eggs contain two pronuclei, one from the egg and one from the sperm). The
microinjected eggs are then transferred to foster mothers and allowed to develop. Some
of the offspring are transgenic, meaning that they have incorporated the DNA into their
genome. Embryonic stem cell method (top): ES cells are cultured cells derived from early
mouse embryos (blastocysts). DNA can be introduced into these cells in culture, and
stably transformed ES cells can be isolated. These transformed ES cells can then be
injected into a recipient blastocyst, where they are able to participate in normal
development of the embryo. Some of the progeny mice that develop after transfer of
injected embryos to foster mothers therefore contain cells derived from transformed ES
cells, as well as from the normal cells of the blastocyst. Since these mice are a mixture of
two different cell types, they are referred to as chimeric. Offspring carrying the
transfected gene can be produced by the breeding of chimeric mice in which descendants
of the transformed ES cells have been incorporated into the germ line.
III. Gene Transfer in Plants
Cloned DNA can also be introduced into plant cells. One approach is to bombard plant cells
with DNA-coated microprojectiles, such as small particles of tungsten. The DNA-coated particles are
projected directly into the plant cells; some of the cells are killed by the effects of the impact, but others
survive and become stably transformed. In addition, the tumor inducing (Ti) plasmid, taken from the
bacterium Agrobacterium tumefaciens, provides a novel vehicle for the introduction of cloned DNA into
many species of plants. In nature, Agrobacterium attaches to the leaves of plants, and the Ti plasmid is
transferred into plant cells, where it becomes incorporated into the chromosomal DNA of the host. Since many
plants can be regenerated from single cultured cells, transgenic plants can be established directly from
the cells into which recombinant DNA has been introduced in culture. This procedure is much
simpler than the production of transgenic animals! Indeed, transgenic varieties of many economically
important plants, including tomatoes, soybeans, corn, and potatoes, have been developed.
IV. Mutagenesis of Cloned DNAs
In classical genetic studies, as with those conducted using bacteria or yeasts, mutants are the
key to identifying genes and understanding their function. In such studies, mutant genes are detected
because they result in observable phenotypic changes – for example, temperature-sensitive growth or a
specific nutritional requirement. The isolation of genes by recombinant DNA, however, has opened a
different approach to mutagenesis. It is now possible to introduce any desired alteration into a cloned
gene and to determine the effect of the mutation on gene function. Such procedures have been called
reverse genetics, since a mutation is introduced into a gene first and its functional consequence is
determined later. Introducing mutations into cloned DNA is called in vitro mutagenesis.
Cloned genes can be altered by many in vitro mutagenesis procedures, which can lead
to the introduction of deletions, insertions or single nucleotide alterations. The most common method
of mutagenesis is the use of synthetic oligonucleotides to generate nucleotide changes in a DNA
sequence. In this procedure, a synthetic oligonucleotide bearing the desired mutation is used as a primer for
DNA synthesis. Newly synthesized DNA molecules containing the mutation can then be isolated and
characterized. For example, specific amino acids of a protein can be altered in order to characterize
their role in protein function.
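As an illustration of oligonucleotide-directed mutagenesis, the Python sketch below replaces one codon in a short, made-up coding sequence, builds a mutagenic primer from the mutated codon and its flanking bases, and translates both versions with the standard genetic code. The gene, the flank length, and the function names are hypothetical; the primer is written in the sense-strand orientation, so in practice it would anneal to the complementary strand.

```python
from typing import Tuple

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mutagenize(gene: str, codon_index: int, new_codon: str, flank: int = 6) -> Tuple[str, str]:
    """Return (mutant gene, mutagenic primer) for a single codon substitution.

    The primer is the new codon plus `flank` unchanged bases on each side,
    which is what lets it anneal to the template despite the mismatch.
    """
    pos = 3 * codon_index
    mutant = gene[:pos] + new_codon + gene[pos + 3:]
    primer = mutant[max(0, pos - flank):pos + 3 + flank]
    return mutant, primer


gene = "ATGGCTGAAAAACTGTAA"                 # hypothetical gene: M A E K L stop
mutant, primer = mutagenize(gene, codon_index=2, new_codon="GCA")  # Glu to Ala
print(translate(gene), translate(mutant), primer)
```

Comparing the two translations shows the single amino acid change whose functional effect would then be tested by expressing the mutant gene.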
Variations of this approach, combined with the versatility of other methods for
manipulating recombinant DNA molecules, can be used to introduce virtually any desired alteration
in a cloned gene. The effects of such mutations on gene expression and function can then be
determined by introduction of the gene into an appropriate cell type. In vitro mutagenesis has thus
allowed detailed characterization of the functional roles of both the regulatory and protein-coding
sequences of cloned genes.
V. Introducing Mutations into Cellular Genes
Although the transfer of cloned genes into cells, particularly in combination with in vitro
mutagenesis, provides a powerful approach to studying gene structure and function, such experiments
fall short of defining the role of an unknown gene in a cell or intact organism. The cells used as
recipients in the transfer of cloned genes usually have normal copies of the gene in their chromosomal
DNAs already; after transfer, these normal copies continue to perform their roles in the cell.
Determining the biological role of a cloned gene therefore requires that the activity of the normal
cellular gene copies be eliminated. Several approaches can be used to either inactivate the
chromosomal copies of a cloned gene or inhibit normal gene function, both in cultured cells and in
transgenic mice.
Mutating chromosomal genes is based on the ability of a cloned gene introduced into
a cell to undergo homologous recombination with its chromosomal copy. In homologous
recombination, the cloned gene replaces the normal allele, so mutations introduced into the cloned
gene in vitro become incorporated into the chromosomal copy of the gene. In the simplest case,
mutations that inactivate the cloned gene can be introduced in place of the normal gene copy in order
to determine that gene’s role in cellular processes.
Recombination between transferred DNA and the homologous chromosomal gene
occurs frequently in yeast, but is a rare event in mammalian cells. Thus, inactivation of genes in mammalian
cells by this approach is technically difficult. Possibly because the genomes of mammalian cells are so
much larger than those of yeasts, most transfected DNA that integrates into the recipient cell genome
does so at random sites by recombination with unrelated sequences. However, various procedures
have been developed both to increase the frequency of homologous recombination and to select and
isolate the transformed cells in which homologous recombination has occurred. It is therefore feasible to
inactivate any desired gene in mammalian cells by this approach. Importantly, genes can be readily
inactivated in mouse embryonic stem cells, which can then be used to generate transgenic mice. These
mice can be bred to yield progeny containing mutated copies of the targeted gene on both homologous
chromosomes, so the effects of inactivation of a gene can be investigated in the context of the intact
animal. In addition, cells can be cultured from mouse embryos containing the mutated gene copies, so
the functions of target genes can also be studied in cell culture. The biological activities of thousands of
mouse genes have been investigated in this way, and such studies have been critically important in
revealing the roles of many genes in murine development.
Homologous recombination has been used to systematically inactivate, or knock out,
every gene in yeast. This resulted in a genome-wide collection of yeast mutants that is available for
scientists to use to study the function of any desired gene. Methods also now exist to conditionally
knock out genes in specific mouse tissues, allowing the function of a gene to be studied in a defined cell
type (for example, in a nerve or liver cell) rather than in all cells of the organism.
VI. Interfering with Cellular Gene Expression
As an alternative to gene inactivation by homologous recombination, a variety of approaches
can be used to specifically interfere with gene expression or function. One method that has been used
to inhibit expression of a desired target gene is the introduction of antisense nucleic acids into cultured
cells. RNA or ssDNA complementary to the mRNA of the gene of interest (referred to as an antisense
nucleic acid) hybridizes with the mRNA and blocks its translation into protein. Moreover, the RNA-DNA
hybrids resulting from the introduction of antisense DNA molecules are usually degraded within the
cell. Antisense RNAs can be introduced directly into cells, or cells can be transfected with vectors that
have been engineered to express antisense RNA. Antisense DNA is usually in the form of short
oligonucleotides, which can either be transfected into cells, or in many cases, taken up by cells directly
from the culture medium.
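The base-pairing logic behind antisense design can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the ten-base target sequence are hypothetical and do not correspond to any real gene. It shows that an antisense strand is the reverse complement of the mRNA segment it targets, written 5' to 3'.

```python
# Hypothetical helpers; the target sequence below is invented for illustration.
# Pairing rules for building a DNA strand against an RNA template.
DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def antisense_dna(mrna_5to3: str) -> str:
    """Return the antisense ssDNA oligonucleotide (5'->3') for an mRNA segment."""
    mrna = mrna_5to3.upper()
    # Complement each base and reverse, so the result is also written 5'->3'.
    return "".join(DNA_COMPLEMENT[base] for base in reversed(mrna))

def antisense_rna(mrna_5to3: str) -> str:
    """Return the antisense RNA (5'->3'): same as the DNA oligo with U in place of T."""
    return antisense_dna(mrna_5to3).replace("T", "U")

target_mrna = "AUGGCUUACG"           # assumed example segment of a target mRNA
print(antisense_dna(target_mrna))    # CGTAAGCCAT
print(antisense_rna(target_mrna))    # CGUAAGCCAU
```

Because the antisense strand is antiparallel to the mRNA, each of its bases pairs with the corresponding mRNA base when the two strands hybridize.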
Recently, RNA interference (RNAi) has emerged as an extremely effective and widely used
method for interfering with gene expression at the level of mRNA. As discussed in a previous chapter,
RNAi is a major regulatory mechanism used by cells to control expression at both the transcriptional
and translational levels. When double-stranded RNAs are introduced into cells, they are cleaved into
short double-stranded molecules by the enzyme Dicer. These short double-stranded molecules, called
short interfering RNAs (siRNAs), then associate with a complex of proteins known as the RNA-induced silencing complex (RISC). Within this complex, the two strands of siRNA separate and the
strand complementary to the mRNA (the antisense strand) guides the complex to the target mRNA by
complementary base pairing. The mRNA is then cleaved by one of the RISC proteins. The RISC-siRNA complex is released following degradation of the mRNA and can continue to participate in
multiple rounds of mRNA cleavage, leading to effective destruction of the targeted mRNA.
RNAi has been established as a potent method for interfering with gene expression in C.
elegans, Drosophila, Arabidopsis, and mammalian cells, and provides a relatively straightforward approach
to investigating the function of any gene whose sequence has previously been established. In addition,
libraries of double-stranded RNAs or siRNAs that cover a large fraction of genes in the genome are
used to screen C. elegans, Drosophila, and human cells to identify novel genes involved in specific
biological functions, such as cell growth or survival.
In addition to inactivating a gene or inducing degradation of an mRNA, it is sometimes
possible to interfere with the function of proteins within cells. One approach is to microinject
antibodies that block the activity of the protein against which they are directed. Alternatively, some
mutant proteins interfere with the function of their normal counterparts when they are expressed
within the same cell. (For example, they may compete with the normal protein for binding to its target
molecule.) Cloned DNAs encoding such mutant proteins (called dominant inhibitory mutants) can be
introduced into cells by gene transfer and used to study the effects of blocking normal gene function.
Chapter 37 Problems
Passage 37.1 (Questions 1-5)
Human embryonic stem (hES) cells have the capacity for pluripotency. As cells differentiate
from the embryonic stem cell state, this capacity is lost. Two experiments were conducted in order to
investigate whether a differentiated adult nucleus can de-differentiate, giving rise to a fully
reprogrammed nucleus.
Experiment 1
Three new populations of cells were created from a sheep embryo, a sheep fetus and the
mammary gland of an adult female sheep, or ewe. Nuclear contents from each of the three donor cell
populations were transferred into enucleated unfertilized eggs from a ewe in a process referred to as
somatic cell nuclear transfer (SCNT), and then implanted in recipient ewes. Prior to SCNT, donor
cells were induced to exit the growth phase and enter quiescence. The nuclear content donors and
recipients were different breeds. Eight ewes gave birth to live lambs; each lamb shared the
morphological characteristics of the breed from which the donor cells were derived, not that of the
recipient ewe. The scientists’ findings are summarized in Table 1.
Experiment 2
Specific diploid hES cells transfected to express genes for green fluorescent protein (HUES6-GFP cells) were mixed with diploid human BJ fibroblast (BJ) cells containing a drug-resistance marker
for puromycin. The cells were cultured and colonies bearing one of two distinct morphologies
appeared. Each colony was screened for successfully fused hybrid (HUES6-GFP/BJ) cells. Once
selected, HUES6-GFP/BJ cells, which displayed an appearance consistent with that of hES cells,
were isolated and their DNA content analyzed.
Injection of HUES6-GFP/BJ cells into mice led to the formation of embryoid bodies (EBs)
and teratomas. Immunostaining revealed that both contained neuroectoderm-derived βIII-tubulin,
mesoderm-derived myosin, and endoderm-derived alpha-fetoprotein proteins. Introduction of a
transgenic reporter gene revealed that embryonic genes were reactivated in the HUES6-GFP/BJ cells.
Genome-wide transcriptional profiling demonstrated that expression of genes active exclusively in BJ
cells was suppressed in HUES6-GFP/BJ cells, despite the fact that DNA content analysis indicated
that no genetic information was lost in the fusion process.
Table 1: Development of sheep embryos reconstructed from one of three different donor cell types.

| Donor cell type | No. of fused couplets (%) | No. recovered from oviduct (%) | No. of morula/blastocyst (%) | No. of pregnancies/no. of recipients (%) | No. of live lambs |
|---|---|---|---|---|---|
| Mammary epithelium | 277 (63.8) | 247 (89.2) | 29 (11.7) | 1/13 (7.7) | 1 |
| Fetal fibroblast | 172 (84.7) | 124 (86.7) | 34 (27.4)/13 (54.2) | 4/10 (40) | 3 |
| Embryo-derived | 385 (92.9) | 231 (85.3) | 90 (39)/36 (39) | 14/27 (51.8) | 4 |

1. Based on the results of the experiments described in the passage, which of the following statements
is most justifiable?
A. Differentiated cells in all stages of the cell cycle can be reprogrammed to an embryonic-like state
by transfer of nuclear contents into oocytes or by fusion with embryonic stem cells.
B. The introduction of transcription factors found exclusively in embryonic stem cells can induce
somatic cells to revert to an embryonic-like state.
C. Viable adult organisms capable of reproduction can be produced from the transfer of nuclear
contents from embryonic cells into oocytes.
D. Differentiation does not involve irreversible modification of cells that prevents their
reprogramming to the embryonic state.
2. In Experiment 1, in what stage of the cell cycle were the mammary epithelial cells immediately
prior to their introduction into the enucleated cell?
A. G0
B. G2
C. S
D. M
3. Genes associated with puromycin resistance were introduced into the fibroblasts used in
Experiment 2 via transduction. Which of the following methods of gene transfer is another example of
transduction?
A. A harmless strain of the bacteria Streptococcus pneumoniae made virulent after exposure to a heat-killed virulent strain
B. Transfer of genes for tetracycline resistance between Shigella and E. coli bacteria following direct
contact between them
C. The integration of a plasmid conferring kanamycin resistance by the bacteria Pseudomonas via a
DNA pump
D. A new genotype arising in recombinant Salmonella typhimurium strains due to the action of a
bacteriophage
4. Mesoderm-derived myosin was found during a histological examination of the teratoma sample
from Experiment 2. Which of the following structures in humans contains tissues also of mesodermal
origin?
A. retina
B. spinal cord
C. lung bronchi
D. ventricles of the heart
5. DNA content analysis was performed on successfully fused hybrid HUES6-GFP/BJ cells. How
many chromosomes, in terms of the human haploid number n, did the cells contain?
A. n
B. 2n
C. 4n
D. both 2n and 4n cells were observed
The following questions are NOT based on a descriptive passage.
6. In electrophoresis, nucleic acids are separated on the basis of differences in their:
A. positive or negative charge only.
B. size only.
C. ability to hybridize to the stationary phase.
D. positive or negative charge and size.
7. The size of the human genome is approximately 3 x 10^6 kb. If the size of an insert for a BAC
vector is between 120 and 300 kb, then what is the maximum number of BAC clones required to produce
a genomic library of human DNA?
A. 1,000
B. 10,000
C. 25,000
D. 40,000
8. If E. coli is grown for several generations in media containing 15N, before the cells are transferred to
a medium containing only 14N and grown for two additional generations, what will be the ratio of the
number of bacterial cells containing only 14N to cells containing both 14N and 15N?
A. 1:4
B. 1:2
C. 1:1
D. 2:1
9. How would you expect dactinomycin, an inhibitor of DNA-directed RNA synthesis, to affect
replication of the non-retroviral RNA virus, influenza?
A. Transcription of viral RNA would cease.
B. Translation of virally-encoded proteins would be affected.
C. The rate of viral entry into cells would increase.
D. There would be no change in viral replication.
10. What feature of a cloning vector would allow for the isolation of stably transfected mammalian
cells?
A. an endonuclease recognition sequence
B. a stable origin of replication
C. an antibody probe binding site
D. a selectable marker
Chapter 37 Solutions
1. D.
Experiment 1 demonstrates that adult mammary epithelium cells, under the circumstances in which
they were used, may be returned to an embryonic-like state of pluripotency; only a pluripotent cell,
when introduced into an oocyte, would be capable of giving rise to all of the tissues present in the lamb
born from nuclear transfer between adult mammary epithelial cells and recipient oocytes. Experiment
2 demonstrates a similar finding. According to passage information, HUES6-GFP/BJ cells are
consistent in appearance and behavior with hES cells and do not carry out the transcriptional program
found in un-hybridized BJ fibroblast cells. This suggests that differentiated adult cells do not undergo a
process of irreversible modification. If they are able to give rise to all of the cell types in an organism,
they must reversibly return to a de-differentiated, embryonic-like state in order to do so. All of this is
consistent with choice D, the correct answer. According to passage information, cells from which
nuclear contents were donated in Experiment 1 were induced to exit the growth phase and enter
quiescence prior to the nuclear transfer. It cannot be definitively concluded whether cells at any
phase of the cell cycle will behave as these quiescent cells did. Choice A is false and an incorrect
answer. Neither experiment introduced transcription factors, making choice B incorrect as well. In
Experiment 1, viable lambs were born, and while those lambs may well survive into adulthood and be
capable of reproducing, neither experiment provides evidence directly supporting such a finding. This
makes choice C false and an incorrect answer.
2. A.
According to passage information, the cells from which nuclear contents were donated in Experiment
1 were induced to exit the growth phase and enter quiescence prior to the nuclear transfer. This
implies that the cells were in G0, a period entered into by certain cells from the G1 phase, in which
cells are neither dividing nor preparing to divide. Many cells, including those in nervous and cardiac
muscle tissue, will remain in a quiescent state permanently after terminally differentiating. Other
mature cell types, including renal and hepatic parenchymal cells, may be induced to re-enter the cell
cycle. Choice A is true and is the correct answer.
3. D.
Transduction is the process by which genetic information is transferred between bacteria by a virus.
This is what is occurring in choice D, wherein a bacteriophage (a virus which injects its genome into
the cytoplasm of a bacterial host) transfers genetic information between Salmonella bacteria. Choice D
is therefore our correct answer. Transformation is the uptake of free DNA from the environment by a
bacterial cell, and both choices A and C are examples of transformation. The analogous introduction
of nucleic acids into eukaryotic cells by non-viral means is usually called transfection (although the
term transformation is sometimes used, particularly for non-animal eukaryotic cells). Cells can be
transfected by physical or chemical treatment or via a non-viral carrier particle. Choice B is an
example of bacterial conjugation: the transfer of genetic material, usually in the form of a plasmid,
between bacterial cells via direct cell-to-cell contact or via a pilus or pilus-like bridging structure.
4. D.
The mesoderm, ectoderm and endoderm are the three primary germ layers of the developing human
embryo. The mesoderm gives rise to smooth, cardiac and skeletal muscles, connective tissue and bone,
the endothelium of blood vessels, red and white blood cells, the kidneys and the adrenal cortex.
Therefore choice D is the correct answer. Tissue of the central and peripheral nervous system,
epidermis and hair, and the sensory structures of the eye arise from the ectoderm, making choices A
and B incorrect. The endoderm gives rise to nearly all of the GI tract and the cells lining the glandular
organs which are associated with it (including the pancreas and liver), respiratory tract as well as
portions of the bladder, thymus, thyroid and urethra. As a result, choice C is incorrect.
5. C.
Both the HUES6-GFP cells and BJ fibroblast cells that contributed to the fusion product were somatic
cells that should have contained the human diploid number of 46 (2n) chromosomes. The passage
indicates that no genetic information was lost in Experiment 2 during the formation of the HUES6-GFP/BJ cells, meaning that 92 (4n) chromosomes should be detected in the analysis performed.
Choice C is the correct answer, while choices A, B and D are false.
6. B.
Due to their phosphate backbone, nucleic acids are negatively charged. During electrophoresis, species
are separated on the basis of their respective sizes, as reflected by their respective electrophoretic
mobilities. This makes choice B correct. When attempting DNA electrophoresis, the stationary phase
commonly used is agarose gel. When agarose cools, it solidifies as the individual molecules interact.
These interactions form a three-dimensional mesh structure with small pores that DNA can fit
through. The DNA has to find its way through the agarose. Smaller particles migrate faster than
larger ones and reach the positive anode first.
7. C.
Dividing the size of the human genome by the smallest possible insert size indicates that at most (3 x
10^6)/(1.2 x 10^2) = 2.5 x 10^4, or 25,000, BAC clones would be needed to cover the entire genome.
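The same arithmetic can be checked with a short Python sketch. The function name is hypothetical, and the calculation assumes, as the question does, that clones do not overlap and that the genome is covered exactly once; real genomic libraries require redundant coverage and therefore more clones.

```python
import math

GENOME_KB = 3_000_000   # ~3 x 10^6 kb, as stated in the question
MIN_INSERT_KB = 120     # smallest BAC insert given in the question
MAX_INSERT_KB = 300     # largest BAC insert given in the question

def clones_for_single_coverage(genome_kb: int, insert_kb: int) -> int:
    """Clones needed to tile the genome once with non-overlapping inserts (simplifying assumption)."""
    return math.ceil(genome_kb / insert_kb)

# The smallest inserts require the most clones, giving the maximum asked for.
max_clones = clones_for_single_coverage(GENOME_KB, MIN_INSERT_KB)
# The largest inserts require the fewest clones.
min_clones = clones_for_single_coverage(GENOME_KB, MAX_INSERT_KB)

print(max_clones)  # 25000
print(min_clones)  # 10000
```

Note that the minimum, 10,000, appears as distractor choice B; the question asks for the maximum, which corresponds to the smallest insert size.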
8. C.
If E. coli is grown for several generations in media containing 15N, then when transferred to media
containing only 14N, all cells will contain only 15N. Following the semi-conservative method of DNA
replication, after one generation of growth, all cells will contain both 14N and 15N; after two
generations, half of the cells will contain only 14N and the other half will contain both 14N and 15N.
The ratio of the number of cells containing only 14N to cells containing both 14N and 15N is therefore
1:1.
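The strand bookkeeping can be traced with a small Python simulation. This is a minimal sketch under simplifying assumptions: each cell is modeled as a single DNA duplex represented by a pair of strand labels, and the function names are hypothetical.

```python
from collections import Counter

def replicate_in_14n(cells):
    """Semi-conservative replication in 14N medium.

    Each cell is a (strand, strand) tuple. Every parental strand is kept
    and paired with a newly synthesized 14N strand in a daughter cell.
    """
    daughters = []
    for strand_a, strand_b in cells:
        daughters.append((strand_a, "14N"))
        daughters.append((strand_b, "14N"))
    return daughters

def classify(cell):
    """Label a cell by the nitrogen isotopes present in its DNA."""
    isotopes = set(cell)
    if isotopes == {"14N"}:
        return "14N only"
    if isotopes == {"15N"}:
        return "15N only"
    return "hybrid"

# Start with one fully 15N-labeled cell at the time of transfer.
cells = [("15N", "15N")]
for _ in range(2):                 # two generations in 14N medium
    cells = replicate_in_14n(cells)

counts = Counter(classify(cell) for cell in cells)
print(counts["14N only"], counts["hybrid"])  # 2 2
```

Starting from one labeled cell, two generations yield four cells: two containing only 14N and two hybrids, which is the 1:1 ratio in choice C.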
9. D.
The question states that the influenza virus is non-retroviral, implying the virus replicates via RNA-directed RNA synthesis. Therefore, addition of an inhibitor of DNA-directed RNA synthesis such as
dactinomycin would not impact viral replication.
10. D.
When developing a stable transfection, researchers use selectable markers to distinguish transient from
stable transfections. Co-expressing the marker with the gene of interest enables researchers to identify
and select for cells that have the new gene integrated into their genome while also selecting against the
transiently transfected cells. For example, a common selection method is to co-transfect the new gene
with another gene for antibiotic resistance and then treat the transfected cells with a specific antibiotic
for selection. Only stably transfected cells with resistance to the antibiotic will survive in long-term
cultures, allowing for the selection and expansion of the desired cells. The hallmark of stably
transfected cells is that the foreign gene becomes part of the genome and is therefore replicated.
Descendants of these transfected cells, therefore, will also express the new gene, resulting in a stably
transfected cell line. As a result, only choice D is correct.