BIOMOLECULES (INTRODUCTION, STRUCTURE AND
FUNCTIONS)
Nucleic acids
Smita Rastogi (1) & U. N. Dwivedi (2)
(1) Lecturer, Department of Biotechnology, Integral University, Lucknow, India
(2) Professor, Department of Biochemistry, University of Lucknow, Lucknow-226007, India
6-Jun-2006 (Revised 12-Jun-2007)
CONTENTS
Composition of nucleic acids
Generalized structural units of nucleic acids
Nucleosides
Nucleotides or Nucleoside 5’-triphosphates
Oligonucleotides
Nomenclature of nucleic acids
Structural levels of nucleic acids
Deoxyribonucleic acid (DNA)
Ribonucleic acid (RNA)
Structure
Types of RNA
Messenger RNA (mRNA)
Ribosomal RNA (rRNA)
Transfer RNA (tRNA)
Heterogeneous nuclear RNA (hnRNA)
Key words
Deoxyribose sugar, DNA, mRNA, Purines, Pyrimidines, Ribose sugar, rRNA, tRNA
The nucleic acids are the molecular repositories of genetic information and are referred to as
the ‘Molecules of Heredity’. Although the name nucleic acid suggests a location in the
nuclei of cells, some nucleic acids are also present in the cytoplasm. The nucleic
acids are the hereditary determinants of living organisms. They are macromolecules
present in most living cells, either in the free state or bound to proteins as nucleoproteins.
There are two types of nucleic acids, deoxyribonucleic acid (DNA) and ribonucleic acid
(RNA). Both are present in all plants and animals. Viruses also contain nucleic acids;
however, unlike a plant or animal, a virus has either RNA or DNA, but not both.
DNA is found mainly as a component of the chromatin material of the cell nucleus, whereas
most of the RNA (90%) is present in the cell cytoplasm and the remainder (10%) in the
nucleolus. Extranuclear DNA also exists, e.g., in mitochondria and chloroplasts.
Composition of nucleic acids
Nucleic acids are biopolymers of high molecular weight with mononucleotide as their
repeating units. Each mononucleotide consists of the following:
(A) Nitrogenous bases
(B) Phosphoric acid
(C) Pentose sugars

(A) Nitrogenous bases
Two types of major nitrogenous bases, which account for the base composition of DNA or
RNA, are found in all nucleic acids. These are:
a) Purine bases
b) Pyrimidine bases
The purine and pyrimidine bases found in nucleic acids are listed in Table 1 and their
structures are given in Fig. 1.
Table 1: Purine and pyrimidine bases in DNA and RNA
Name of base | Purine or pyrimidine | Molecular formula | Molecular weight (Da) | Properties | Found in DNA and / or RNA
Adenine (A) | Purine | C5H5N5 | 135.15 | White, crystalline | DNA and RNA
Guanine (G) | Purine | C5H5ON5 | 151.15 | White, crystalline, first isolated from guano (bird manure) | DNA and RNA
Cytosine (C) | Pyrimidine | C4H5ON3 | 111.12 | Colourless, crystalline | DNA and RNA
Thymine (T) | Pyrimidine | C5H6O2N2 | 126.13 | White, crystalline, first isolated from thymus tissue | DNA only
Uracil (U) | Pyrimidine | C4H4O2N2 | 112.10 | White, crystalline | RNA only
Fig. 1: Structure of purine and pyrimidine bases (pyrimidines: thymine, cytosine and uracil; purines: adenine and guanine). In RNA, uracil is present instead of thymine.
Both the purine and pyrimidine bases are planar molecules, owing to their π-electron
clouds. They are hydrophobic and relatively insoluble in water at the
near-neutral pH of the cell. Purines can exist in syn or anti forms; pyrimidines generally exist only in the anti
form because of steric interference between the sugar and the carbonyl oxygen at C-2 of
the pyrimidine.
Besides the major nitrogenous bases, some minor bases, also called modified nitrogenous
bases (purines and pyrimidines), occur in polynucleotide structures.
Some naturally occurring forms of modified purines are hypoxanthine, xanthine, uric acid,
6-methyladenine (6-Me), 6-dimethyladenine (6-DiMe), 6-N-isopentenyladenine (6-IPA),
1-methylguanine (1-MeG) and 2-dimethylguanine (2-DiMeG). Among the modified purines,
some are found in tRNA (described later). Methylation is the most common form of purine
modification in microorganisms. The presence of such methylated purines is also suggested
in plant genomes.
Some naturally occurring forms of modified pyrimidines (e.g. 5,6-dihydrouracil,
pseudouracil, 4-thiouracil etc.) are common in tRNA (described later). Other examples
include 5-methylcytosine (5-MeC) and 5-hydroxymethylcytosine. 5-Methylcytosine is
a common component of higher plant and animal DNA. In fact, up to 25% of the cytosine
residues of plant genomes are methylated. The DNA of plants is richer in 5-MeC than the
DNA of animals. The DNA of the T-even bacteriophages (T2, T4) of E. coli has no cytosine
but instead has 5-hydroxymethylcytosine and its glucoside derivatives.
(B) Phosphorus
Phosphorus, present in the backbone of nucleic acids, is a constituent of the phosphodiester
bond that links two sugar moieties. The molecular formula of phosphoric acid is H3PO4.
It contains three monovalent hydroxyl groups and a divalent oxygen atom, all linked to the
pentavalent phosphorus atom.
(C) Sugar
Both DNA and RNA contain a five-carbon aldose sugar, i.e. a pentose sugar. The essential
difference between DNA and RNA is the type of sugar they contain. RNA contains the
sugar D-ribose (hence called ribonucleic acid, RNA) whereas DNA contains its derivative
2′-deoxy-D-ribose, in which the 2′-hydroxyl group of ribose is replaced by hydrogen (hence
called deoxyribonucleic acid, DNA). Sugars are always in the closed-ring β-furanose form in
nucleic acids and hence are called furanose sugars because of their similarity to the
heterocyclic compound furan. The structures of the pentose sugars present in DNA and RNA are
shown in Fig. 2.
Fig. 2: Structure of sugars present in nucleic acids (D-ribose and 2′-deoxyribose)
The structural difference between the sugars of DNA and RNA, though minor, confers very
different chemical and physical properties on the two molecules. RNA is much stiffer due
to steric hindrance and is more susceptible to hydrolysis in alkaline conditions, perhaps
explaining in part why DNA has emerged as the primary genetic material.
Sugar, along with phosphate, performs the structural role in nucleic acids.
Generalized structural units of nucleic acids
Generalized structural units of nucleic acids are indicated in Scheme 1.
Components of nucleic acids:
Base + deoxyribose (or ribose) → Nucleoside
Nucleoside + phosphoric acid → Nucleotide
Nucleotides → Polynucleotide or nucleic acid (DNA / RNA)
Scheme 1: Generalized structural units of nucleic acid
(A) Nucleosides
The nucleosides are compounds in which nitrogenous bases (purines and pyrimidines) are
conjugated to the pentose sugars (ribose or deoxyribose) by a β-N-glycosidic linkage. They
consist of a base joined to a pentose sugar at position C1′: the sugar C1′ carbon atom is
joined to the N1 atom of a pyrimidine or the N9 atom of a purine. Thus, the purine nucleosides
are N-9 glycosides and the pyrimidine nucleosides are N-1 glycosides. These are stable in alkali.
The purine nucleosides are readily hydrolyzed by acid, whereas pyrimidine nucleosides are
hydrolyzed only after prolonged treatment with concentrated acid. The nucleosides are generally
named for the particular purine or pyrimidine present. Nucleosides possessing ribose are called
ribonucleosides (ribosides) and those containing deoxyribose are called
deoxyribonucleosides (deoxyribosides). The nomenclature of nucleosides differs from that of
the bases.
Pseudouridine, which is otherwise identical to uridine, differs in the point of
attachment of the ribose to the base. In pseudouridine, the base is attached to the sugar
through C5 of the base, whereas in uridine the base is attached to the
sugar through N1 (structure described later in the section on tRNA).
Two nucleoside analogues, 3′-azidodeoxythymidine (AZT) and 2′, 3′-dideoxycytidine
(DDC), have found therapeutic use for the treatment of acquired immune deficiency
syndrome (AIDS) patients.
(B) Nucleotides or Nucleoside 5′-triphosphates
These are phosphate esters of nucleosides i.e. nucleosides form nucleotides by joining with
phosphoric acid. Esterification can occur at any free hydroxyl group, but is most common at
the 5′ and 3′ positions in sugars. The phosphate residues are joined to the sugar ring by a
phosphomonoester bond and several phosphate groups can be joined in series by
phosphoanhydride bonds. These occur either in the free form or as subunits in nucleic acids.
The phosphate is always esterified to the sugar moiety.
The trivial names of purine nucleosides end with the suffix –sine, and those of pyrimidine
nucleosides end with suffix –dine.
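The naming rule above can be checked with a small lookup table. This is only an illustrative sketch: the dictionary and the function names (`nucleoside_name`, `follows_suffix_rule`) are hypothetical helpers introduced here, not part of any standard library.

```python
# Trivial names of the nucleosides of the five major bases.
# Rule from the text: purine nucleosides end in "-sine",
# pyrimidine nucleosides end in "-dine".
NUCLEOSIDE_NAMES = {
    "adenine": ("purine", "adenosine"),
    "guanine": ("purine", "guanosine"),
    "cytosine": ("pyrimidine", "cytidine"),
    "thymine": ("pyrimidine", "thymidine"),
    "uracil": ("pyrimidine", "uridine"),
}


def nucleoside_name(base: str, deoxy: bool = False) -> str:
    """Return the trivial nucleoside name; add 'deoxy' for the 2'-deoxyribose form."""
    _, name = NUCLEOSIDE_NAMES[base.lower()]
    return "deoxy" + name if deoxy else name


def follows_suffix_rule(base: str) -> bool:
    """Check that the nucleoside name carries the suffix expected for its base class."""
    kind, name = NUCLEOSIDE_NAMES[base.lower()]
    expected = "sine" if kind == "purine" else "dine"
    return name.endswith(expected)


if __name__ == "__main__":
    for base in NUCLEOSIDE_NAMES:
        print(base, "->", nucleoside_name(base), follows_suffix_rule(base))
```

For example, `nucleoside_name("cytosine", deoxy=True)` gives the deoxyribonucleoside name built by prefixing "deoxy" to the ribonucleoside name.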
In addition to their role as structural components of nucleic acids, nucleotides also
participate in a number of other functions as described below:
• Energy carriers: Nucleotides are energy-rich compounds that drive metabolic
processes, especially biosynthetic ones, in all cells. Hydrolysis of nucleoside triphosphates provides
the chemical energy to drive a wide variety of cellular reactions. ATP is the most widely
used for this purpose; UTP, GTP and CTP are also used. Nucleoside triphosphates also serve as
the activated precursors of DNA and RNA synthesis. The hydrolysis of the ester linkage
(between ribose and the α-phosphate) yields about 14 kJ / mol under standard conditions,
whereas hydrolysis of each anhydride bond (between the α-β and β-γ phosphates) yields about
30 kJ / mol. ATP hydrolysis often plays an important thermodynamic role in biosynthesis.
• Enzyme cofactors: Many enzyme cofactors include adenosine in their structure, e.g.,
NAD, NADP, FAD.
• Chemical messengers: Some nucleotides act as regulatory molecules and serve as
chemical signals or secondary messengers, key links in cellular systems that respond to
hormones and other extracellular stimuli and lead to adaptive changes in the cell's interior. Two
hydroxyl groups can be esterified by the same phosphate moiety to generate cyclic AMP
(cAMP, adenosine 3′-5′ cyclic phosphate) or cyclic GMP (cGMP, guanosine 3′-5′ cyclic
phosphate).
(C) Oligonucleotides
Oligonucleotides are polymers containing <100 nucleotides. These nucleotides are linked
by phosphodiester bonds as shown in Fig. 3.
Fig. 3: Phosphodiester linkage (the phosphate group joins the 3′ carbon of one sugar to the 5′ carbon of the next)
Oligonucleotides occur naturally and are used as primers during DNA replication and
for various other purposes in the cell. Synthetic oligonucleotides can be made by chemical
synthesis and are essential for many laboratory techniques, e.g., DNA sequencing, PCR, in situ
hybridization, nucleic acid probes, nucleic acid hybridization and gene therapy.
The polymers containing >100 ribonucleotides or deoxyribonucleotides are called RNA and
DNA (nucleic acids), respectively.
Nomenclature of nucleic acids
• Direction: By convention, a single strand of nucleic acid is always written with
the 5′ end at the left and the 3′ end at the right, i.e. in the 5′ → 3′ direction.
• Sugar: In the chemical nomenclature, the carbon atoms of sugars are designated
by primed numbers, i.e. C-1′, C-2′, C-3′ etc., to avoid confusion with the base numbering
system.
• Base: The various atoms in the bases lack the prime (′) sign and are designated
by the cardinal numbers, i.e. 1, 2, 3 etc. By convention, the atoms forming the ring are
numbered in this way, whereas an exocyclic atom (N or O, not within the ring structure) is
denoted by its element symbol with, as a superscript, the ring position to which it is attached, e.g. the amino N
attached to C-6 in adenine is N6. Bases are represented by single letters: adenine is
represented as A, guanine as G, cytosine as C, thymine as T and uracil as U.
• Nucleosides and nucleotides: While names of nucleosides and nucleotides are
generally derived from the corresponding bases, there is one exception to this rule: the base
corresponding to the nucleoside called inosine (and the derived nucleotides) is called
hypoxanthine.
• Short hand notation: In the short hand notation of nucleotides, the phosphate group is
symbolized by P and deoxyribose by a vertical line from C1′ at the top to C5′ at the bottom. The
connecting lines between nucleotides, which pass through P, are drawn diagonally from the
middle (C3′) of the deoxyribose of one nucleotide to the bottom (C5′) of the next (Fig. 4).
Fig. 4: Short hand notation for the sequence 5′TATGC3′ (pTATGC), drawn from the 5′ terminus to the 3′ terminus
The nucleoside and nucleotide derivatives of deoxyribose are distinguished by the prefix ‘d’.
Where clarity is especially important, ribonucleosides and ribonucleotides can similarly be
identified with the prefix ‘r’, e.g. ATP = rATP.
A second short hand notation is used to discriminate between 5′ and 3′ phosphates, with the
5′-phosphate placed before the base (e.g. pA is adenosine 5′-monophosphate) and the
3′-phosphate placed after the base (e.g. Ap is adenosine 3′-monophosphate). The deoxy
prefix can be omitted from the names of thymidine derivatives because, thymine being a predominantly
DNA-specific base, it is usually evident that the sugar is deoxyribose. However, the full
nomenclature is preferred for the sake of convention and because thymine is a minor base in
RNA (thymine exists as a modified base at various places, most notably in the TΨC loop of
every tRNA; note that thymine is 5-methyluracil). Where the context is obvious, both DNA
and RNA sequences are represented as a single series of bases. The ambiguous bases are
represented by the single letters shown in Table 2.
Table 2: Ambiguous bases represented by the single letter representations
S. No. | Single letter representation | Purine / pyrimidine represented by single letter
1 | R | A / G (any PuRine)
2 | Y | C / T / U (any pYrimidine)
3 | K | G / T (Keto)
4 | M | A / C (aMino)
5 | S | G / C (strong – three bonds)
6 | W | A / T (weak – two bonds)
7 | B | G, T, C (i.e. not A)
8 | D | G, T, A (i.e. not C)
9 | H | A, C, T (i.e. not G)
10 | V | A, C, G (i.e. not T)
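The ambiguity codes of Table 2 lend themselves to a simple lookup. The following is a minimal sketch, assuming DNA sequences only (so Y is taken as C / T, without U); the names `AMBIGUITY_CODES`, `to_regex` and `matches` are illustrative and not from the original text.

```python
import re

# Table 2 as a lookup: ambiguity letter -> DNA bases it may stand for.
AMBIGUITY_CODES = {
    "R": "AG",   # any purine
    "Y": "CT",   # any pyrimidine (U omitted: DNA only in this sketch)
    "K": "GT",   # keto
    "M": "AC",   # amino
    "S": "CG",   # strong, three H-bonds
    "W": "AT",   # weak, two H-bonds
    "B": "CGT",  # not A
    "D": "AGT",  # not C
    "H": "ACT",  # not G
    "V": "ACG",  # not T
}


def to_regex(pattern: str) -> str:
    """Translate a sequence containing ambiguity codes into a regular expression."""
    parts = []
    for symbol in pattern.upper():
        if symbol in AMBIGUITY_CODES:
            parts.append("[" + AMBIGUITY_CODES[symbol] + "]")
        else:
            parts.append(re.escape(symbol))
    return "".join(parts)


def matches(pattern: str, sequence: str) -> bool:
    """True if the unambiguous sequence is one of those described by the pattern."""
    return re.fullmatch(to_regex(pattern), sequence.upper()) is not None


if __name__ == "__main__":
    print(to_regex("GAYR"))         # character classes replace Y and R
    print(matches("GAYR", "GATA"))  # T is a pyrimidine, A is a purine
```

Expanding each code into a character class keeps the table as the single source of truth; a sequence such as GAGA fails the pattern GAYR because G is not a pyrimidine.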
Structural levels of nucleic acids
Nucleic acids possess the following levels of structure:
(a) Primary structure
The nature, properties and function of the two nucleic acids (DNA and RNA) depend on the
exact order of the purine and pyrimidine bases in the molecule. This sequence of specific
bases is termed the primary structure. Thus, the primary structure of a nucleic acid is its
covalent structure and nucleotide sequence.
(b) Secondary structure
The term secondary structure relates to regions of regular conformation of the chain,
stabilized by regular, repeating interactions (e.g. double helix of DNA). Thus, any regular,
stable structure taken up by some or all of the nucleotides in a nucleic acid can be referred
to as secondary structure.
Nucleic acid secondary structures are generated by two kinds of noncovalent interactions
between bases. The secondary structure of DNA is characterized by intermolecular base
pairing to generate double stranded or duplex molecules. Watson-Crick base pairs form
the basis of secondary structure interactions in nucleic acids and also explain
Chargaff’s rule. Secondary structures in RNA, which exists primarily in single stranded
form, generally reflect intramolecular base interactions. Thus, the secondary structures arise
due to the following interactions:
• Complementary base pairing: It involves stable and specific configurations of
H-bonds between bases in DNA. It is the predominant force causing nucleic acid strands to
associate. The molecular basis of Chargaff’s rule is complementary base pairing
between A-T and between G-C in double stranded DNA. Chargaff’s rule was later
explained by the double helical structure described by Watson and Crick. G:C pairs, with three
H-bonds, are more stable than A:T (or A:U) pairs.
• Base stacking: The structures are stabilized by hydrophobic interactions between
adjacent bases brought about by the electrons in π rings. It is these π-π interactions which
are described as base stacking forces.
• Alternative forms of base pairing: Watson-Crick base pairs (A:T and G:C) are
predominant in the structure and function of nucleic acids. However, there are 28
possible arrangements of at least two H-bonds between bases, which provide the basis
for a diverse set of interactions. The most significant of these alternative configurations
are the Hoogsteen base pairs, which contribute to tRNA structure and allow the
formation of triple helices. A modification of Watson-Crick base pairs is the wobble
pair, which allows bases in the 5′-anticodon position of tRNA to pair ambiguously with
the mRNA. Wobble base pairs are formed because the bases are offset from their
normal Watson-Crick positions and one of the H-bonds is lost.
• Intramolecular base pairing: In RNA and single stranded regions of DNA
(non-duplex DNA), secondary structure is determined by intramolecular base pairing. Since
cellular DNA is usually present as a duplex, the bases are available for intramolecular
interactions only rarely. Conversely, intramolecular secondary structures are abundant in
cellular RNAs and underlie their functional specialization. The major classes of
intramolecular nucleic acid secondary structures are bulges, bulge loops, bubbles,
hairpins, stem loops, panhandles and cruciforms. Lariats are often classified as secondary
structures, but because they are formed by the covalent bonds joining nucleotides, they
are strictly primary structures.
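The relative stability of G:C and A:T pairing described in the list above can be illustrated by counting Watson-Crick hydrogen bonds along a duplex. This is a rough sketch only: it uses the two-bond and three-bond counts from the text, ignores base stacking, and the function name `watson_crick_hbonds` is an assumption made for illustration.

```python
# H-bonds contributed by each base when it sits in a Watson-Crick pair:
# A:T (or A:U) pairs have two H-bonds, G:C pairs have three.
HBONDS_PER_PAIR = {"A": 2, "T": 2, "U": 2, "G": 3, "C": 3}


def watson_crick_hbonds(strand: str) -> int:
    """Total H-bonds when every base of `strand` is paired with its complement."""
    return sum(HBONDS_PER_PAIR[base] for base in strand.upper())


if __name__ == "__main__":
    for seq in ("AATT", "GGCC", "TATGC"):
        print(seq, watson_crick_hbonds(seq))
```

A G-C rich stretch therefore carries more hydrogen bonds per base pair than an A-T rich stretch of the same length, which is one reason such regions are harder to separate.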
(c) Tertiary structure
The complex folding of large chromosomes within eukaryotic chromatin and bacterial
nucleoids is generally considered tertiary structure. Thus, the tertiary structures of nucleic acids
reflect interactions which contribute to the overall 3D shape.
(d) Quaternary structure
In many structures, nucleic acids interact in trans (e.g. the ribosome and spliceosome) and
this may be considered a quaternary level of nucleic acid structure. Nucleic acids also
interact with an enormous number of proteins (e.g. genome structural proteins, transcription
factors, enzymes, splicing factors). Many of these proteins have a significant effect on DNA
or RNA conformation. Interactions with proteins may be general or sequence specific and
may involve subtle or overt changes in structure. The restriction enzymes EcoRI and
EcoRV, for example, both introduce a pronounced kink in the DNA at their recognition sequence,
which may facilitate their endonucleolytic activity. Proteins of the high mobility group
(HMG) class appear specifically to bend DNA in order to facilitate interactions between
components bound at distant sites.
Deoxyribonucleic acid (DNA)
DNA is the genetic material in all organisms; the exceptions are a few viruses, e.g.
retroviruses, in which RNA acts as the genetic material. In prokaryotic cells, DNA occurs in the cytoplasm and is
the only component of the chromosome. In eukaryotic cells, DNA is largely confined to the
nucleus and is the main component of the chromosome. It is combined with simple proteins to
form deoxyribonucleoproteins (DNP). A small quantity of DNA also occurs in some
cytoplasmic organelles such as mitochondria and chloroplasts. This extranuclear DNA is
naked, as is prokaryotic DNA. The amount of DNA per nucleus is fairly constant in all the body cells of a given
species. Just before cell division, however, the amount of DNA is doubled. The gametes
have half the amount of DNA as they contain half the number of chromosomes. Mirsky and Vendrely
estimated that there is some 6 × 10^-9 mg of DNA per nucleus in diploid somatic cells of
mammals and 3 × 10^-9 mg of DNA per nucleus in haploid gametes (eggs and sperms).
(A) Evidence that DNA is the genetic information carrier
In 1928, Frederick Griffith made a startling discovery. He injected mice with a mixture of
live ‘R’ and heat-killed ‘S’ pneumococci. The virulent (disease causing) form of the
pneumococcus (Diplococcus pneumoniae), a bacterium that causes pneumonia, is
encapsulated by a gelatinous polysaccharide coating that contains the binding sites (known
as O-antigens) through which it recognizes the cells it infects. Mutant pneumococci that
lack this coating, because of a defect in an enzyme involved in its formation, are not
pathogenic. The virulent (pathogenic) and non-virulent (non-pathogenic) pneumococci are
known as the ‘S’ and ‘R’ forms, respectively, because of the smooth and rough appearances
of their colonies in culture. This experiment resulted in the death of most of the mice. More
surprising yet was that the blood of the dead mice contained live ‘S’ pneumococci. The
dead ‘S’ pneumococci initially injected into the mice had somehow transformed the
otherwise innocuous ‘R’ pneumococci to the virulent ‘S’ form. Furthermore, the progeny of
the transformed pneumococci were also ‘S’; the transformation was permanent. Eventually,
it was shown that transformation could also be achieved in vitro by mixing ‘R’ cells with a
cell-free extract of ‘S’ cells. This experiment could not, however, establish that DNA is the
transforming principle. The experimental results are depicted in Fig. 5.
Fig. 5: Experimental evidence to establish that DNA is the genetic material (tissue of the dead mice analyzed; living ‘S’ pneumococci recovered)
In 1944, Oswald Avery, Colin MacLeod and Maclyn McCarty, after a 10-year
investigation, extended Griffith’s experiment and reported that the transforming material is
DNA. The conclusion was based on the observation that the laboriously purified
transforming material had all the physical and chemical properties of DNA, contained no
detectable protein, was unaffected by enzymes that catalyze the hydrolysis of proteins and
RNA, and was totally inactivated by treatment with an enzyme that catalyzes the hydrolysis
of DNA. DNA must therefore be the carrier of genetic information.
In 1952, Alfred Hershey and Martha Chase performed the ‘Blender experiment’ to demonstrate
that DNA is the genetic material in bacteriophage. Bacteriophage T2 was grown on E. coli in a
medium containing the radioactive isotopes 32P and 35S. This labeled the phage capsid,
which contains no P, with 35S, and its DNA, which contains no S, with 32P. These phages
were added to an unlabeled culture of E. coli. After sufficient time had been allowed for the phages to
infect the bacterial cells, the culture was agitated in a blender so as to shear the phage heads
from the bacterial cells. This rough treatment did not injure the bacteria. When the phage ghosts
(the emptied phage coats) were separated from the bacteria by centrifugation, the ghosts were
found to contain most of the 35S, whereas the bacteria contained most of the 32P. Furthermore,
30% of the 32P appeared in
the progeny phages but only 1% of the 35S did so. Hershey and Chase therefore concluded
that only the phage DNA was essential for the production of progeny and the protein coat
served only as a protective shell. DNA, therefore, must be the hereditary material. The
details of the experiment are outlined in Fig. 6.
Fig. 6: Hershey-Chase experiment (blender experiment with phage T2) to demonstrate that DNA is the genetic material
(B) Size and shape of DNA in prokaryotic and eukaryotic cells
DNA is one of the largest known macromolecules. DNA molecules may be of two types:
linear and circular. Linear DNA is found in the nuclei of eukaryotic cells. It exists in
association with proteins. Circular DNA is found in prokaryotic cells and in the mitochondria
and chloroplasts (plastids) of eukaryotic cells. It is naked, i.e. without a protein coat.
Table 3 below lists the dimensions of various viral, bacterial and eukaryotic DNA
molecules. A perusal of the table indicates that even the smallest DNA molecules are highly
elongated. For instance, the DNA from polyoma virus contains 5100 base pairs and has a
contour length of 1.7 µm.
Table 3: Dimensions of certain DNA molecules
Organism | Number of base pairs (in thousands or kb) | *Length (in µm) | Molecular weight
Viruses
Polyoma virus or SV40 | 5.1 | 1.7 | 3.1 × 10^6
λ phage | 48.6 | 17 | 31 × 10^6
T2 phage | 166 | 56 | 122 × 10^6
Vaccinia | 190 | 65 | 157 × 10^6
Bacteria
Mycoplasma | 760 | 260 | 504 × 10^6
E. coli | 4000 | 1360 | 2320 × 10^6
Eukaryotes
Yeast | 13500 | 4600 | -
Drosophila | 165000 | 56000 | -
Human | 2900000 | 990000 | -
* 1 µm of double helix = 2.94 × 10^3 base pairs = 1.94 × 10^6 Da
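The length column of Table 3 follows from the conversion factor in the footnote. The sketch below simply applies that factor; the constant and function names are illustrative assumptions, and the results are estimates that agree with the tabulated lengths only to within rounding.

```python
# Conversion factor taken from the footnote of Table 3:
# 1 µm of double helix corresponds to 2.94 x 10^3 base pairs.
BP_PER_MICROMETRE = 2.94e3


def contour_length_um(base_pairs: float) -> float:
    """Estimated contour length (in µm) of a double helix with the given number of base pairs."""
    return base_pairs / BP_PER_MICROMETRE


if __name__ == "__main__":
    # Base-pair counts from Table 3 (given there in kb).
    for name, kb in (("Polyoma virus", 5.1), ("lambda phage", 48.6), ("E. coli", 4000)):
        print(f"{name}: {contour_length_um(kb * 1000):.1f} µm")
```

For polyoma virus (5100 base pairs) this gives a contour length close to the 1.7 µm quoted in the text.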
(C) DNA structure
(a) Chargaff’s Equivalence Rule
In 1950, Erwin Chargaff formulated important generalizations about DNA structure based on
quantitative chromatographic separation and analysis of the
four bases in hydrolysates of DNA specimens isolated from different organisms. These
generalizations are called Chargaff’s equivalence rule. They include:
• Base composition of DNA varies from one species to another.
• DNA specimens isolated from different tissues of the same species have the same
base composition.
• The base composition of DNA in a given species does not change with age,
nutritional state, or changes in environment.
• The amounts of purines (A, G) and pyrimidines (T, C) are always equal, such that the amount of A
equals that of T and the amount of G equals that of C, i.e. A=T, G=C (molar
equivalence of bases).
• The base ratio (A+T) / (G+C) may vary from one species to another, but is constant for a
given species. This ratio can be used to identify the source of DNA and can
sometimes help in classification.
• The deoxyribose sugar and phosphate components occur in equal proportions.
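The base equivalences in the list above can be verified on any double stranded sequence. The following is a minimal sketch, assuming a duplex built from one strand and its Watson-Crick complement; the helper names are hypothetical and the example sequence is the TATGC used earlier in the text.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Complementary strand, written 5' -> 3' (hence the reversal)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_composition(strand: str) -> Counter:
    """Base counts over both strands of the duplex formed by `strand`."""
    return Counter(strand.upper() + reverse_complement(strand))


def at_gc_ratio(strand: str) -> float:
    """(A+T) / (G+C) for the duplex; undefined if the duplex has no G:C pairs."""
    counts = duplex_composition(strand)
    return (counts["A"] + counts["T"]) / (counts["G"] + counts["C"])


if __name__ == "__main__":
    counts = duplex_composition("TATGC")
    print(dict(counts))
    print("A == T:", counts["A"] == counts["T"], "| G == C:", counts["G"] == counts["C"])
    print("(A+T)/(G+C) =", at_gc_ratio("TATGC"))
```

Whatever single strand is supplied, A = T and G = C hold for the duplex, while the (A+T) / (G+C) ratio depends on the sequence, which mirrors the species-specific ratio described by Chargaff.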
(b) Double helical structure of DNA (Watson-Crick model) (B-DNA)
In 1953, J. D. Watson and F. H. C. Crick postulated a precise 3-D model of DNA structure,
based on the X-ray data of Franklin and Wilkins and the base equivalence observed by
Chargaff. This model accounted for many of the observations on the chemical and physical
properties of DNA and also suggested a mechanism for accurate replication of genetic
information. The Watson-Crick model of DNA structure proposed the following:
• DNA contains two polynucleotide chains that are coiled in helical fashion around
the same axis in a right-handed (clockwise) direction, thus forming a double
helix. The two chains or strands are antiparallel, i.e. their 3′,5′-internucleotide
phosphodiester bridges run in opposite directions (as determined by nearest
neighbour analysis). These chains are complementary to each other. The antiparallel
orientation is a stereochemical consequence of the way that A and T and G and C
pair with each other. All the phosphodiester linkages have the same orientation
along the chain, giving each linear nucleic acid strand a specific polarity and distinct
5′ and 3′ ends. By definition, the 5′ end lacks a nucleotide at the 5′ position and the 3′ end lacks
a nucleotide at the 3′ position.
• The backbone of the helix consists of sugar and phosphate groups while the bases are
perpendicular to the backbone, projecting inwards to the center. Purine and
pyrimidine bases are stacked inside the helix with their planes parallel to each other
and perpendicular to the helix axis. The backbone is found on the periphery of the helix
and is hydrophilic. The hydroxyl groups of the sugars form H-bonds with water. The phosphate
groups, with pKa near zero, are negatively charged at neutral pH, and the negative charges
are generally neutralized by ionic interaction with the positive charges of proteins, metals
and polyamines. The bases are hydrophobic and shielded from water. This means that a
single stranded structure, in which the bases are exposed to the aqueous environment, is
unstable. Hence DNA is a double helix. The DNA double helix is held together by two
forces: H-bonding of complementary base pairs and hydrophobic interactions.
• A base pair consists of a purine and a pyrimidine. Moreover, a specific purine pairs
with a specific pyrimidine owing to a perfect match between hydrogen donor and
acceptor sites on the two bases. The bases of one strand are paired in the same
planes with the bases of the other strand. Base pairing is due to steric and H-bonding
factors. Base A is bonded to T by two H-bonds and G is bonded to
C by three H-bonds. Only A and T, and also G and C, have the proper spatial
arrangements to form correct H-bonding. This is the concept of specific base
pairing. The allowed pairs are A-T and G-C, which are precisely the base pairs
showing Chargaff’s equivalence in DNA. Thus, the Watson-Crick double helix involves
not only the maximum possible number of H-bonded base pairs but also those pairs
giving maximum fit and stability. The individual H-bond is weak in nature, but, as
in the case of proteins, the large number of them involved in the DNA molecule confers
stability on it. However, the stability of DNA is primarily a consequence of van der
Waals forces and hydrophobic (base stacking) interactions between the planes of
stacked bases. Thus, H-bonding is specific and is responsible for the complementarity of the
two strands, while hydrophobic interactions [(π-π) stacking interactions between
adjacent bases] are non-specific and are responsible for the stability of the
macromolecule. The nucleic acid strands tend to stick together even in the absence
of specific base pairing, although the specific interactions make the association
stronger.
• DNA was found to possess two periodicities, a major one of 0.34 nm and a second
one of 3.4 nm. To account for the 0.34 nm periodicity, Watson and Crick postulated
that the bases are stacked at a center-to-center distance of 0.34 nm from each other,
i.e. successive base pairs are 3.4 Å apart in the stack and are related by a rotation of
36°. The DNA helix is about 20 Å in diameter. The helical structure repeats after 10
residues on each chain, i.e., at intervals of 34 Å. Thus, there are 10 nucleotide
residues in each complete turn of the double helix in this model, which accounts for the secondary repeat
distance of 3.4 nm (later measurements on DNA in solution give about 10.5 residues per turn). The space available between the two sugar-phosphate chains of
DNA, i.e. 20 Å (2 nm), can accommodate one purine and one pyrimidine but not two
purines, which would be too large, and not two pyrimidines, which would not be
close enough to form proper H-bonds.
• The two helices are wound in such a way as to produce two interchain spacings or
grooves, a major or wide groove (width 12 Å, depth 8.5 Å) and a minor or narrow
groove (width 6.0 Å, depth 7.5 Å). Thus, the major groove is slightly deeper than the minor
one. The two grooves arise because the glycosidic bonds of a base pair are not
diametrically opposite each other. The minor groove contains the pyrimidine O-2 and
the purine N-3 of the base pair, and the major groove is on the opposite side of the
pair. Potential H-bond donor and acceptor atoms line each groove. The major groove
displays more distinctive features than the minor groove. In these grooves, specific
proteins interact with sequences of DNA. Such double helices cannot be pulled apart
and can be separated only by an unwinding process. They are called plectonemic
coils, i.e., coils that are interlocked about the same axis. The helical structure helps
in shielding the bases from the environment, thereby protecting the genetic
information from physical and chemical attack.
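The helical parameters quoted in the list above are related by simple arithmetic, sketched below. The constants are the values given in the text for the Watson-Crick model (0.34 nm rise and 36° rotation per base pair); the function names are assumptions made for illustration.

```python
# Parameters of the Watson-Crick (B-DNA) model as quoted in the text.
RISE_PER_BP_NM = 0.34     # distance between successive base pairs
TWIST_PER_BP_DEG = 36.0   # rotation between successive base pairs


def bp_per_turn(twist_deg: float = TWIST_PER_BP_DEG) -> float:
    """Base pairs needed for one full 360-degree turn of the helix."""
    return 360.0 / twist_deg


def pitch_nm() -> float:
    """Axial length of one complete turn (the 3.4 nm periodicity)."""
    return bp_per_turn() * RISE_PER_BP_NM


def helix_length_nm(n_bp: int) -> float:
    """Axial length of a duplex of n_bp base pairs."""
    return n_bp * RISE_PER_BP_NM


def helical_turns(n_bp: int) -> float:
    """Number of helical turns in a duplex of n_bp base pairs."""
    return n_bp / bp_per_turn()


if __name__ == "__main__":
    print("base pairs per turn:", bp_per_turn())
    print("pitch (nm):", round(pitch_nm(), 2))
    print("length of 5100 bp (µm):", round(helix_length_nm(5100) / 1000, 2))
```

A rotation of 36° per base pair gives 10 base pairs per turn, and 10 pairs at 0.34 nm each reproduce the 3.4 nm repeat. The same rise also reproduces the conversion used in Table 3, since 1 µm divided by 0.34 nm is about 2.94 × 10^3 base pairs.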
The structural details of double stranded DNA as suggested by Watson and Crick are shown
in Fig. 7. Fig. 7a depicts the helical structure of DNA and Fig. 7b shows normal Watson-Crick
base pairing interactions.
The double helical structure explains the mechanism by which genetic information can be
accurately replicated. The complementarity of the bases and the antiparallel orientation of
the two chains of the DNA molecule provide the basis for precise replication of DNA. Since
the two strands are structurally complementary to each other and thus contain complementary
information, the replication of DNA during cell division was postulated to occur by
separation of the two strands, so that each parent strand serves as the template specifying the
base sequence of a new complementary strand. The end result of such a process is the
formation of two daughter double-helical molecules of DNA, each identical to the
parent DNA and each containing one strand from the parent.
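The templating principle described above can be sketched in a few lines of Python. This is only an illustration of base complementarity and antiparallel orientation, not a model of the enzymatic process; the function names and the example sequence are hypothetical, and both strands are written 5' to 3'.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(template):
    """Return the strand templated by `template`.

    Both strands are written 5'->3'; because the two chains are antiparallel,
    the template is read in reverse while each base is complemented.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template))


def replicate(duplex):
    """Semiconservative replication of a duplex given as (strand_1, strand_2).

    Each daughter duplex keeps one parental strand and gains one new strand.
    """
    strand_1, strand_2 = duplex
    daughter_1 = (strand_1, complementary_strand(strand_1))
    daughter_2 = (complementary_strand(strand_2), strand_2)
    return daughter_1, daughter_2


if __name__ == "__main__":
    parent = ("ATGCGT", complementary_strand("ATGCGT"))
    for daughter in replicate(parent):
        print(daughter)
```

Because each parental strand fully specifies its partner, both daughter duplexes come out identical to the parent, which is the point made in the paragraph above.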
(c) Local flexibility in DNA structure
The analysis of oligonucleotide crystals as opposed to fibers shows that there is great
variation in the helical parameters of molecules with diverse base sequences. This occurs
because different base sequences influence helical and torsional parameters to maximize the
stability of stacking and pairing interactions. B-DNA is particularly flexible in this respect
and different local configurations adapt to particular sequences. This indicates that DNA
probably does not exist in rigid conformational forms but may change smoothly between
different conformations punctuated by local polymorphisms such as bent DNA and helical
transitions (sudden transitions between different helical conformations within a single
molecule, e.g. B-Z transitions). DNA bending is an intrinsic property depending on stacking
interactions, which according to local sequence, may be isotropic (unbiased) or anisotropic
(bending in a specific direction). Intrinsic DNA bends occur in A-T rich runs and in repeats
of the sequence GGCC in step with helical periodicity. DNA bending can also be induced
by proteins (nucleic acid binding proteins) and by circularization (DNA topology). Induced
bending is necessary for DNA packaging in chromosomes and for replication,
recombination and transcription. Proteins may also recognize DNA that is bent in a certain
way (e.g. topoisomerase).
Fig 7a: Double helical structure of DNA
(d) DNA topology
If the DNA molecule has free ends (e.g. a linear molecule), the two strands wind around
each other in the most energetically favorable manner and the molecule is said to be
relaxed. The number of times one strand winds around the other in this relaxed state is the
duplex winding number. If extra twists are introduced into such a molecule to make it
overwound, then the total number of helical turns – which is the linking number – exceeds
the duplex winding number. Conversely, if twists are removed from the molecule to make it
underwound, the duplex winding number exceeds the linking number. In either case, the
strands can rotate with respect to each other and return the molecule to its relaxed state. In a
closed circle, however, there are no free ends and the linking number is a topological
property: it can be changed only by breaking the circle open and not by deforming it. If
DNA in a closed circle becomes overwound or underwound, the only way to relieve the
torsional strain thus produced is by supercoiling, where a twist is introduced into the helical
axis itself. Supercoiling is another form of nucleic acid tertiary structure, one involving the
effect of torsional stress upon shape rather than strand-strand interactions.
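The comparison between the linking number and the duplex winding number can be made concrete with a small Python sketch. The function names are hypothetical, and estimating the relaxed winding number as the number of base pairs divided by about 10.5 base pairs per turn is an assumption carried over from the helical repeat of B-DNA, not a statement from this section.

```python
def relaxed_winding_number(n_bp, bp_per_turn=10.5):
    """Estimated duplex winding number of a relaxed molecule (assumes ~10.5 bp per turn)."""
    return n_bp / bp_per_turn


def topological_state(linking_number, duplex_winding_number):
    """Classify a molecule by comparing its linking number with its duplex winding number."""
    difference = linking_number - duplex_winding_number
    if difference > 0:
        return "overwound"
    if difference < 0:
        return "underwound"
    return "relaxed"


if __name__ == "__main__":
    # Hypothetical closed circle of 5250 base pairs.
    relaxed = relaxed_winding_number(5250)
    for linking_number in (relaxed - 25, relaxed, relaxed + 25):
        print(linking_number, topological_state(linking_number, relaxed))
```

In a closed circle the difference cannot be removed by strand rotation, so a non-zero difference is what is taken up as supercoiling.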
Fig 7b: Watson-Crick base pairing
The physiological significance of supercoiling is that unconstrained DNA is often
biologically inactive. Negative supercoiling is required for many essential processes:
replication, transcription and recombination included. Supercoiled DNA has stored energy,
which drives these reactions. In eukaryotes, which possess linear chromosomes, topological
constraints are introduced by organizing chromatin into loops with ends fixed by scaffold
proteins; nucleosomes introduce negative supercoils into eukaryote DNA.
(e) Structural variants (helical conformers) of DNA
Watson-Crick structure of DNA is referred to as B-DNA (normal form). It is the
biologically important one and exists under physiological conditions. DNA is very flexible
in nature. Due to thermal fluctuation, bending, stretching and unpairing (melting) of strands
can occur. Structural variants of DNA may arise due to three reasons:
• Difference in possible conformation of deoxyribose.
• Rotation about contiguous bonds that make up the phosphodeoxyribose backbone.
• Free rotation about the C-1′-N-glycosyl bond (syn or anti).
The first investigations of DNA secondary structure demonstrated that alternative helical
conformations (conformers) formed at different humidities. Different forms have different
size and shape of grooves. The change in conformation alters the shape of the major and
minor grooves, potentially influencing the nature of protein-DNA interactions, thereby
affecting the regulatory property of DNA. Helical conformations reflect differences in the
various parameters such as gross morphological features, bond angles, base inclination,
displacement of the base pairs from the helical axis resulting from dehydration, helix
parameter as base pairs per helical turn and helical twist. However, the key properties of
DNA in different forms are not changed. The structural variants that have been well
characterized in crystal structures are:
(i) A-DNA
Dehydration favours the ‘A’ form. It is observed in dehydrated DNA fibers by X-ray
diffraction studies, i.e. when the relative humidity is reduced below 75%, and it is favoured
in many solutions that are relatively devoid of water. DNA is not known to adopt this form
under physiological conditions, and there is no evidence for its existence in cells. The
reagents used to promote crystallization of DNA tend to dehydrate it and thus most short
DNA molecules tend to crystallize in the A-form. The A-form is, however, not confined to
dehydrated DNA. Double stranded regions of RNA (as in hairpins) and RNA-DNA hybrids
adopt a double helical conformation very similar to that of A-DNA. The 2′-OH of ribose
prevents RNA from forming a classic Watson-Crick B-helix because of steric hindrance. In
the A-form, O-2′ projects outward, away from other atoms. Under physiological conditions,
duplex RNA and RNA-DNA hybrids are thought to adopt the A-form structure because they
are inherently less flexible than DNA. The A-form of DNA is less soluble than the B-form,
which is why DNA that is overdried during plasmid preparation is difficult to dissolve.
(ii) Z-DNA
Alexander Rich and co-workers discovered Z-DNA in 1979 while solving the crystal structure
of CGCGCG. Z-DNA is adopted by short oligonucleotides that have sequences of alternating
pyrimidines and purines. Early studies of oligonucleotides with alternating purine-pyrimidine
sequences revealed the left-handed helical conformation of Z-DNA. This structure is
characterized by alternating helical parameters and torsion angles with a 2-base pair
periodicity, causing the backbone of the helix to zig-zag (hence the name Z-DNA).
Zig-zagging is thus a consequence of the fact that the repeating unit is a dinucleotide (not a
mononucleotide), especially in a sequence in which pyrimidines alternate with purines, e.g.,
alternating C and G or 5-methyl cytosine and G residues. Although alternating
purine-pyrimidine tracts such as oligo-dGdC and oligo-dAdC provide a good substrate for
Z-DNA, this sequence specificity is now known to be neither necessary nor sufficient for its
formation. Methylation of C-5 of cytosyl residues in alternating CG sequences (e.g.
CGCGCG) facilitates the transition of B-DNA to Z-DNA, because the added hydrophobic
methyl groups stabilize the Z-DNA structure. Z-DNA is formed when the purine residues flip
into the syn conformation while the alternating pyrimidines remain in the anti conformation.
The phosphate groups of the backbone are closer to each other than in the A or B forms,
hence a high salt concentration is required to minimize electrostatic repulsion between the
backbone phosphates. It contains one deep helical groove. The Z-DNA form occurs under
physiological conditions in certain cases only. The biological role of Z-DNA is uncertain;
however, its existence graphically shows that DNA is a flexible, dynamic molecule. Z-DNA
structures tend to form in torsionally stressed DNA and are stabilized by dehydration; they
may play an important role in the control of gene expression. Fig. 8 depicts the common
structural variants of DNA, i.e. A and Z, along with the B form of DNA. The general
characteristics of A, B and Z DNA are summarized in Table 4.
(f) Properties of DNA in solution
(i) Acid-base properties
DNA is strongly acidic. The recurring secondary phosphate groups of DNA, which
constitute the bridges between adjacent mononucleotides, have a rather low pK′ and are fully
ionized at any pH above 4. These phosphate groups are located on the outer periphery of the
double helix, exposed to water. They strongly bind divalent cations such as Mg2+ and Ca2+, as
well as the polycationic amines spermine and spermidine, which are associated with the DNA
in many viruses and bacteria. The binding of the polyamines in the groove of double helical
DNA both stabilizes the DNA molecule and makes it more flexible.
Fig. 8: Common structural variants of DNA (A-DNA, B-DNA and Z-DNA)
Double helical DNA is maximally stable between pH 4.0 and 11.0. Outside these limits,
DNA becomes unstable and unwinds. The stability of the H-bonded base pairs of double
helical DNA is a function of pH, since the H-bonding properties of the different bases
depend on their ionic form, which in turn depends on pH.
(ii) Light absorption
A typical absorption spectrum for DNA at pH 7.0 is represented in Fig. 9. As shown, the DNA
molecule absorbs light energy strongly at 260 nm. This characteristic absorption maximum
is a property of its individual bases, the purines and pyrimidines, and their corresponding
nucleotides. A native intact molecule of DNA absorbs less light energy at 260 nm than
the equivalent free bases, because the bases are packed into the double helix.
(iii) Viscosity
Because of the rigidity of the double helix and the immense length of DNA in relation to its
small diameter, even very dilute DNA solutions are highly viscous at pH 7.0 and room
temperature (25°C). Viscosity measurements are often used to follow the course of
unwinding and denaturation of duplex DNA molecules: viscosity decreases at extremes of pH
and above 80°C, as the two strands separate.
There is another consequence of the immense length of DNA molecules. When they diffuse,
they sweep with them a relatively enormous volume of solution, more than 10000-fold
greater than their own volume. For this reason, DNA shows ideal behavior as a solute only
in extremely dilute solutions.
(iv) Sedimentation behaviour
The sedimentation coefficient and molecular weight of DNA can be determined by
ultracentrifugal methods. Because of the extremely elongated nature of DNA molecule and
the high viscosity of DNA solutions, sedimentation measurements are carried out in a series
of low concentrations of DNA and the sedimentation coefficient extrapolated to zero DNA
concentration. The sedimenting boundary is usually detected by measuring the optical
absorbance at a wavelength of 260 nm, at which DNA strongly absorbs.
Table 4: General characteristics of three major forms of DNA
S. No. | Characteristic | A | B | Z
1 | Conditions | | |
a | Relative humidity | 75% | 92% |
b | Ions required / salt concentration | Na+, K+, Cs+ ions | Low ionic strength | Very high salt concentration
2 | Morphological characteristics | | |
a | Shape | Broadest | Intermediate | Narrowest
b | Helical state | Right | Right | Left
c | Pitch (base pairs per turn) | 11 | 10.5 | 12 (= 6 dimers)
d | Major groove | Deep, narrow | Wide | Flat
e | Minor groove | Broad, shallow | Narrow | Narrow and very deep
f | Helix diameter | ~26 Å | ~20 Å | ~18 Å
3 | Torsional parameters | | |
a | Sugar pucker conformation | C3′ endo | C2′ endo | Alternating
b | Glycosidic bond angle | Anti | Anti | Alternating anti / syn
4 | Helical parameters | | |
a | Displacement | -4.4 | 0.6 | 3.2
b | Twist | 33 | 36 | -49 / -10
c | Helix rise per base pair | 2.6 Å | 3.4 Å | 3.7 Å
d | Helix pitch | 25.30 Å | 35.36 Å | 45.60 Å
e | Base tilt normal to helix axis | 20° | 6° | 7°
f | Inclination | 22 | -2 | -7
g | Rotation per base pair | +32.72° | +34.61° | -60° (per dimer)
Molecular weights of DNA can also be obtained by comparing their rate of sedimentation in
a sucrose density gradient with the rate given by a DNA sample of known size and
sedimentation coefficient.
Equilibrium sedimentation in CsCl gradients is very widely used to determine the buoyant
density of DNA molecules. When a concentrated (8 M) CsCl solution is centrifuged to
equilibrium in a high gravitational field, the CsCl becomes distributed in a linear gradient
down the tube; at the top of 1 cm column the density of the solution is about 1.55 g cm-3 and
at the bottom about 1.8 g cm-3 or 1.8 g ml-1. When DNA is present during formation of
gradient, it concentrates into a stable band at a position at which its buoyant density is
exactly equal to the density of CsCl solution. The density of DNA can be calculated directly
or by comparison with the density of known standard DNA specimen centrifuged in the
same gradient. Single stranded DNA is denser in such a CsCl gradient than double stranded
DNA, which in turn is denser than proteins in general. RNA can be distinguished from
DNA since it is denser than either single stranded or double stranded DNA. Buoyant density
measurements also provide information on the base composition of DNA specimen, because
G + C base pairs, which are joined by three H-bonds, are more compact and dense than A-T
pairs, which are joined by only two H-bonds. The buoyant density of DNA in CsCl gradient
is a linear function of ratio of G-C to A-T pairs. The intact homogeneous DNAs of viruses
give very sharp bands, whereas random heterogeneous DNA fragments from cells of higher
animals give broad bands with a wide density range.
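The linear dependence of buoyant density on base composition can be illustrated with a short Python sketch. The coefficients used here (1.660 g cm-3 for DNA with no G-C pairs and an increase of 0.098 g cm-3 over the full range) are a commonly cited empirical relation for DNA in CsCl and are an assumption not taken from this text; the function names are hypothetical.

```python
def gc_fraction(sequence):
    """Fraction of G and C residues in a DNA sequence (0.0 to 1.0)."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def buoyant_density(gc):
    """Approximate buoyant density in CsCl (g cm-3) for a given G+C fraction.

    Assumed empirical relation: density = 1.660 + 0.098 * (G+C fraction).
    """
    return 1.660 + 0.098 * gc


def gc_from_density(density):
    """Invert the assumed relation to estimate the G+C fraction from a measured density."""
    return (density - 1.660) / 0.098


if __name__ == "__main__":
    for gc in (0.3, 0.5, 0.7):
        print(gc, round(buoyant_density(gc), 3))
```

Read in the inverse direction, the same relation is how a measured band position in the gradient gives an estimate of base composition.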
Fig. 9: The absorption spectrum of a DNA solution at pH 7.0 (absorbance plotted against
wavelength from 200 to 300 nm)
(g) Denaturation
Double helical structure of DNA is maintained due to H-bonding between base pairs and
stacking interactions between successive bases. When either or both sets of forces are
interrupted, the native, double helical structure undergoes transition into a randomly looped
form, denoted as single stranded or denatured DNA. Thus, DNA double helix can be easily
separated by denaturation and rejoined (Fig. 10).
Fig. 10: Denaturation (melting) of DNA, from double helical DNA through partially unwound
(denatured) DNA to separated strands of DNA
Causes of denaturation or factors affecting denaturation
The unwinding and rewinding of DNA occur naturally in vivo during DNA replication and
transcription at regions rich in A-T. In vitro, the following conditions lead to the
denaturation of DNA.
• Extreme pH (titration with acid or alkali): Acidic and alkaline pH values at which the
ionic state of the substituents on the purine and pyrimidine bases changes lead
to denaturation of DNA. In acidic solutions (pH 2.0 to 3.0), at which amino groups
bind protons, the DNA helix is disrupted. Similarly, in alkaline solutions (pH 12),
the enolic hydroxyl groups ionize, thus preventing the keto-amino group H-bonding.
In both cases, ionization of the bases disrupts base pairing.
• Heat (high temperature): Native DNA molecules usually denature within a very
small increment of temperature. The thermal denaturation of DNA is often
designated as melting. The separation of the two strands of DNA upon denaturation is
shown in Fig. 10.
DNA samples from different cell types have characteristically different melting
temperatures (Tm). Tm increases in linear proportion to the content of G-C base pairs, which
have three H-bonds and are thus more stable than A-T pairs. The higher the content of G-C
pairs, the more stable the structure and the more thermal energy is required to disrupt it.
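The linear relation between Tm and G-C content can be sketched in Python. The coefficients below follow the widely quoted Marmur-Doty relation, Tm = 69.3 + 0.41 × (% G+C), which applies to long DNA in a standard saline-citrate buffer; both the relation and the function names are assumptions brought in for illustration and are not given in this text.

```python
def gc_percent(sequence):
    """Percentage of G and C residues in a DNA sequence."""
    sequence = sequence.upper()
    return 100.0 * (sequence.count("G") + sequence.count("C")) / len(sequence)


def melting_temperature_from_gc(percent_gc):
    """Estimated Tm in degrees Celsius from the G+C percentage.

    Assumed empirical relation (long DNA, standard saline-citrate buffer):
    Tm = 69.3 + 0.41 * (% G+C). Other salt conditions shift the intercept.
    """
    return 69.3 + 0.41 * percent_gc


if __name__ == "__main__":
    for percent in (30, 40, 50, 60, 70):
        print(percent, round(melting_temperature_from_gc(percent), 1))
```

The slope shows why DNA samples that differ in base composition melt at characteristically different temperatures.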
Implications of denaturation
Denaturation significantly affects various properties of double stranded DNA. These
include:
• Change in specific optical rotation: Native DNA exhibits a strong positive
rotation. Upon denaturation, the optical rotation decreases markedly and becomes more
negative.
• Change in absorption of ultraviolet light at 260 nm: As mentioned earlier, double
stranded DNA possesses an absorption maximum at 260 nm (Fig. 9). Upon
denaturation of DNA, an increase in light absorption at 260 nm is observed (a physical
change). Compared with free bases, a native intact molecule of DNA absorbs less
light energy at 260 nm because its bases are packed into a double helix. Upon
denaturation, the bases in the single strands are exposed and consequently a denatured
DNA molecule absorbs more light than native DNA. The total light
absorption of fully denatured DNA is nearly equal to that of an equivalent number of
the corresponding free mononucleotides. This increase in absorption of light (up to
40%) occurs even though the amount of DNA remains the same. This phenomenon
is called the hyperchromic effect. Single stranded DNA does not show a hyperchromic
effect, so this phenomenon can be used to distinguish single stranded from double
stranded DNA. Fig. 11a represents the characteristic melting curve of DNA, demonstrating
the hyperchromic effect upon denaturation of double stranded DNA. The temperature at
the midpoint of the melting curve is called the melting temperature, denoted Tm. The
effect of temperature on absorbance at 260 nm and its relationship with strand
separation is evident in Fig. 11b. As the temperature increases, the absorbance also
increases until strand separation is complete, after which it does not increase further.
The percentage increase in light absorption at 260 nm produced by heating a native DNA
sample is directly related to its content of A-T base pairs, the higher the proportion of A-T
base pairs, the greater the increase in light absorption.
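The two quantities read from a melting curve, the percentage hyperchromicity and the midpoint temperature Tm, can be computed as in the following Python sketch. The data points are invented purely for illustration and are not measurements from this text; the function names are hypothetical, and Tm is estimated by linear interpolation at the absorbance halfway between the lowest and highest readings.

```python
def hyperchromicity_percent(a_native, a_denatured):
    """Percentage increase in absorbance at 260 nm on denaturation."""
    return 100.0 * (a_denatured - a_native) / a_native


def melting_temperature(temperatures, absorbances):
    """Estimate Tm as the temperature at the midpoint of the absorbance rise.

    Assumes temperatures are in ascending order and absorbance rises with
    temperature; interpolates linearly between the two bracketing readings.
    """
    half = (min(absorbances) + max(absorbances)) / 2.0
    points = list(zip(temperatures, absorbances))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("midpoint absorbance is not bracketed by the data")


if __name__ == "__main__":
    # Hypothetical melting curve: relative absorbance at 260 nm against temperature (°C).
    temps = [70, 75, 80, 85, 90, 95]
    a260 = [1.00, 1.02, 1.10, 1.30, 1.38, 1.40]
    print(hyperchromicity_percent(a260[0], a260[-1]))
    print(melting_temperature(temps, a260))
```

A sharper transition places the bracketing readings closer together, which is why native DNA, melting over a small temperature increment, gives a well defined Tm.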
Fig. 11: (a) Typical melting curve of DNA (relative absorbance at 260 nm plotted against
temperature in °C; Tm marks the midpoint of the transition from double stranded to single
stranded DNA); (b) Effect of change of temperature on absorbance with respect to strand
separation
Renaturation
When denatured DNA (melted DNA) is incubated at a temperature about 25°C below that at
which denaturation occurs, the two separated strands reassociate or reanneal to form a
duplex DNA molecule. This process is called renaturation. The renaturation of denatured
DNA upon cooling is shown in Fig. 12. Even when a small stretch of DNA is missing from one
strand, the strands reassociate, with the non-complementary (unpaired) sequence bulging out.
Renaturation can be a one- or two-step process.
• One-step process: If denaturation has proceeded only to the first stage, with a few base
pairs still present (i.e. if about 12 or more residues are still united), the unfolded segments
of the two strands will spontaneously rewind or anneal to form an intact duplex on
lowering the temperature or changing the pH. They snap back to their native
conformation, which is the minimum free energy form.
• Two-step process: When the two strands are completely separated upon denaturation,
renaturation is much slower and occurs in a two-step process. The first step, called the
nucleation reaction, is a relatively slow step in which the two strands ‘find’ each other by
random collisions and form a short segment of complementary double helix. The second
step, called the zippering reaction, is faster: the remaining unpaired bases successively
come together to base pair and the two strands ‘zipper’ themselves together to form the
double helix.
(h) Functions of DNA
DNA is the very basis of life and has a five-fold role:
• It carries hereditary characters from parents to offspring.
• It enables the cell to maintain, grow and divide by directing the synthesis of
structural proteins.
• It controls metabolism in the cell by directing the formation of necessary enzymatic
proteins.
• It contributes to the evolution of the organism by undergoing gene mutations
(changes in the sequence of base pairs).
• It brings about differentiation of cells during development. Only certain genes
remain functional in a particular cell. This enables cells having similar genes to
assume different structures and functions.
Fig. 12: Renaturation of denatured DNA
Ribonucleic acid (RNA)
RNA is the only molecule known to have a role both in the storage and transmission of
information and in catalysis. RNA is synthesized from DNA in a process called
transcription. Chemically, RNA is very similar to DNA. The fundamental chemical
differences are:
• The RNA backbone contains ribose rather than the 2′-deoxyribose (i.e. ribose without the
OH group at the 2′-position) present in DNA. This slight difference has a
powerful effect on some properties of the nucleic acid, especially on its stability. Thus,
RNA is readily destroyed by exposure to high pH. Under these conditions, DNA is
stable: although the strands will separate, they remain intact and capable of
renaturation when the pH is lowered again.
• RNA contains uracil instead of thymine. Uracil has the same single-ringed structure as
thymine, except that it lacks the methyl group at the C-5 position. The reason for the use of
uracil in RNA instead of thymine is probably that uracil is energetically less expensive
to produce than thymine. Moreover, in DNA, as uracil is readily produced by chemical
deamination of cytosine, having thymine as the normal base makes detection and
repair of such incipient mutations more efficient. Thus, uracil is appropriate for RNA,
where quantity is important but lifespan is not, whereas thymine is appropriate for DNA,
where maintaining sequence with high fidelity is crucial.
• RNA is mostly single stranded, i.e. a single polynucleotide chain. However, some viruses
have double stranded RNA (dsRNA) as their genetic material.
• A single strand can fold back on itself, giving RNA potentially much greater structural
diversity than DNA.
(A) Structure
(a) Primary structure
RNA is a single stranded, long, unbranched macromolecule consisting of nucleotides joined
by 3′ → 5′ phosphodiester bonds. The number of nucleotides ranges from as few as 75 to
many thousands. In RNA, U replaces T, but since U has a similar chemical structure to T
and forms the same H-bonds with A, it hybridizes according to the general base pairing
rules. Ubiquitous as these interactions are, however, there are alternative base pairing
schemes playing important roles in the secondary and tertiary structures (described below).
(b) Secondary, tertiary and quaternary structures
Despite being single stranded, RNA molecules often exhibit a great deal of double helical
character. This is because the RNA chain frequently folds back on itself to form base paired
segments between short stretches of complementary sequences. In contrast to DNA, whose
secondary structure is characterized by intermolecular base pairing, the secondary structure
of RNA generally reflects intramolecular base interactions. Such secondary
structure formation in RNA by intramolecular normal Watson-Crick base pairing (C:G and
A:U) is shown in Fig. 13a. If the two stretches of complementary sequence are near each
other, the RNA may adopt one of various stem loop structures in which the intervening
RNA is looped out from the end of the double helical segment as in a hairpin, a bulge or a
simple loop (Fig. 13b).
The single strands tend to assume a right-handed helical conformation dominated by base
stacking interactions, which are stronger between two purines than between a purine and
pyrimidine or between two pyrimidines. The purine-purine interaction is so strong that a
pyrimidine separating two purines is often displaced from the stacking pattern so that the
purines can interact. Secondary structures are important in regulation of gene expression.
The 3-D structures of many RNAs, like those of proteins, are complex and unique. Weak
interactions, especially base stacking interactions, play a major role in stabilizing RNA
structures, just as they do in DNA. Where complementary sequences are present, the
predominant double stranded structure is an A-form right handed double helix. The
presence of 2′-OH in the RNA backbone prevents RNA from adopting a B-form helix.
Rather, under physiological conditions, duplex RNA and RNA-DNA hybrids adopt an
A-form structure because they are inherently less flexible than DNA.
Fig. 13: Double helical characteristics of RNA (hairpin double helix) (a)
Intramolecular base pairing forming secondary structure in RNA (b) Formation of
stem and loop / bulge structures (hairpin, bulge and loop) in complementary and
non-complementary regions, respectively
In this A-form helix, the minor groove is wide and shallow and hence accessible, but it
offers little sequence-specific information. Meanwhile, the major groove is so narrow and
deep that it is not very accessible to amino acid side chains from interacting proteins.
Z-form RNA helices have been made in the laboratory (under very high salt or high
temperature conditions). The B-form of RNA has not been observed. Thus, the RNA double
helix is quite distinct from the DNA double helix in its detailed atomic structure and less
well suited for sequence-specific interactions with proteins (although some proteins do bind
to RNA in a sequence-specific manner).
A feature of RNA that adds to its propensity to form double helical structures is an
additional, non Watson-Crick base pair. This is the G : U base pair, which has H-bonds
between N3 of U and carbonyl on C6 of G and between the carbonyl on C2 of U and N1 of
guanine. Since G:U base pairs can occur in addition to the four conventional Watson-Crick
base pairs, RNA chains have an enhanced capacity of self complementarity (Fig. 14). Thus,
RNA frequently exhibits local regions of base pairing but not the long-range, regular
helicity of DNA.
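The effect of the additional G:U pair on self-complementarity can be illustrated with a small Python sketch that pairs the two ends of an RNA chain inward, as in a simple hairpin. This is a deliberately simplified toy, not a structure prediction method; the function names, the example sequence and the minimum loop size of three unpaired residues are all assumptions made for illustration.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(x, y, allow_wobble=True):
    """True if RNA bases x and y can pair (optionally counting G:U pairs)."""
    if (x, y) in WATSON_CRICK:
        return True
    return allow_wobble and (x, y) in WOBBLE


def hairpin_stem_length(rna, min_loop=3, allow_wobble=True):
    """Number of consecutive base pairs formed by pairing the 5' end with the 3' end inward.

    Pairing stops at the first mismatch or when fewer than `min_loop`
    unpaired residues would remain in the loop.
    """
    i, j = 0, len(rna) - 1
    stem = 0
    while j - i - 1 >= min_loop and can_pair(rna[i], rna[j], allow_wobble):
        stem += 1
        i += 1
        j -= 1
    return stem


if __name__ == "__main__":
    rna = "GGCGAAAAUGCC"  # hypothetical sequence
    print(hairpin_stem_length(rna, allow_wobble=False))
    print(hairpin_stem_length(rna, allow_wobble=True))
```

For the example sequence, allowing G:U pairs lengthens the stem by one base pair, which is the sense in which RNA chains have an enhanced capacity for self-complementarity.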
Fig. 14: G:U base pair
Important additional structural contributions are made by H-bonds that are not part of
standard Watson-Crick base pairs, e.g., 2′-OH group of ribose can H-bond with other
groups. Some of these properties are evident in the structure of the tRNAphe of yeast or
ribozymes, whose functions, like those of protein enzymes, depend on their 3-D structures.
RNA secondary structures play a major role in gene expression and its regulation: base
pairing between rRNA and mRNA controls the initiation of protein synthesis, base pairing
between tRNA and mRNA facilitates translation, RNA hairpins and stem loops control
transcriptional termination, translational efficiency and mRNA stability. RNA-RNA base
pairing also plays a major role in the splicing of introns. Like DNA, RNA helical
conformation is modulated by local sequence character, but the relatively high percentage
of modified bases further adds to the variety of structures.
RNA can fold into complex structures involving tertiary interactions between strands, loops
and duplexes. For example, in tRNA there are base triples, sections of triple helix, stem
junctions (where two or more duplex regions are joined) and pseudoknots (where strands
interact with stem loops). This is because RNA has enormous rotational freedom in the
backbone of its non base-paired regions. Tertiary structure frequently involves
unconventional base pairing, such as the base triples and base backbone interactions seen in
tRNA (e.g. U:A:U base triples).
Interaction of RNA with the ribosome, spliceosome or other proteins may be considered as quaternary
structure. Proteins can assist the formation of tertiary structures by large RNA molecules,
such as those found in the ribosome. Proteins shield the negative charges of backbone
phosphates, whose electrostatic repulsive forces would otherwise destabilize the structure.
(B) Types of RNA
RNAs have a broad range of functions and several classes are found in cells. On the basis of
size, function and stability, RNAs are of three main types: mRNA, rRNA and tRNA:
• Messenger RNA (mRNA): These are intermediaries, carrying genetic information from
one or a few genes to a ribosome, where the corresponding proteins can be synthesized.
• Ribosomal RNA (rRNA): These are components of ribosomes, the complexes that
carry out the synthesis of proteins.
• Transfer RNA (tRNA): These are adapter molecules that faithfully translate the
information in mRNA into a specific sequence of amino acids.
The properties and functions of the different types of RNA are described below and also
summarized in Table 5.
(a) Messenger RNA (mRNA)
• The mRNAs are intermediaries, carrying genetic information from DNA for protein
synthesis. The mRNA codes for polypeptide chain(s).
• It is synthesized in the nucleus during the process of transcription. The sequence of
bases of the mRNA strand so formed is complementary to that of the DNA strand being
transcribed. After transcription, mRNA passes into the cytoplasm and then to ribosomes,
where it serves as a template for the sequential ordering of amino acids during the
biosynthesis of proteins. Some mRNA is also produced in mitochondria and chloroplasts.
• The mRNA forms only about 5% of total RNA. Although it makes up a very small part of
the total RNA of the cell, it occurs in many distinctive forms, which vary greatly in
molecular weight and base sequence.
• It is very unstable. The mRNA is degraded by ribonucleases present in all cells.
• The cellular concentration of mRNA generally indicates the level of gene expression.
(i) Prokaryotic mRNA:
• Prokaryotic mRNA is mainly polycistronic (Fig. 15), i.e. a single mRNA molecule
codes for two or more polypeptide chains and contains multiple ORFs. The mRNA
contains a ribosome-binding site (RBS) referred to as the Shine-Dalgarno sequence. It is
complementary to a sequence located near the 3′ end of one of the ribosomal RNA
components, the 16S rRNA. The RBS base pairs with the 16S rRNA, thereby aligning the
ribosome with the start of the coding region. Some ORFs lack an RBS and are translated
by translational coupling, e.g. in the overlapping sequence 5′-AUGA-3′, where the stop
codon of one ORF (UGA) overlaps the start codon (AUG) of the next. The protein
coding region(s) of each mRNA is composed of a contiguous, non-overlapping
string of codons called an open reading frame (ORF). An ORF is a sequence
consisting of triplets that can be translated into amino acids, starting from an initiation
codon and ending with a termination codon. Each ORF specifies a single
polypeptide and starts and ends at internal sites within the mRNA, i.e. the ends of an
ORF are distinct from the ends of the mRNA. Translation starts at the 5′ end of the ORF.
The first codon of an ORF is called the start codon. In bacteria, it is usually 5′-AUG-3′.
Some ORFs instead start with 5′-GUG-3′ or 5′-UUG-3′. The last codon of an ORF, where
translation stops, is the stop codon. The stop codons are UAG, UGA and UAA.
Table 5: Different types of RNAs and their properties and functions
S. No. | Type | Relative amount (%) | Sedimentation coefficient | Molecular weight | Number of nucleotides | Stability | Function
1 | mRNA | 5 | Heterogeneous | | 400-4000 (mammal) | Unstable (in prokaryotes half-life is a few seconds to 2 min; in eukaryotes half-life is a few hours to one day) | Carry genetic information from DNA for assembly of amino acids on ribosomes for protein synthesis
2 | tRNA | 15 | 4S | 2.5 x 10^4 | 73-93 | Quite stable in prokaryotes; somewhat less stable in eukaryotes | Act as specific carrier of activated amino acids to specific sites on protein synthesizing templates
3 | rRNA (eukaryote) | 80 | 28S | 1.5 x 10^6 | 4700 | Most stable form of RNA | Ribosomal assembly; provide specific sequence to which mRNA binds
  |  |  | 18S | 7.8 x 10^5 | 1900 |  |
  |  |  | 5.8S | 4.5 x 10^4 | 160 |  |
  |  |  | 5S | 3.5 x 10^4 | 120 |  |
4 | rRNA (prokaryote) | 80 | 23S | 1.2 x 10^6 | 2900 | Most stable form of RNA | Ribosomal assembly; provide specific sequence to which mRNA binds
  |  |  | 16S | 0.55 x 10^6 | 1540 |  |
  |  |  | 5S | 3.6 x 10^4 | 120 |  |
Fig. 15: Prokaryotic polycistronic message
• In prokaryotes, the half-life of mRNA is a few seconds to 2 minutes.
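The description of open reading frames given above for prokaryotic mRNA (a start codon, a run of triplets and a stop codon, with several ORFs possible on one polycistronic message) can be sketched in Python as follows. The function name and the example message are hypothetical, and the scan is a simplification: it ignores the ribosome-binding site and reports every start-to-stop run it finds in each of the three reading frames.

```python
START_CODONS = {"AUG", "GUG", "UUG"}  # AUG is the usual bacterial start codon
STOP_CODONS = {"UAG", "UGA", "UAA"}


def find_orfs(mrna):
    """Return (start, end) index pairs of ORFs in an mRNA written 5'->3'.

    `start` is the index of the first base of the start codon and `end` is
    the index just past the stop codon, so mrna[start:end] is the whole ORF.
    """
    orfs = []
    for frame in range(3):
        pos = frame
        while pos + 3 <= len(mrna):
            if mrna[pos:pos + 3] in START_CODONS:
                end = pos + 3
                # Read triplets until a stop codon or the end of the message.
                while end + 3 <= len(mrna) and mrna[end:end + 3] not in STOP_CODONS:
                    end += 3
                if end + 3 <= len(mrna):  # a stop codon was found in frame
                    orfs.append((pos, end + 3))
                    pos = end + 3
                    continue
            pos += 3
    return sorted(orfs)


if __name__ == "__main__":
    message = "CCAUGAAACCCUAAGGAUGGGGUGACC"  # hypothetical polycistronic message
    for start, end in find_orfs(message):
        print(start, end, message[start:end])
```

The example message carries two ORFs in different reading frames, each beginning and ending at internal sites, which mirrors the statement that the ends of an ORF are distinct from the ends of the mRNA.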
(ii) Eukaryotic mRNA
• Eukaryotic mRNA is monocistronic (Fig. 16), i.e. a single mRNA codes for a single
polypeptide chain and contains a single ORF. There is no RBS. The start codon is AUG.
Fig. 16: Eukaryotic monocistronic message
• In eukaryotes, the half-life of mRNA is a few hours to one day.
• The primary transcript for a eukaryotic mRNA typically contains sequences
encompassing one gene, although the sequences encoding the polypeptide may not
be contiguous. Non-coding tracts that break up the coding region of the transcript
are called introns and the coding segments are called exons. In a process
called splicing, the introns are removed from the primary transcript and the exons
are joined to form a continuous sequence that specifies a functional polypeptide.
• The mRNAs are transcribed as large transcripts (pre-mRNA) from DNA. The
pre-mRNA has the same organization as the gene. The primary transcript (also called
heterogeneous nuclear RNA; hnRNA) is much larger than mRNA, very unstable and
has much greater sequence complexity. The primary transcript undergoes splicing to
form mature mRNA, which is 10-100 times smaller than the primary transcript.
• Most eukaryotic mRNAs have a 5′ cap, a residue of 7-methyl guanosine [modified G
base (m7G)] linked to the 5′ terminal residue of the mRNA through an unusual
5′ → 5′ triphosphate linkage. The cap is added in reverse polarity (5′ to 5′), thus
acting as a barrier to 5′ exonuclease attack, but it also promotes splicing, transport
and translation. Caps contribute to the stability of mRNAs by protecting their 5′ ends
from phosphatases and nucleases. Thus, the cap has the following functions:
* It protects mRNA from ribonucleases.
* It promotes splicing.
* It binds to a specific cap-binding complex of proteins and participates in
binding of mRNA to the ribosome to initiate translation (i.e. it helps in recruitment
of the ribosome to mRNA, or recognition of mRNA by the translational machinery).
* It increases the efficiency of translation.
• In eukaryotic mRNAs a poly (A) tail is present at the extreme 3′ end of the mRNA. It is
80-250 A residues long. It is added enzymatically by poly (A) polymerase. Functions of
the poly (A) tail are:
* Contributes to efficient translation.
* Serves as a binding site for one or more specific proteins.
* Apparently plays a role in the processing or transport of mRNA from the nucleus to
the ribosome.
* Enhances the level of translation of mRNA by promoting efficient recycling of
ribosomes.
* Probably helps protect mRNA from enzymic destruction.
* Many prokaryotic mRNAs also acquire poly (A) tails, but these tails stimulate
decay of mRNA rather than protecting it from degradation.
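The processing steps described in this subsection (splicing of introns, 5′ capping and 3′ polyadenylation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name process_pre_mrna, the exon coordinate convention and the example transcript are hypothetical, and the cap is represented simply by the label "m7G" rather than by any chemical model.

```python
def process_pre_mrna(pre_mrna, exons, poly_a_length=200):
    """Sketch of eukaryotic mRNA maturation.

    pre_mrna      : primary transcript, written 5'->3'
    exons         : ordered list of (start, end) positions, 0-based, end exclusive
    poly_a_length : number of A residues to add (80-250 according to the text)
    """
    if not 80 <= poly_a_length <= 250:
        raise ValueError("poly(A) tail is expected to be 80-250 residues long")
    # Splicing: drop the introns and join the exons in their original order.
    spliced = "".join(pre_mrna[start:end] for start, end in exons)
    return {
        "cap": "m7G",                   # 7-methylguanosine joined 5'-5' to the first residue
        "body": spliced,                # continuous sequence left after splicing
        "poly_a": "A" * poly_a_length,  # tail added by poly(A) polymerase
    }

# Hypothetical transcript: exon 1 (AUGGCC), one intron, exon 2 (UUUUAA)
pre = "AUGGCC" + "GUAAGUCCAG" + "UUUUAA"
mature = process_pre_mrna(pre, exons=[(0, 6), (16, 22)])
print(mature["body"])
```

Keeping the cap, body and tail as separate fields makes it easy to see that only the body is derived from the gene sequence, while the cap and the tail are added after transcription.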
(b) Ribosomal RNA (rRNA)
• These are components of ribosomes and hence the name.
• These constitute 80% of total RNA.
• They represent 40-60% of the total weight of the ribosome.
• They are the most stable form of RNA.
• The rRNAs function in ribosome assembly along with proteins. However, the rRNAs
are not simply structural components of the ribosome. Rather, they are directly
responsible for the key functions of the ribosome. The 16S rRNA contains a specific
pyrimidine-rich sequence at its 3′ end that is complementary to the purine-rich
Shine-Dalgarno (SD) sequence (a subset of AGGAGG) near the 5′ end of mRNA, and thus
helps in binding to mRNA during translation. The rRNA plays a central role in the
function of the small subunit of the ribosome.
• The rRNAs are transcribed as large transcripts from DNA.
• A few of the bases in rRNA are methylated.
• The rRNA from all sources has a G:C content of more than 50%. The rRNA molecule
appears as a single unbranched polynucleotide strand (primary structure). At low ionic
strength, the molecule appears as a compact rod with random coiling. At high ionic
strength, the molecule reveals the presence of compact helical regions with
complementary base pairing and looped outer regions (secondary structure). The double
helical structure can form within a single RNA molecule or between two separate RNA
molecules. RNAs can often assume even more complex shapes, as in bacteria.
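The base pairing between the 3′ end of 16S rRNA and the Shine-Dalgarno sequence can be made concrete with a small Python sketch. It assumes AGGAGG as the SD consensus on the mRNA, so that the complementary pyrimidine-rich stretch on the rRNA is its reverse complement, and it looks for the longest part of the consensus present in a message. The names reverse_complement and best_sd_match, the min_len threshold and the test sequences are illustrative assumptions.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence (5'->3') of the strand that pairs antiparallel with rna."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

SD_CONSENSUS = "AGGAGG"                      # purine-rich, on the mRNA
ANTI_SD = reverse_complement(SD_CONSENSUS)   # pyrimidine-rich, on the 16S rRNA

def best_sd_match(mrna, min_len=4):
    """Return (position, motif) for the longest part of the SD consensus
    found in mrna, or None if nothing of at least min_len bases matches."""
    for length in range(len(SD_CONSENSUS), min_len - 1, -1):
        hits = []
        for start in range(len(SD_CONSENSUS) - length + 1):
            motif = SD_CONSENSUS[start:start + length]
            pos = mrna.find(motif)
            if pos != -1:
                hits.append((pos, motif))
        if hits:
            return min(hits)                 # earliest hit among the longest ones
    return None

print(ANTI_SD)
print(best_sd_match("UUAGGAGGUAACAUAUGAAA"))
```

Searching for sub-stretches of the consensus, rather than only the full hexamer, reflects the statement that natural SD sequences are a subset of AGGAGG.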
(i) Prokaryotic rRNA
• In E. coli cells, rRNAs occur as linear, single-stranded molecules that appear in three
characteristic forms with different sedimentation coefficients. These are 23S, 16S and
5S. They are transcribed as a single pre-rRNA transcript (Fig. 17). These three forms
differ in base ratios and sequences.
Fig. 17: Pre-rRNA transcript in prokaryotes (30S) (~6500 nt)
• The rRNAs function in ribosome assembly along with proteins. In prokaryotes, 23S
and 5S rRNAs form part of the large (50S) ribosomal subunit, while the 16S rRNA
forms part of the small (30S) subunit. The constitution of the ribosomal subunits is
shown in Fig. 18. The rRNA also plays a central role in the function of both
subunits of the ribosome. The anticodon loops of charged tRNAs and the codons of
mRNA contact the 16S rRNA, not the ribosomal proteins of the small subunit. The 23S
rRNA plays a crucial role in the transpeptidase (peptidyl transferase) reaction during
translation.
Fig. 18: rRNA as constituents of prokaryotic ribosomes
(ii) Eukaryotic rRNA
• On the basis of sedimentation coefficient there are four types of rRNAs in
eukaryotes: 28S, 18S, 5.8S and 5S. The rRNAs are transcribed as large transcripts from
DNA. The 28S, 18S and 5.8S rRNAs are transcribed as a single pre-rRNA
transcript, while 5S rRNA is synthesized separately (Fig. 19). Pre-rRNA
transcription units are arranged in clusters in the genome as long tandem arrays
separated by non-transcribed spacer sequences. The arrays of rRNA genes loop
together to form the nucleolus and are known as nucleolar organizer regions.
Each rRNA gene produces a 45S rRNA transcript called the pretranscript,
preribosomal RNA or pre-rRNA, which is ~13000 nucleotides long. The 45S
pretranscript is processed in the nucleolus to give one copy each of the 28S, 18S and
5.8S rRNAs, which are about 5000, 2000 and 160 nucleotides long respectively. The
genes for 5S rRNA are organized in a tandem gene cluster. This is the only rRNA
to be transcribed separately.
Fig. 19: Pre-rRNA transcript in eukaryotes (45S) (~13000 nt)
• The rRNAs function in ribosome assembly along with proteins. In eukaryotes, 28S,
5S and 5.8S rRNAs are present in the large (60S) subunit of the ribosome, while 18S rRNA
is present in the small (40S) subunit of the ribosome (Fig. 20).
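The precursor and product lengths quoted above allow a rough estimate of how much of each pre-rRNA survives processing. The Python sketch below uses the approximate nucleotide counts given in this section (a ~6500 nt precursor yielding 23S, 16S and 5S rRNA in prokaryotes; a ~13000 nt precursor yielding 28S, 18S and 5.8S rRNA in eukaryotes). The dictionary layout and the function name retained_fraction are illustrative assumptions, and the result is only as precise as the rounded lengths.

```python
# Approximate lengths in nucleotides, as quoted in this section.
PRE_RRNA = {
    "prokaryote (30S precursor)": {
        "precursor_nt": 6500,
        "products": {"23S": 2900, "16S": 1540, "5S": 120},
    },
    "eukaryote (45S precursor)": {
        "precursor_nt": 13000,
        "products": {"28S": 5000, "18S": 2000, "5.8S": 160},
    },
}

def retained_fraction(entry):
    """Fraction of the precursor that ends up in mature rRNAs."""
    return sum(entry["products"].values()) / entry["precursor_nt"]

for name, entry in PRE_RRNA.items():
    print(f"{name}: {retained_fraction(entry):.0%} of the precursor is retained")
```

With these figures roughly 70% of the prokaryotic precursor and roughly 55% of the eukaryotic precursor is kept in mature rRNA; the remainder corresponds to sequences removed during processing.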
(c) Transfer RNA (tRNA)
• Transfer RNA serves as an adapter molecule in translating the language of nucleic
acids in mRNA into the language of proteins, by carrying specific
amino acids to specific sites on the protein-synthesizing template, i.e. the ribosome.
The tRNAs, covalently linked to an amino acid at one end, pair with the mRNA in such a
way that amino acids are joined to a growing polypeptide in the correct sequence.
• Each tRNA is specific for an amino acid, i.e. it can bind or accept only that
particular amino acid.
Fig. 20: rRNA as constituents of eukaryotic ribosomes
• The tRNA contributes 15% of total RNA.
• The tRNA molecules remain dissolved in solution after centrifuging a broken cell
suspension at 100,000 x g for several hours, hence they are also called soluble RNA.
• The molecular weight of tRNAs ranges from 24000 to 31000 (2.4 x 10^4 to 3.1 x 10^4).
• The sedimentation coefficient of tRNA is ~4S.
• The base sequence of a tRNA molecule was first determined by Robert Holley in
1965. His study of yeast alanine tRNA (tRNA^Ala) provided the first complete
sequence of any nucleic acid.
• tRNAs are 73-93 nucleotides long.
• The conventional numbering of nucleotides begins at the 5′ end and proceeds
toward the 3′ end.
• The 5′ terminus is phosphorylated (pG) whereas the 3′ terminus has a free OH group.
• There is more than one specific tRNA for most amino acids [5 for Leucine, 5 for Serine,
4 for Glycine, 4 for Lysine].
• There are no tRNAs for hydroxyproline (Hyp) and cystine.
• A striking feature is the presence of modified bases or minor bases, introduced by
enzymes that recognize target bases in the tRNA structure. About 7-15 bases are modified
(the modification can be methylation of A, G, C or U, or the presence of the modified
base pseudouridine). Modifications of pyrimidines are less complex than those of purines.
In tRNA, there is a vast range of modifications, ranging from simple methylation to
wholesale restructuring of the purine ring.
• tRNA has a high content (up to 25%) of unusual bases other than A,
U, G and C. These include post-transcriptionally modified or hypermodified bases. Nearly
80 such bases, found at >60 different tRNA positions, have been characterized. A few of
these minor or modified bases in tRNA are listed in Table 6.
• In addition to the modifications of the bases themselves, methylation at the 2′-O
position of the ribose ring also occurs. Purpose of methylation / modification:
* Methylated bases do not form base pairs and become accessible for other
interactions (this disallows unwanted base pairing with mRNA).
* Methylation provides hydrophobic character to some portions, which is important
for their interaction with synthetases and ribosomal proteins.
* Unusual bases provide stability and protect from hydrolytic attack by nucleases.
• Codon-anticodon recognition involves wobbling at the first position of the
anticodon (third position of the codon), which allows some tRNAs to recognize
multiple codons. The wobble base is less specific in its interaction with its
corresponding base in the codon than the other two bases. Wobbling also allows easy
release of the tRNA once an amino acid has been added.
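The antiparallel geometry of codon-anticodon recognition, with relaxed pairing at the wobble position, can be expressed as a small Python check. The wobble table used here is the commonly stated set of wobble rules (G may read C or U, U may read A or G, inosine may read U, C or A); it is supplied as an assumption because the text describes wobbling only qualitatively. The function name can_pair is hypothetical.

```python
# Strict Watson-Crick partners for the first two codon positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed wobble rules: first (5') anticodon base -> third (3') codon bases it may read.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def can_pair(codon, anticodon):
    """True if the anticodon (5'->3') can read the codon (5'->3').

    The strands pair antiparallel: codon position 1 faces anticodon
    position 3, and codon position 3 faces the wobble base at
    anticodon position 1.
    """
    return (
        codon[0] == WATSON_CRICK[anticodon[2]]
        and codon[1] == WATSON_CRICK[anticodon[1]]
        and codon[2] in WOBBLE[anticodon[0]]
    )

# A tRNA with anticodon 5'-GAA-3' reads both UUC and UUU, but not UUA.
print(can_pair("UUC", "GAA"), can_pair("UUU", "GAA"), can_pair("UUA", "GAA"))
```

Only the third codon position is tested against the relaxed table, which is how a single tRNA can serve more than one codon.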
Table 6: Examples of some minor or modified bases in tRNA and their standard
abbreviations
5,6-Dihydrouridine (D or hU or UH2 or DHU)
Pseudouridine (ψ)
Ribothymidine (T)
1-methyl guanosine (m1G)
1-methyl adenosine (m1A)
Inosine (I)
1-methyl inosine (m1I)
N2,N2-Dimethyl guanosine (m2,2G)
N6-isopentenyl adenosine (i6A)
N7-methylguanosine (m7G)
3-methylcytidine (m3C)
4-Thiouridine (s4U)
2-Thiouridine (s2U)
N4-Acetylcytidine (ac4C)
Lysidine (L)
* Queuosine (Q-base)
** Wyosine (Wyo; Y-base)
* Queuosine or Q-base: Pentenyl ring at methyl group of 7-methylguanosine.
** Wyosine or Y-base: Additional ring fused with the purine ring itself. The extra ring
carries a long C chain, to which further groups are added in different cases.
• Each amino acid is recognized by a particular aminoacyl-tRNA synthetase, which also
recognizes all of the tRNAs for that amino acid.
• tRNAs are derived from longer RNA precursors by enzymatic removal of nucleotides
from the 5′ and 3′ ends. Where two or more different tRNAs are contained in a single
primary transcript, they are separated by enzymatic cleavage. The endonuclease RNase P,
found in all organisms, removes RNA at the 5′ end of tRNA. This enzyme contains both
protein and RNA and is an example of catalytic RNA. The 3′ end of the tRNA is
processed by one or more nucleases, including the exonuclease RNase D. In eukaryotes,
introns are present in a few tRNA transcripts and must be excised.
• As the function of tRNA is to bind specific amino acids, one might think that there
are 20 types of tRNAs. Since the code is degenerate (i.e. there is more than one codon
for an amino acid), there may also be more than one tRNA for a specific amino acid. In
fact, their total number far exceeds 20. In a bacterial cell there are more than 70
tRNAs, and in a eukaryotic cell this number is even greater, because there are tRNAs
specific to mitochondria and chloroplasts (which usually differ from the corresponding
cytoplasmic tRNAs). Eukaryotic cells have multiple copies of many of the tRNA genes.
Therefore, there are generally several tRNAs specific for the same amino acid
(sometimes up to 4 or 5); they are called isoacceptor tRNAs. These various tRNAs,
capable of binding the same amino acid, differ in their nucleotide sequence; they can
either have the same anticodon and therefore recognize the same codon, or have different
anticodons and thus permit the incorporation of the amino acid in response to multiple
codons specifying the same amino acid.
• Secondary structure: When drawn in two dimensions, the secondary structure of tRNA
resembles a clover leaf (Fig. 21).
Fig. 21: Clover leaf secondary structure of tRNA
In tRNA, ~50% of the bases are paired, forming 4 arms with three loops. Longer tRNAs
have a short fifth extra arm of variable length. These arms act as recognition sites. These
are:
* Amino acid attachment site (3′-OH, 5′-pG)
* Anticodon arm (-Py-Py-X-Y-Z-Pu-N-)
* DHU arm
* TψC arm
* Variable extra arm (3-21 bases)
The common features of secondary structure of tRNA are:
• Acceptor or amino acid stem: A 7 bp stem that includes the 5′ terminal nucleotide
and that may contain non-Watson-Crick base pairs such as G:U. It is known as the
acceptor or amino acid stem because the amino acid residue carried by the
tRNA is appended to its 3′ terminal OH group. The 5′ end has a terminal G residue, which
is phosphorylated. The 3′ end has the sequence CCA with a free 3′-OH
group. The amino acid attachment site is the ribose of the adenosine residue at the 3′
terminus of the molecule: the amino acid arm carries a specific amino acid esterified
through its carboxyl group to the 2′ or 3′ hydroxyl group of this ‘A’ residue. The
amino acid residue is enzymatically transferred to the end of the growing polypeptide
chain on the surface of the ribosome during protein synthesis.
• Anticodon arm: Just opposite the amino acid arm is a 5 bp stem ending in a loop,
which contains the anticodon (a sequence of three bases complementary to the three-base
codon sequence in mRNA). The anticodon forms H-bonds with the complementary bases
in mRNA attached to a ribosome. The anticodon loop contains 7 bases with the sequence
5′-Py-Py-X-Y-Z-modified Pu-N-3′, where Py is any pyrimidine, Pu is any purine, N
is any base and X, Y, Z signify the anticodon complementary to the codon of mRNA. This
arm helps in selection and positioning of the correct amino acid for transfer to the
growing polypeptide chain. A modification to Watson-Crick base pairs are the Wobble pairs,
which allow the base in the 5′ anticodon position of tRNA to pair ambiguously with
mRNA, i.e. the Wobble base is less specific in its interaction with its corresponding
base in the codon than the other two bases. Wobble base pairs are formed because the
bases are offset from their normal Watson-Crick positions and one of the H-bonds is lost.
Wobble pairs are thus formed between the first base of the anticodon of tRNA and the last
base of the codon of mRNA. In the anticodon loop, the base at the 3′ end of the anticodon
is a purine or pyrimidine derivative. Some tRNAs, particularly those of plants, contain an
isopentenyl derivative of purine. These characteristic bases apparently serve as
‘stoppers’ to demarcate the anticodon. Thiacanthine [6-(3-methyl but-2-enylamino)
purine] is one of the minor purines found next to the anticodon. This compound is a
cytokinin, a plant hormone. The predictions of Wobble pairing accord very well
with the observed abilities of almost all tRNAs. But there are exceptions in which
the codons recognized by a tRNA differ from those predicted by the Wobble rules.
Such effects probably result from the influence of neighbouring bases and / or the
conformation of the anticodon loop in the overall tertiary structure.
• D arm or DHU arm: It is a 3 or 4 bp stem ending in a loop and contains 2 to 3
residues of the unusual nucleotide 5,6-dihydrouridine (DHU). The length of the D loop
varies from 5 to 7 nucleotides depending on the tRNA.
• T arm or TψC arm or Ribothymidine-Pseudouridine-Cytidine arm: It is a 5 bp
stem ending in a loop. It contains ribothymidine (rT), not usually present in RNA,
and pseudouridine (ψ), which has an unusual C-C bond between the base and ribose.
• Extra arm (variable arm): This is the site of greatest variability. It is located
between the anticodon loop and the TψC loop. It has from 3 to 21 nucleotides and may
have a stem consisting of up to 7 bp.
• The DHU and TψC arms contribute important interactions for the overall folding of
tRNA molecules, and the TψC arm interacts with the large subunit rRNA.
• The length (distance from the CCA end to the anticodon site) is constant. X-ray analysis
of crystals of tRNA shows that the molecule is asymmetrically folded to yield a compact
structure about 9 nm long and 2.5 nm thick. Extra bases in longer molecules are
accommodated in the extra arm or DHU arm.
• There are 15 invariant positions in the loop regions of all tRNAs, which always have
the same base, and 8 semi-invariant positions, which have only a purine or only a
pyrimidine base.
• Each tRNA must have at least two recognition sites: one for the activated
amino acid-enzyme complex with which it must react to form the aminoacyl-tRNA,
and another for the site on an mRNA molecule which contains the codon for that
particular amino acid. The former involves recognition between bases and amino acid
residues (either of the activated amino acid or of a site on the enzyme molecule) whereas
the latter involves recognition of bases by bases (H-bonding).
• Some unusual base pairing patterns in tRNA: G:U base pairs are common in
RNA duplex structures. In stable codon-anticodon contact, G:U pairs can
contribute only at the last position of the codon. Codon-anticodon pairing involves
wobbling at the third position of the codon. The most direct effect of modification is
seen in the anticodon, where a change of sequence influences the ability to pair with the
codon, thus determining the meaning of the tRNA. Modifications elsewhere in the vicinity
of the anticodon also influence its pairing. Where bases in the anticodon are modified,
further pairing patterns become possible in addition to those predicted by the regular
and Wobble pairing involving A, C, U and G. G:U base pairs enhance the capacity
for self-complementarity. Inosine (I) is often present at the first position of the
anticodon. Inosine can pair with any one of three bases, U, C and A, but not with G. This
ability is especially important for the Ile codons, where AUA codes for Ile while AUG
codes for Met. Because with the usual bases it is not possible to recognize ‘A’ alone
in the third position, any tRNA with U starting its anticodon would have to
recognize AUG as well as AUA. So AUA must be read together with AUU and
AUC, a problem that is solved by the existence of a tRNA with inosine in the
anticodon. 4-Thiouracil base pairs only with A. Queuosine is a modified G base;
such modified G bases continue to recognize both C and U, but pair with U more
readily.
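The isoleucine argument above can be checked mechanically. The Python sketch below builds the standard genetic code, applies the commonly stated wobble rules (an assumption, since the text gives them only in words) and lists which codons, and therefore which amino acids, a given anticodon would decode. The names GENETIC_CODE and decoded are illustrative; amino acids are given as one-letter symbols (I = Ile, M = Met).

```python
BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acids, '*' = stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Assumed wobble rules: first anticodon base -> third codon bases it may read.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def decoded(anticodon):
    """Map every codon read by the anticodon (5'->3') to its amino acid."""
    first, second, third = anticodon
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]   # codon positions 1 and 2
    return {prefix + base: GENETIC_CODE[prefix + base] for base in WOBBLE[first]}

print(decoded("UAU"))   # anticodon starting with U
print(decoded("IAU"))   # anticodon starting with inosine
```

Under these rules an anticodon beginning with U would read both AUA (Ile) and AUG (Met), whereas an anticodon beginning with inosine reads AUU, AUC and AUA, all of which specify Ile.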
• Tertiary structure
* The tertiary structure of tRNA was described by Alexander Rich and Aaron Klug in
the 1970s and was found to be a twisted L shape (Fig. 22).
Fig. 22: 3-D Tertiary (L-shaped) structure of tRNA
* There are two segments of double helix. They are like A-DNA, as expected for an
RNA duplex. Each of these helices contains about 10 base pairs, which corresponds
to one turn of the helix. These helical segments are perpendicular to each other,
giving the molecule its L-shape. Most of the bases in the non-helical regions
participate in unusual H-bonding interactions. These tertiary interactions are
between bases that are not usually complementary (e.g. GG, AA and AC). Moreover,
the ribose-phosphate backbone interacts with some bases and even with another
region of the backbone itself. The 2′-OH groups of ribose units act as H-bond
donors or acceptors in many of these interactions. In addition, most bases are
stacked. These hydrophobic interactions between adjacent aromatic rings play a
major role in stabilizing the architecture of the molecule.
* The amino acid (or amino acid acceptor) arm and the TψC arm form a continuous
double helix, and the anticodon (AC) arm and the DHU arm form the other, partially
continuous double helix. The two helical columns meet to form a twisted L-shaped
molecule.
* The CCA terminus and the adjacent helical region do not interact strongly with the
rest of the molecule. This part of the molecule may change conformation during
amino acid activation and also during protein synthesis on the ribosome.
* The acceptor stem and the stem of the TψC loop form an extended helix in the final
tRNA structure. Similarly, the anticodon stem and the stem of the D-loop form a
second extended helix. These two extended helices align at a right angle to each
other, with the D-loop and the TψC loop coming together. In the final stage, the two
extended helices adopt their proper helical configuration. These structures reveal
that base stacking plays a major role in RNA conformation; for example, 72 out of
the 76 bases in tRNA are involved in stacking interactions. As in the DNA double
helix structure, stacking of RNA bases on top of one another is energetically
favourable. For this reason, short base-paired helical regions of RNA stack on top of
one another to form longer, discontinuous helical regions. These regions of stacked
helices then pack against each other via additional tertiary interactions.
* Four kinds of interactions stabilize the twisted L-shaped structure:
  * Base triples: The first stabilizing interactions are H-bonds between
  bases in different helical regions that are brought near each other in 3-D space by
  the tertiary structure. These are generally unconventional (non-Watson-Crick)
  bonds. Bonding of this type includes Hoogsteen base pairing.
  * Base-backbone interactions: The second stabilizing interactions are the
  interactions between the bases and the sugar-phosphate backbone.
  * Base stacking: The third kind of stabilizing interaction is the additional base
  stacking gained from formation of the two extended regions of base pairing.
  * Action of the 2′-OH of ribose: The presence of the 2′-OH in the RNA backbone
  prevents RNA from adopting a B-form helix.
(d) Heterogeneous nuclear RNA (hnRNA)
It comprises transcripts of nuclear genes made by RNA polymerase. It has a wide size
distribution and low stability. In mammalian cells, including those of human beings, a
precursor RNA is first synthesized in the nucleoplasm by DNA-dependent RNA
polymerase. This precursor is then cleaved by nuclear nucleases to mRNA, which is then
translocated to the cytoplasm where it becomes associated with the ribosomal system. This
precursor RNA constitutes the fourth class of RNA molecules and is designated as
heterogeneous nuclear RNA (hnRNA). The hnRNA molecules may have molecular weights
exceeding 10^7 D whereas the mRNA molecules are generally smaller than 2 x 10^6 D. Most
mammalian mRNA molecules are 400-4000 nucleotides in length whereas an hnRNA molecule
possesses 5000-50000 nucleotides. Some uncertainty still exists concerning the
precursor-product relationship between hnRNA and mRNA, the former being 10-100 times
longer than the latter. Thus, the hnRNA molecules appear to be processed to generate the
mRNA templates for protein synthesis.
Eukaryotes contain a vast majority of interrupted genes. Genes vary widely according to the
number and lengths of introns, but the typical mammalian gene has 7-8 exons spread out
over ~16 kb. The exons are relatively short (~100-200 bp) and the introns are relatively long
(>1 kb). The discrepancy between the interrupted organization of the gene and the
uninterrupted organization of its mRNA requires processing of the primary transcription
products. The primary transcript has the same organization as the gene and is sometimes
called pre-mRNA. Removal of introns from pre-mRNA leaves a typical messenger of ~2.2 kb.
The average size of hnRNA is much larger than that of mRNA; it is very unstable and has a
much greater sequence complexity. It was called hnRNA because of its broad size
distribution. It includes pre-mRNA but could also include other transcripts.
The physical form of hnRNA is a ribonucleoprotein particle (hnRNP) in which the hnRNA
is bound by proteins. As characterized in vitro, an hnRNP particle takes the form of beads
connected by a fiber. The hnRNP is organized in 40S particles. The most abundant proteins
in the particle are core proteins, but other proteins are present at lower stoichiometry,
making a total of ~20 proteins. The proteins are typically present at ~10^8 copies per
nucleus, compared with ~10^6 molecules of hnRNA. Some of the proteins may have a
structural role in packaging the hnRNA; several are known to shuttle between the nucleus
and cytoplasm and play roles in exporting the RNA or otherwise controlling its activity.