©1994 Oxford University Press
Nucleic Acids Research, 1994, Vol. 22, No. 8
1327-1334
Mutations to nonsense codons in human genetic disease:
implications for gene therapy by nonsense suppressor
tRNAs
Jennifer Atkinson and Robin Martin*
Krebs Institute for Biomolecular Research, The University of Sheffield, PO Box 594, Firth Court,
Western Bank, Sheffield S10 2UH, UK
Received January 28, 1994; Revised and Accepted March 7, 1994
ABSTRACT
Nonsense suppressor tRNAs have been suggested as
potential agents for human somatic gene therapy.
Recent work from this laboratory has described
significant effects of 3' codon context on the efficiency
of human nonsense suppressors. A rapid increase in
the number of reports of human diseases caused by
nonsense codons prompted us to determine how the
spectrum of mutation to UAG, UAA or UGA
codons, and their respective 3' contexts, might affect
the efficiency of human suppressor tRNAs employed
for purposes of gene therapy. This paper presents a
survey of 179 events of mutations to nonsense codons
which cause human germline or somatic disease. The
analysis revealed a ratio of approximately 1:2:3 for
mutation to UAA, UAG and UGA respectively. This
pattern is similar, but not identical, to that of naturally
occurring stop codons. The 3' contexts of new
mutations to stop were also analysed. Once again, the
pattern was similar to the contexts surrounding natural
termination signals. These results imply there will be
little difference in the sensitivity of nonsense mutations
and natural stop codons to suppression by nonsense
suppressor tRNAs. Analysis of the codons altered by
nonsense mutations suggests that efforts to design
human UAG suppressor tRNAs charged with Trp, Gln
and Glu, UAA suppressors charged with Gln and Glu,
and UGA suppressors which insert Arg, would be an
essential step in the development of suppressor tRNAs
as agents of human somatic gene therapy.
*To whom correspondence should be addressed
INTRODUCTION
Nonsense mutations cause the premature termination of protein
synthesis, since in the normal course of translation there are no
aminoacyl-tRNAs whose anticodons match the UAG, UAA or
UGA nonsense codons. Nonsense suppressors can be created,
however, by mutating a tRNA so that it is able
to match one of the termination signals. A proportion of full
length gene product is then produced. In 1982, Y. W. Kan and
colleagues published a paper in Nature reporting the construction
of a human nonsense suppressor tRNA and the successful in vitro
suppression of a UAG mutation at codon 17 of the β-globin gene
(1). The mRNA containing the nonsense mutation was obtained
from a patient suffering from β⁰ thalassemia and it was suggested
that nonsense suppression might one day prove to be a useful
technique for the somatic gene therapy of human diseases caused
by mutation to nonsense codons (1).
Although there has been relatively little work in this area in
the intervening years, there are several attractive aspects to such
a strategy. First, tRNA genes have strong promoters, which are
active in all cell types. The promoters for eukaryotic tRNA genes
lie within the structural sequences encoding the tRNA molecule
itself (2). Although there are elements which regulate
transcriptional activity within the 5' upstream region (3), the
length of an active transcriptional unit may be considerably less
than 500 base pairs, and thus accommodation within a delivery
vector presents no problem. Secondly, once they have been
transcribed and processed, tRNAs have low rates of degradation.
Finally, gene therapy with a nonsense suppressor would maintain
the endogenous, physiological controls over the target gene which
contains the nonsense codon. On the down side, nonsense
suppressors may cause readthrough of natural stop codons. In
addition, the presence of nonsense mutations can lead to the
aberrant splicing of introns, and to reduced levels of complete
mRNA (4,5). As these events are both nuclear in location, they
are probably beyond the reach of cytoplasmic suppressors. Of
course, only a fraction of mutations leading to human genetic
disease are caused by nonsense mutations. However, if an
effective mechanism for gene therapy by nonsense suppression
could one day be developed, it would then be applicable to similar
mutations in a wide range of genes.
One aspect which was not considered in the in vitro experiments
(1) was the context sensitivity of the efficiency of nonsense
suppression. Recently, we have described the way in which the
3' codon context affects the efficiency of UAG suppressor tRNAs
in human tissue culture cells (6,7). In general, the efficiency of
suppression varies according to the immediate 3' base in the
pattern: C > G > U > A, although it is probable that there are
effects of the next 3' base as well. The efficiency of nonsense
suppression can vary by as much as an order of magnitude
between the most efficient and the least efficient 3' contexts
(Phillips-Jones, Hill, Atkinson and Martin: in preparation). This
pattern of context effects in human cells is quite different from that
which operates in E.coli (6,8). There are also significant
differences in the efficiency of suppressors for UAG, UAA
and UGA codons (9). The successful application of nonsense
suppressor tRNAs as agents for human gene therapy might
therefore depend on both the proportions of UAG, UAA and
UGA codons, and the spectrum of 3' codon contexts, amongst
nonsense mutations that give rise to human genetic disease.
Moreover, the likelihood that suppressor tRNAs would give rise
to detrimental effects by reading through natural termination
codons will be determined by the differential distribution of
nonsense codons favourable for suppression between the
population of nonsense mutations and the population of natural
stop codons.
Table 1. Nonsense mutations in human genes resulting in genetic disease.
5' codon    Affected codon  3' codon    Stop codon  Site      Gene or disease
CTG(leu)    TTG(leu)        AGT(leu)    TAG         L261X     Acid sphingomyelinase
AGG(arg)    CAA(gln)        AGT(ser)    TAA         Q1041X    Adenomatous polyposis coli (APC)
AGA(arg)    CAA(gln)        TCA(ser)    TAA         Q1067X    APC-gastric cancer
CTG(leu)    CAG(gln)        GGT(gly)    TAG         Q1338X    APC
GCA(ala)    GAA(glu)        ATA(ile)    TAA         E1306X    APC
AAA(lys)    CAA(gln)        ATT(ile)    TAA         Q12X      AMP deaminase
AAG(lys)    TGG(trp)        GCC(ala)    TGA         W717X     Androgen receptor
GCT(ala)    GAA(glu)        CTG(leu)    TAA         E358X     Anti-Mullerian hormone
AAG(lys)    TTA(leu)        GTA(val)    TAA         L140X     Antithrombin III
GGC(gly)    CGA(arg)        ATC(ile)    TGA         R197X     Antithrombin III
TGC(cys)    CGA(arg)        CTC(leu)    TGA         R129X     Antithrombin III
GTG(val)    AAG(lys)        GTG(val)    TAG         K217X     α1-antitrypsin (emphysema)
AGG(arg)    CAG(gln)        GAG(glu)    TAG         Q84X      Apolipoprotein A-I
TTC(phe)    CGA(arg)        GAG(glu)    TGA         R2486X    Apolipoprotein B
AAA(lys)    CAG(gln)        CAT(his)    TAG         Q1450X    Apolipoprotein B
ACT(thr)    TCA(ser)        TCA(ser)    TAA         S3750X    Apolipoprotein B
ACA(thr)    CGA(arg)        CTC(leu)    TGA         R19X      Apolipoprotein C-II
CTG(leu)    TAC(tyr)        GAG(glu)    TAG         Y37X      Apolipoprotein CII
CTG(leu)    TAC(tyr)        GAG(glu)    TAA         Y37X      Apolipoprotein CII
GCC(ala)    TGG(trp)        GGC(gly)    TAG         W210X     Apolipoprotein E
CTG(leu)    TGG(trp)        GCC(ala)    TGA         W98X      APRT deficiency
ACC(thr)    CAG(gln)        AGG(arg)    TAG         Q39X      Beta-globin (β-thalassemia)
GGC(gly)    AAG(lys)        GTG(val)    TAG         K17X      Beta-globin (β-thalassemia)
GTC(val)    TGT(cys)        GTG(val)    TGA         C112X     Beta-globin (β-thalassemia)
AAA(lys)    GAA(glu)        TTC(phe)    TAA         E121X     Beta-globin (β-thalassemia)
TTT(phe)    GAG(glu)        TCC(ser)    TAG         E43X      Beta-globin (β-thalassemia)
GTG(val)    CAG(gln)        GCA/U(ala)  TAG         Q127X     Beta-globin (β-thalassemia)
AGT(ser)    GAG(glu)        CTG(leu)    TAG         E90X      Beta-globin (β-thalassemia)
TTC(phe)    CAA(gln)        GAG(glu)    TAA         Q309X     Cholesteryl ester transfer protein
CTT(leu)    GGA(gly)        GAA(glu)    TGA         G542X     Cystic fibrosis
CAA(gln)    CGA(arg)        GCA(ala)    TGA         R553X     Cystic fibrosis
CAG(gln)    TGG(trp)        AGG(arg)    TGA         W1282X    Cystic fibrosis
ATA(ile)    TGG(trp)        AAA(lys)    TAG         W1316X    Cystic fibrosis
TCT(ser)    CAG(gln)        TTT(phe)    TAG         Q493X     Cystic fibrosis
AGC(ser)    CGA(arg)        GTC(val)    TGA         R1162X    Cystic fibrosis
ACA(thr)    TGG(trp)        AAC(asn)    TAG         W846X     Cystic fibrosis
ATG(met)    CGA(arg)        TCT(ser)    TGA         R1158X    Cystic fibrosis
GCA(ala)    TGC(cys)        CAA(gln)    TGA         C524X     Cystic fibrosis
GAG(glu)    CGA(arg)        GAA(glu)    TGA         nt2510    Dystrophin-DMD
AAG(lys)    GAA(glu)        CTT(leu)    TAA         nt3714    Dystrophin-DMD
GTC(ala)    GAG(glu)        AA          TAG         nt2522    Dystrophin-DMD
CCA(pro)    TGG(trp)        ACA(thr)    TAG         nt6002    Erythropoietin receptor (EPOR)
TGG(leu)    CGA(arg)        TTC(phe)    TGA         R-5X      Factor VIII (HaemA)
TAT(tyr)    TGG(trp)        CAT(his)    TGA         W225X     Factor VIII (HaemA)
CTA(leu)    CGA(arg)        ATG(met)    TGA         R336X     Factor VIII (HaemA)
GTC(val)    CGA(arg)        TTT(phe)    TGA         R427X     Factor VIII (HaemA)
AAC(asn)    CGA(arg)        AGC(ser)    TGA         R583X     Factor VIII (HaemA)
TTG(leu)    CGA(arg)        CAG(gln)    TGA         R795X     Factor VIII (HaemA)
AAT(asn)    CAG(gln)        AGC(ser)    TAG         Q1686X    Factor VIII (HaemA)
ACA(thr)    CGA(arg)        CAC(his)    TGA         R1696X    Factor VIII (HaemA)
ATT(ile)    CGA(arg)        TGG(trp)    TGA         R1941X    Factor VIII (HaemA)
GTA(val)    CGA(arg)        AAA(lys)    TGA         R1966X    Factor VIII (HaemA)
TAT(tyr)    CGA(arg)        GGA(gly)    TGA         R2116X    Factor VIII (HaemA)
GCT(ala)    CGA(arg)        TAC(tyr)    TGA         R2147X    Factor VIII (HaemA)
GCT(ala)    CGA(arg)        CTT(leu)    TGA         R2209X    Factor VIII (HaemA)
CTT(leu)    CGA(arg)        ATT(ile)    TGA         R2307X    Factor VIII (HaemA)
GTT(val)    CAA(gln)        GGG(gly)    TAA         nt6406    Factor IX (HaemB)
GCA(ala)    CGA(arg)        GAA(glu)    TGA         nt6460    Factor IX (HaemB)
TTT(phe)    GAA(glu)        AAC(asn)    TAA         nt6472    Factor IX (HaemB)
TTT(phe)    TGG(trp)        AAG(lys)    TAG         nt6688    Factor IX (HaemB)
AAG(lys)    CAG(gln)        TAT(tyr)    TAG         nt6693    Factor IX (HaemB)
GAT(asp)    CAG(gln)        TGT(cys)    TAG         nt10400   Factor IX (HaemB)
TGT(cys)    GAG(glu)        TCC(ser)    TAG         nt10406   Factor IX (HaemB)
TGT(cys)    TGG(trp)        TGT(cys)    TGA         nt10468   Factor IX (HaemB)
TGG(trp)    TGT(cys)        CCC(pro)    TGA         nt10471   Factor IX (HaemB)
AGA(arg)    TGC(cys)        GAG(glu)    TGA         nt17700   Factor IX (HaemB)
TAT(tyr)    CGA(arg)        CTT(leu)    TGA         nt17761   Factor IX (HaemB)
ACC(thr)    CAA(gln)        TCA(ser)    TAA         nt20497   Factor IX (HaemB)
GGT(gly)    CAA(gln)        TCC(phe)    TAA         nt20551   Factor IX (HaemB)
CCT(pro)    TGG(trp)        CAG(gln)    TAG         nt20561   Factor IX (HaemB)
CCT(pro)    TGG(trp)        CAG(gln)    TGA         nt20562   Factor IX (HaemB)
TGG(trp)    CAG(gln)        GTA(val)    TAG         nt20363   Factor IX (HaemB)
TGT(cys)    GGA(gly)        GGC(gly)    TGA         nt30072   Factor IX (HaemB)
AAT(asn)    GAA(glu)        AAA(lys)    TAA         nt30090   Factor IX (HaemB)
AAA(lys)    TGG(trp)        ATT(ile)    TAG         nt30097   Factor IX (HaemB)
AAG(lys)    CGA(arg)        AAT(asn)    TGA         nt30863   Factor IX (HaemB)
ATT(ile)    CGA(arg)        ATT(ile)    TGA         nt30875   Factor IX (HaemB)
AAG(lys)    GAA(glu)        TCA(tyr)    TAA         nt31001   Factor IX (HaemB)
GGC(gly)    TAT(tyr)        GTA(val)    TAA         nt31039   Factor IX (HaemB)
GGC(gly)    TGG(trp)        GGA(gly)    TGA         nt31051   Factor IX (HaemB)
CTT(leu)    CAG(gln)        TAC(tyr)    TAG         nt31091   Factor IX (HaemB)
CAG(gln)    TAC(tyr)        CTT(leu)    TAG         nt31096   Factor IX (HaemB)
GAC(asp)    CGA(arg)        GCC(ala)    TGA         nt31118   Factor IX (HaemB)
ACA(thr)    TGT(cys)        CTT(leu)    TGA         nt31129   Factor IX (HaemB)
CTT(leu)    CGA(arg)        TCT(ser)    TGA         nt31133   Factor IX (HaemB)
GAT(asp)    TCA(ser)        TGT(cys)    TGA         nt31200   Factor IX (HaemB)
GAA(gln)    GGA(gly)        GAT(asp)    TGA         nt31208   Factor IX (HaemB)
TTC(phe)    TTA(leu)        ACT(thr)    TGA         nt31257   Factor IX (HaemB)
AGC(ser)    TGG(trp)        GGT(gly)    TGA         nt31276   Factor IX (HaemB)
GAA(glu)    GAG(glu)        TGT(cys)    TAG         nt31283   Factor IX (HaemB)
AAC(asn)    TGG(trp)        ATT(ile)    TGA         nt31342   Factor IX (HaemB)
GAA(glu)    AAA(lys)        ACA(thr)    TAA         nt31352   Factor IX (HaemB)
TCA(ser)    CGA(arg)        CTT(val)    TGA         R185X     Fanconi anemia-group C gene
AGT(ser)    TGG(trp)        GAT(asp)    TGA         nt5574    Fibrillin gene (Marfan syndrome)
GCC(ala)    TGC(cys)        ACC(thr)    TGA         C720X     Fructose intolerance-Aldolase B
TGG(trp)    GAA(glu)        AAG(lys)    TAA         E375X     α-L-Fucosidase (fucosidosis)
CCA(pro)    GAA(glu)        AAC(asn)    TAA         E357X     Fumarylacetoacetate hydrolase
TTG(leu)    GAA(glu)        CTG(leu)    TAA         E364X     Fumarylacetoacetate hydrolase
GAT(asp)    CGA(arg)        GGG(gly)    TGA         R359X     Glucocerebrosidase (Gaucher dis.)
CTG(leu)    CGA(arg)        GAC(asp)    TGA         R186X     Glucokinase-NID diabetes
GAC(asp)    GAG(glu)        AGC(ser)    TAG         E279X     Glucokinase
AGA(arg)    TGG(trp)        ACC(thr)    TGA         W343X     Glycoprotein Ib alpha
CTC(leu)    CGA(arg)        GGT(gly)    TGA         R137X     β-hexosaminidase A-Tay Sachs
TGG(trp)    CGA(arg)        GAG(glu)    TGA         R393X     β-hexosaminidase A-Tay Sachs
CCC(pro)    TGG(trp)        CCT(pro)    TGA         W26X      β-hexosaminidase A-Tay Sachs
GGG(gly)    TGG(trp)        AAT(asn)    TAG         W171X     Type II 3β-hydroxysteroid dehydrog.
ATC(ile)    GAA(glu)        AGG(arg)    TAA         exon2     Hypothyroidism TSH B subunit gene
CAG(gln)    TAC(tyr)        GTC(val)    TAG         Y64X      IDUA (Hurler syndrome)
CCA(pro)    CAG(gln)        CCG(pro)    TAG         Q310X     IDUA alpha-L-iduronidase
GTG(val)    CGA(arg)        ATC(ile)    TGA         R897X     Insulin receptor (leprechaunism)
ATT(ile)    CGA(arg)        GGA(gly)    TGA         R372X     Insulin receptor (leprechaunism)
CTT(leu)    CGA(arg)        G           TGA         R988X     Insulin receptor (diabetes)
AAC(asn)    CAG(gln)        AGT(ser)    TAG         Q672X     Insulin receptor Leprechaunism
CTT(leu)    CGA(arg)        GAG(glu)    TGA         R1000X    Insulin receptor
GTG(val)    TAC(tyr)        CTT(leu)    TAG         Y167X     LDL receptor (hypercholesterolemia)
TTC(phe)    CAG(gln)        TGC(cys)    TAG         Q12X      LDL receptor
CTG(leu)    TGC(cys)        CTC(leu)    TGA         C660X     LDL receptor (hypercholesterolemia)
GTC(val)    TAC(tyr)        AAC(asn)    TAA         Y83X      Lecithin cholesterol acyltransferase
GGA(gly)    CAG(gln)        GAT(asp)    TGA         Q106X     Lipoprotein lipase
ATG(met)    TAT(tyr)        GAG(glu)    TAA         Y61X      Lipoprotein lipase
AAA(lys)    TGG(trp)        AAG(lys)    TGA         W382X     Lipoprotein lipase
AGT(ser)    TGG(trp)        GTG(val)    TAG         W64X      Lipoprotein lipase
AAG(lys)    TCA(ser)        GGC(gly)    TGA         S447X     Lipoprotein lipase
CTT(leu)    CGA(arg)        GAA(glu)    TGA         nt2746    OCRL-1 oculocerebrorenal synd. Lowe
CCC(pro)    TAT(tyr)        AAT(asn)    TAA         Y209X     Ornithine aminotransferase
TTA(leu)    TAC(tyr)        CCT(pro)    TAG         Y299X     Ornithine aminotransferase
CTT(leu)    CGA(arg)        GAG(glu)    TGA         R426X     Ornithine aminotransferase
GCT(ala)    CGA(arg)        GTG(val)    TGA         R141X     Ornithine transcarbamylase
GCT(ala)    CGA(arg)        GTG(val)    TGA         R109X     Ornithine transcarbamylase
CCT(pro)    CAG(gln)        CAT(his)    TAG         Q192X     p53 squamous cell carcinoma
GCC(ala)    AAG(lys)        TCT(ser)    TAG         K120X     p53 Li Fraumeni syndrome
TTT(phe)    TGC(cys)        CAA(gln)    TGA         C135X     p53 Hepatocellular carcinoma
TTT(phe)    CGA(arg)        CAT(his)    TGA         R213X     p53 Ovarian carcinoma, gastric tumour
CCC(pro)    CAG(gln)        CCA(pro)    TAG         Q317X     p53 Esophageal carcinoma
TAT(tyr)    GAG(glu)        CCG(pro)    TAG         E221X     p53 Osteocarcinoma
CTG(leu)    GGA(gly)        CGA(arg)    TGA         Y226X     p53 Ovarian carcinoma
TGC(cys)    CAA(gln)        CTG(leu)    TAA         Q136X     p53 Esophageal carcinoma
CAC(his)    GAG(glu)        CTG(leu)    TAG         E298X     p53 Hepatocellular carcinoma
GAG(glu)    GAA(glu)        GAG(glu)    TAA         E286X     p53 Esophageal carcinoma
TTC(phe)    CGA(arg)        GAG(glu)    TGA         R342X     p53 Breast cancer
GTG(val)    GAA(glu)        GGA(gly)    TAA         E198X     p53 Hepatocellular carcinoma
ATC(ile)    CGA(arg)        GTG(val)    TGA         R196X     p53 Fibrous histiocytoma
CCT(pro)    GAG(glu)        GTT(val)    TAG         E224X     p53 Ovarian carcinoma
CTG(leu)    TGG(trp)        GTT(val)    TGA         W146X     p53 Esophageal carcinoma
AAG(lys)    CAG(gln)        TCA(ser)    TAG         Q195X     p53 Esophageal carcinoma
GAG(glu)    TAT(tyr)        TTG(leu)    TAG         Y205X     p53 Ovarian carcinoma
TTC(phe)    CGA(arg)        GTC(val)    TGA         R261X     Phenylalanine hydroxylase (PKU)
TTA(leu)    TCA(ser)        GAG(glu)    TGA         S359X     Phenylalanine hydroxylase (PKU)
TCA(ser)    CGA(arg)        GAT(asp)    TGA         R111X     Phenylalanine hydroxylase (PKU)
CAT(his)    GGA(gly)        TCC(ser)    TGA         Y272X     Phenylalanine hydroxylase (PKU)
TAC(tyr)    TGG(trp)        TTT(phe)    TAG         W326X     Phenylalanine hydroxylase (PKU)
CAG(gln)    TAC(tyr)        TGC(cys)    TAA         Y356X     Phenylalanine hydroxylase (PKU)
CTC(leu)    CGA(arg)        CCT(pro)    TGA         R243X     Phenylalanine hydroxylase PKU
CTT(leu)    CGA(arg)        GAT(asp)    TGA         R584X     Platelet glycoprotein IIb
GGC(gly)    TGG(trp)        CAC(his)    TAG         W198X     Porphobilinogen deaminase
?AC0        TAT(tyr)        GAG(glu)    TAA         Y145X     Prion protein
ACC(thr)    TGG(trp)        GGA(gly)    TAG         W29X      Protein C (PROC)
GCT(ala)    CGA(arg)        GGA(gly)    TGA         R732X     Procollagen II (COL2A1)
AAG(lys)    GAG(glu)        GTC(val)    TAG         E249X     Rhodopsin
AAG(lys)    CAG(gln)        CTG(leu)    TAG         nt687     SRY sex reversal
TTC(phe)    TGG(trp)        CCT(pro)    TAG         W406X     Steroid 21 hydroxylase
CTG(leu)    CAG(gln)        GAG(glu)    TAG         Q318X     Steroid 21 hydroxylase
CTC(leu)    CGA(arg)        GGA(gly)    TGA         R189X     Triosephosphate isomerase-anemia
GGG(gly)    TCA(ser)        GTG(val)    TGA         S223X     Tyrosine amino transferase
ATC(ile)    CGA(arg)        GTG(val)    TGA         R417X     Tyrosine amino transferase
ATC(ile)    CGA(arg)        GCC(ala)    TGA         R52X      Tyrosine amino transferase
GTC(val)    TGG(trp)        ATG(met)    TAG         W178X     Tyrosinase (oculocutaneous albinism)
?AC0        TGG(trp)        GCA(ala)    TGA         W71X      V2 receptor (X-linked NDI)
CTG(leu)    CAG(gln)        ATG(met)    TAG         Q119X     V2-Vasopressin receptor (diabetes)
AAG(lys)    TAC(tyr)        CGC(arg)    TAA         nt970     Vitamin D receptor (rickets)
TGC(cys)    CAG(gln)        TTC(phe)    TAG         Q149X     Vitamin D receptor (rickets)
GTC(val)    CGA(arg)        GTG(val)    TGA         R2535X    Von Willebrand Factor
CCC(pro)    CGA(arg)        GAG(glu)    TGA         R1659X    Von Willebrand type III
GAA(glu)    CGA(arg)        AGG(arg)    TGA         nt1084    WT1-tumour suppressor-Wilms tumour
CAG(gln)    CGA(arg)        AAG(lys)    TGA         -         WT1-tumour suppressor Zn finger 3
TCT(ser)    TAT(tyr)        CTT(leu)    TAA         Y116X     XP-A-Xeroderma pigmentosa
GTC(val)    CGA(arg)        CAG(gln)    TGA         R207X     XP-A-Xeroderma pigmentosa
CGG(arg)    CGA(arg)        GCA(ala)    TGA         R228X     XP-A-Xeroderma pigmentosa
AAC(asn)    CGA(arg)        GAA(glu)    TGA         R211X     XP-A-Xeroderma pigmentosa
Entries are sorted alphabetically according to the gene which has been mutated or the common name of the resulting disease. Where the 3' and 5' contexts are not
discernible from the paper describing the mutation, they were determined from the published sequence or from the EMBL and GenBank databases held at Daresbury,
UK. Where the site of the mutation is known, this is indicated either as the number of the codon preceded by the altered amino acid (in single letter code) and
followed by X to indicate a terminator, or alternatively as the nucleotide (nt) which has been mutated. This list can be supplied annotated with references, on request
to RM, by electronic mail or on receipt of an IBM type disc. The list in Table 1 is not exhaustive. Others have independently published, and are constantly updating,
a database of 880 single base pair substitutions which give rise to human genetic disease (14). A fraction of these will be mutations to stop codons. That database
does not, however, include information on the full 5' and 3' codon contexts.
Given the number of nonsense mutations which have been
described in human genes since the original proposal (1), we
believe it is now possible to review the pattern of mutations giving
rise to premature translational termination, with an eye to the
potential use of nonsense suppressors as agents of somatic gene
therapy. In this communication, we have surveyed the literature
for reports of point mutations which lead to nonsense codons in
human genes, and compared the distribution of the three
termination signals and their 3' contexts, with that of natural stop
codons.
RESULTS
The spectrum of mutations to nonsense codons in human
genetic disease
A total of 179 unique point mutations to nonsense codons were
identified in human genes from a search of literature reports in
a CD-ROM data base. Of these, 21 were either germ line or
somatic cell mutations in the tumour suppressor genes p53 and
APC. The mutational events we identified are listed in Table 1.
The affected codon and the encoded amino acid are given for
the site of the mutation, and its 5' and 3' neighbours. Genes
are sorted alphabetically according to the most commonly used
name for either the gene product, or the genetic disease. This
list can be supplied, annotated with references, on request to RM,
by electronic mail or on receipt of an IBM type disc.
Table 2. The distribution of point mutations amongst codons with the potential to mutate to UAG, UAA
or UGA stop codons in human genetic disease.
Stop  Nucleotide    Affected codon  Number  Base change  C→T
TAG   1st position  AAG Lys         3       A:T→T:A
                    CAG Gln         23      C:G→T:A      *
                    GAG Glu         9       G:C→T:A
      2nd position  TCG Ser         0
                    TGG Trp         13      G:C→A:T      *
                    TTG Leu         1       T:A→A:T
      3rd position  TAC Tyr         6       C:G→G:C
                    TAT Tyr         1       T:A→G:C
TAA   1st position  AAA Lys         1       A:T→T:A
                    CAA Gln         10      C:G→T:A      *
                    GAA Glu         14      G:C→T:A
      2nd position  TCA Ser         1       C:G→A:T
                    TTA Leu         1       T:A→A:T
      3rd position  TAC Tyr         3       C:G→A:T
                    TAT Tyr         5       T:A→A:T
TGA   1st position  AGA Arg         0
                    CGA Arg         55      C:G→T:A      *
                    GGA Gly         5       G:C→T:A
      2nd position  TCA Ser         4       C:G→G:C
                    TTA Leu         1       T:A→G:C
      3rd position  TGC Cys         5       C:G→A:T
                    TGG Trp         15      G:C→A:T      *
                    TGT Cys         3       T:A→A:T
Entries in Table 1 were scored for the codon affected and the base change involved in mutation to the
nonsense codon. Mutations arising from a C→T deamination are indicated by a *.
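The rows of Table 2 can be cross-checked by listing every sense codon that lies one base substitution away from a stop codon. The Python sketch below is illustrative only and is not part of the original analysis; the names `Route` and `routes_to_stop` are hypothetical. It assumes that a change is consistent with C→T deamination when the coding strand shows either C→T or G→A (the latter being C→T on the opposite strand).

```python
from typing import List, NamedTuple

STOPS = ("TAG", "TAA", "TGA")
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


class Route(NamedTuple):
    stop: str          # stop codon created by the mutation
    position: int      # 1-based codon position that changed
    codon: str         # sense codon before the mutation
    base_change: str   # base-pair notation, e.g. "C:G→T:A"
    deamination: bool  # True if consistent with C→T on either strand


def routes_to_stop(stop: str) -> List[Route]:
    """List the sense codons one substitution away from `stop`."""
    routes = []
    for i, new_base in enumerate(stop):
        for old_base in BASES:
            if old_base == new_base:
                continue
            codon = stop[:i] + old_base + stop[i + 1:]
            if codon in STOPS:
                continue  # stop-to-stop changes are not nonsense mutations
            change = (f"{old_base}:{COMPLEMENT[old_base]}"
                      f"→{new_base}:{COMPLEMENT[new_base]}")
            # C→T on the coding strand, or G→A (C→T on the other strand)
            deam = (old_base, new_base) in {("C", "T"), ("G", "A")}
            routes.append(Route(stop, i + 1, codon, change, deam))
    return routes


if __name__ == "__main__":
    for stop in STOPS:
        for r in routes_to_stop(stop):
            flag = " *" if r.deamination else ""
            print(f"{r.stop}  pos {r.position}  {r.codon}  {r.base_change}{flag}")
```

Under these assumptions the enumeration gives 8, 7 and 8 candidate codons for TAG, TAA and TGA respectively, and flags five routes as deamination-compatible.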
Figure 1 illustrates the frequency of mutations to the three
termination codons amongst the mutant alleles listed in Table 1:
UAG (31%), UAA (18%) and UGA (51%). Figure 1 also shows
the frequency of natural UAG, UAA and UGA codons used to
terminate protein synthesis at the ends of human genes. In human
cells, natural termination codon usage divides UAG (23%), UAA
(30%) and UGA (47%) (10-12). Whilst UGA codons are the
most frequent stop in both populations, the frequency of UAA
terminators is greater for natural stops than amongst new
mutations. The reverse is true for UAG. Overall, the two patterns
are significantly different (χ² = 12.1, P = 0.002).
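The comparison of the two stop codon distributions can be sketched as a 2 x 3 contingency test. The text does not state which form of the χ² test was used, so the Pearson contingency test here is an assumption. The counts are also assumptions: they are back-calculated from the reported percentages (179 mutations, 1422 natural stops) and rounded, so the result should only be expected to fall near the reported value. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def chi_square_2xk(row_a: Sequence[float],
                   row_b: Sequence[float]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same number of categories")
    total_a, total_b = sum(row_a), sum(row_b)
    grand = total_a + total_b
    chi2 = 0.0
    for a, b in zip(row_a, row_b):
        column = a + b
        expected_a = total_a * column / grand
        expected_b = total_b * column / grand
        chi2 += (a - expected_a) ** 2 / expected_a
        chi2 += (b - expected_b) ** 2 / expected_b
    return chi2, len(row_a) - 1


def p_value_df2(chi2: float) -> float:
    """Upper-tail probability of chi-square; exact only for 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


# Category order: UAG, UAA, UGA.
# Assumed counts, rounded from the reported percentages.
MUTATIONS = (55, 33, 91)     # about 31%, 18%, 51% of 179 mutations
NATURAL = (327, 427, 668)    # about 23%, 30%, 47% of 1422 natural stops

if __name__ == "__main__":
    chi2, df = chi_square_2xk(MUTATIONS, NATURAL)
    print(f"chi2 = {chi2:.2f}, df = {df}, P = {p_value_df2(chi2):.4f}")
```

The closed form `exp(-chi2 / 2)` is used because the survival function of χ² with two degrees of freedom has that exact form, which avoids a dependency on a statistics library.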
Table 2 shows the distribution amongst the possible base
changes at 1st, 2nd or 3rd codon positions which lead to the
creation of TAG, TAA and TGA mutations. TAG stops are
derived largely from CAG (Gln) and TGG (Trp) codons, TAA
mutations from CAA (Gln) and GAA (Glu), and TGA codons
originate predominantly from mutations in CGA (Arg) and TGG
(Trp). The C→T alteration far outweighs any other change which
is seen. This is particularly so for mutations to TGA, for which
the CGA (Arg) codon is especially susceptible. The reasons for
this are thought to be well understood (13,14). C→T transition
mutations are most likely caused by the spontaneous chemical
deamination of cytosine to give uracil. This leads to a U:G
mispair. U:G mispairs will become fixed as a C:G→T:A
mutation if DNA replication precedes the detection and removal
of uracil by DNA uracil glycosylase. Where cytosine exists in
mammalian genomes as 5-methylcytosine, in the doublet CpG,
cytosine deamination leads to a T:G mispair. The high rate of
mutation at these sites suggests that the T:G mispair is less readily
detected, or less faithfully repaired, than the U:G mispair.
Conversely, methylation of cytosine at the 5 position may elevate
the rate of spontaneous deamination.
[Figure 1: bar chart; x-axis: UAG, UAA, UGA; y-axis ticks: 20, 40, 60; series: nonsense mutations, natural stop codons]
Figure 1. The frequency with which UAG, UAA and UGA termination codons
occur as human disease causing mutations compared with the frequency of UAG,
UAA and UGA as natural stop codons. The frequency of termination codons
produced by nonsense mutation was taken from Table 1. The frequency of naturally
occurring stop codons was taken from a sample of 1422 genes kindly supplied
by Paul Sharp and Andrew Lloyd.
The 3' codon context of mutations to nonsense codons in
human genetic disease
The distribution of 3' codon contexts amongst the 179 instances
of nonsense mutations is shown in Figure 2. The 3' codon context
found around natural termination codons is also displayed. The
patterns of 3' contexts amongst mutations to UAG and UAA are
not significantly different from the 3' bases flanking natural UAG
and UAA termination codons (χ² = 7.2, P = 0.066 and χ² =
0.072, P = 0.995 respectively). There is a significant difference,
however, between new mutations to UGA and natural stops (χ²
= 8.1, P = 0.043). There is a lower frequency of A, and a higher
representation of G, 3' to natural UGA stop codons than in new
mutations to UGA. There is no difference in the pattern of 3'
contexts between nonsense mutations and natural stops when
UAG, UAA and UGA are combined (χ² = 3.6, P = 0.303).
[Figure 2: four bar charts (UAG 3' context, UAA 3' context, UGA 3' context, All stop codons 3' context); x-axis: 3' base A, C, G, U; y-axis scale: 0 to 80; series: nonsense mutations, natural stop codons]
Figure 2. The 3' context of human disease causing nonsense mutations compared to the 3' context of natural stop codons. The 3' context of disease causing nonsense
mutations was taken from Table 1. The frequency of the 3' context of naturally occurring stop codons was calculated from a sample of 1422 genes kindly supplied
by Paul Sharp and Andrew Lloyd.
DISCUSSION
We present in this paper a survey of mutations to nonsense codons
which give rise to human somatic cell and germ line diseases.
As early as 1982, it was suggested that gene therapy of this class
of disease loci might be attempted with human tRNA genes
mutated to recognise stop codons (1). Readthrough at the
nonsense mutation, by the suppressor, will restore a proportion
of wild type gene function. Given the rapid progress being made
in the identification of different nonsense mutations in human
genes, and recent findings on the determination of suppressor
efficiencies, it seems an appropriate moment to describe the
patterns of mutation which occur and relate these to the possibility
of suppressor tRNA gene therapy. In particular, experiments with
reporter gene constructs have revealed differences in the
effectiveness of suppressors according to which of the three
codons UAG, UAA or UGA is to be read, and also the contexts
in which these termination signals lie (6,7,9). This survey reveals
that nonsense mutations occur in an approximate ratio of 1:2:3,
for UAA, UAG and UGA respectively. Studies with human
nonsense suppressors (9) suggest that suppressor efficiency varies
UAG = UGA > UAA. The two most efficient suppressors can
therefore recognise some 80% of nonsense mutations which lead
to human genetic disease.
When a suppressor tRNA reads a stop codon, the amino acid
which is inserted is determined by the identity of the tRNA whose
anticodon was mutated to match the termination triplet. At some
sites, it might not matter which amino acid is inserted, so long
as translation is restored for the full length of the gene. At
other sites, it might be important to restore authentic, wild type
gene product. In this case the suppressor has to insert the amino
acid corresponding to the codon in the unmutated gene. Our
analysis reveals that C:G→T:A transitions predominate in the
formation of stop codons. Trp, Gln and Glu codons are changed
most frequently to UAG; Glu and Gln codons are changed most
frequently to UAA; and overwhelmingly it is Arg, and to a lesser
extent Trp, codons which give rise to UGA. To be widely
applicable then, suppressor gene therapy would have to generate
efficient suppressors from Trp, Gln, Glu and Arg tRNAs. Studies
on the determination of tRNA 'identity elements' have shown
that those bases in a tRNA molecule which are responsible for
binding to the correct aminoacyl-tRNA synthetase enzyme
sometimes lie in the anticodon loop (15). Thus, when nonsense
suppressors are created by mutagenesis of bases in this region,
the tRNA may be charged with a different amino acid. Upon
translation of a nonsense codon, this restores a normal length
protein, but one which contains an amino acid substitution. For
example, in E.coli, UAG nonsense suppressors derived from
tRNA(Trp) are charged with Gln as well as Trp (16). Rapid
advances are being made in this area. For bacterial tRNAs, it
is now largely known for which tRNAs mutation to a nonsense
suppressor gives rise to altered amino acid insertions (17).
Interestingly, site directed mutagenesis can be used to control
the extent of mischarging, and retain tRNA aminoacyl identity
(18). It should not be long before similar information is available
for human tRNAs. Research with bacterial tRNAs has also
established that the strongest nonsense suppressors are formed
by altering the anticodon of tRNAs which normally read codons
beginning with U (19). Whilst it is anticipated that similar rules
will apply to human tRNAs, little work has been carried out on
this aspect.
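The share of surveyed mutations at which each proposed suppressor would restore the wild type amino acid can be estimated from Table 2. The Python sketch below is illustrative and rests on two assumptions: synonymous codons are pooled per amino acid (for example TAC and TAT for Tyr), and the counts follow one reading of the printed table, which should be checked against the original before any use. The names `TABLE2_COUNTS`, `PROPOSED`, `coverage` and `overall_coverage` are hypothetical.

```python
from typing import Dict, Tuple

# Mutation counts per stop codon and original amino acid, as read from Table 2.
# Assumption: synonymous codons are pooled, e.g. Tyr = TAC + TAT.
TABLE2_COUNTS: Dict[str, Dict[str, int]] = {
    "UAG": {"Lys": 3, "Gln": 23, "Glu": 9, "Ser": 0, "Trp": 13, "Leu": 1, "Tyr": 7},
    "UAA": {"Lys": 1, "Gln": 10, "Glu": 14, "Ser": 1, "Leu": 1, "Tyr": 8},
    "UGA": {"Arg": 55, "Gly": 5, "Ser": 4, "Leu": 1, "Cys": 8, "Trp": 15},
}

# Suppressor identities suggested in the text for each stop codon.
PROPOSED: Dict[str, Tuple[str, ...]] = {
    "UAG": ("Trp", "Gln", "Glu"),
    "UAA": ("Gln", "Glu"),
    "UGA": ("Arg",),
}


def coverage(stop: str) -> float:
    """Fraction of mutations to `stop` whose original amino acid is proposed."""
    counts = TABLE2_COUNTS[stop]
    restored = sum(counts[aa] for aa in PROPOSED[stop])
    return restored / sum(counts.values())


def overall_coverage() -> float:
    """Fraction of all tabulated mutations covered by the proposed set."""
    restored = sum(TABLE2_COUNTS[s][aa] for s in PROPOSED for aa in PROPOSED[s])
    total = sum(sum(c.values()) for c in TABLE2_COUNTS.values())
    return restored / total


if __name__ == "__main__":
    for stop in PROPOSED:
        print(f"{stop}: {coverage(stop):.1%}")
    print(f"all stops: {overall_coverage():.1%}")
```

This only counts sites where the authentic residue would be reinserted; as noted above, at some sites any inserted amino acid may be tolerated, so the figure is a lower bound on potentially useful suppression under these assumptions.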
Recent studies from this laboratory have established that the
3' codon context has a substantial effect on the efficiency of
human UAG suppressor tRNAs in human cells (6,7). It seems
likely that similar rules will apply to UGA codons (20). Our
work has shown that UAG codons flanked by a 3' A are
very inefficiently suppressed, whereas those followed by a 3'
C or G are suppressed some five to ten fold more efficiently for
a given concentration of tRNA. In prokaryotes and lower
eukaryotes it is believed that the choice between the
three termination codons and their 3' codon contexts is under
translational selection pressure (11,12,21,22). In contrast, we and
others have argued that in mammalian cells, 3' termination codon
contexts are shaped by mutation, and not by selection for optimum
performance (23). This contention is reinforced by the present
study. Mutations to nonsense codons in human disease loci are
found in a similar range of 3' contexts to that observed for natural
stop codons. Nonsense mutations in the human genome are fairly
evenly divided between 3' contexts of A, C, G or U. In general,
3' G is most common and 3' U is least frequently observed. This
distribution of bases matches very well the distribution observed
3' to natural stop codons (23). These patterns are largely
determined by the local G+C content of the human genome,
which is known to consist of substantial blocks or 'isochores'
of sequences which differ widely in their richness for G+C
(24,25). Given that the proportions of UAG, UAA and UGA
are similar for new mutations and natural stop codons, the balance
of probabilities is that termination codon choice is not subject
to translational selection in human cells either.
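The 3' context rule reported for UAG suppression (C > G > U > A) can be applied as a simple ordinal ranking of candidate mutations. The Python sketch below is illustrative: it encodes only the order, not the magnitude, of the effect, and extending the same order to UAA or UGA codons would be an assumption. The function names are hypothetical; the example entries are β-globin UAG mutations with their 3' codons as listed in Table 1.

```python
from typing import Iterable, List, Tuple

# Reported order of UAG suppression efficiency by immediate 3' base:
# C > G > U > A (lower rank value = more favourable context).
CONTEXT_RANK = {"C": 0, "G": 1, "U": 2, "A": 3}


def three_prime_base(next_codon: str) -> str:
    """First base of the 3' codon, converted from DNA (T) to RNA (U)."""
    base = next_codon.strip().upper()[0]
    return "U" if base == "T" else base


def rank_by_context(mutations: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (site, 3' codon) pairs from most to least favourable 3' base."""
    return sorted(mutations, key=lambda m: CONTEXT_RANK[three_prime_base(m[1])])


# Beta-globin UAG mutations and their 3' codons, taken from Table 1.
BETA_GLOBIN_UAG = [
    ("Q39X", "AGG"),
    ("K17X", "GTG"),
    ("E43X", "TCC"),
    ("E90X", "CTG"),
]

if __name__ == "__main__":
    for site, codon in rank_by_context(BETA_GLOBIN_UAG):
        print(site, three_prime_base(codon))
```

Because the ranking is ordinal, it says nothing about how much tRNA would be needed at each site; the five to ten fold difference quoted above applies between the 3' A context and the 3' C or G contexts.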
The findings of this study have important implications for
assessing the likelihood that suppressor tRNAs will be detrimental
to the physiology of the cell, if they cause readthrough at a
significant number of natural termination codons. C-terminal
extended species may be degraded prematurely, they may have
reduced enzyme activities, or they could display codominant,
negative properties in their interaction with other proteins. Even
short C-terminal extensions can have serious consequences for
some polypeptides. For example, mutations which eliminate the
natural stop codon of the α-globin gene give rise to a C-terminal
extension of 31 amino acids. This causes a severe, dominant form
of thalassemia (4). Of course, in the case of gene therapy by a
suppressor tRNA, the level of the tRNA could be adjusted so
that readthrough at a natural stop codon may be as little as
5-10%, if this concentration of suppressor proved sufficient to
reverse the mutant phenotype. Readthrough of this intensity at
natural termination codons may not present so drastic an
outcome, in the presence of 90-95% of correctly terminated
polypeptide chains.
This review of nonsense mutations and natural stop codons,
suggests that both populations are similar in their proportions
of UAG, UAA and UGA, and in the distributions of their 3'
contexts. Where differences exist, these are in favour of
suppression therapy. UAG and UGA mutations account for 82%
of human mutations to stop, whereas UAG and UGA comprise
only 70% of natural termination codons. Contrary to some earlier
suggestions (26), natural stop codons in human cells do not seem
to be protected in any special way from translational readthrough
by their immediate 3' contexts. Studies have shown that there
is no significant evidence to support the widespread belief that
multiple stop codons are employed by cells to provide a fail-safe
mechanism for terminating protein synthesis (22,27). There are
indications from E. coli, though, that the nature of the C-terminal
amino acids within the nascent polypeptide can influence the
efficiency of translational termination (28,29). Moreover, surveys
of bacterial gene sequences have suggested preferences for certain
amino acids at the C-terminus, which could reflect on the
efficiency of stop decoding (11,30). If C-terminal amino acids
are selected to improve the efficiency of translational termination
in human cells, this could increase the specificity of nonsense
suppressors for stop mutations over natural termination codons.
However, this appears unlikely in the light of studies which
show that the counterparts to bacterial preferences in mRNA
sequences, relating to codon usage and 3' codon context effects,
are missing in human cells (23,31).
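As a rough check on the proportions discussed above, the approximate 1:2:3 ratio reported for mutation to UAA, UAG and UGA can be converted into shares and compared with the 70% figure for natural termination codons. This is a sketch only: the ratio is the rounded value from the survey summary, not the underlying counts, and the function and variable names are hypothetical.

```python
from fractions import Fraction

# Rounded UAA:UAG:UGA ratio for disease-causing mutations (approximate,
# as stated in the survey summary; not the raw event counts).
MUTATION_RATIO = {"UAA": 1, "UAG": 2, "UGA": 3}

# Share of UAG plus UGA among natural termination codons, as quoted in the text.
NATURAL_UAG_UGA_SHARE = Fraction(70, 100)


def codon_shares(ratio):
    """Convert a ratio of codon counts into exact fractional shares."""
    total = sum(ratio.values())
    return {codon: Fraction(count, total) for codon, count in ratio.items()}


shares = codon_shares(MUTATION_RATIO)
uag_uga_share = shares["UAG"] + shares["UGA"]

print(f"UAG+UGA among mutations (from rounded ratio): {float(uag_uga_share):.1%}")
print(f"UAG+UGA among natural stops (quoted): {float(NATURAL_UAG_UGA_SHARE):.1%}")
print(f"Difference: {float(uag_uga_share - NATURAL_UAG_UGA_SHARE):.1%}")
```

The rounded ratio gives a UAG plus UGA share of five sixths, close to the 82% obtained from the survey itself; the small gap reflects the rounding of the ratio.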
ACKNOWLEDGEMENTS
JA is the recipient of an MRC postgraduate studentship. RM is
supported by a Royal Society University Research Fellowship.
The Krebs Institute is a SERC centre for molecular recognition.
This work benefited from the use of the SEQUENET facility.
REFERENCES
1. Temple, G.F., Dozy, A.M., Roy, K.L. and Kan, Y.W. (1982) Nature, 296,
537-540.
2. Geiduschek, E.P. and Tocchini-Valentini, G.P. (1988) Ann. Rev. Biochem.,
57, 873-914.
3. Capone, J.P. (1988) DNA, 7, 459-468.
4. Cooper, D.N. (1993) Ann. Med., 25, 11-17.
5. Dietz, H.C., Valle, D., Francomano, C.A., Kendzior, R.J.Jr., Pyeritz, R.E.
and Cutting, G.R. (1993) Science, 259, 680-683.
6. Phillips-Jones, M.K., Watson, F.J. and Martin, R. (1993) J. Mol. Biol.,
233, 1-6.
7. Martin, R., Phillips-Jones, M.K., Watson, F.J. and Hill, L.S.J. (1993)
Biochem. Soc. Trans., 21, 843-851.
8. Miller, J.H. and Albertini, A.M. (1983) J. Mol. Biol., 164, 59-71.
9. Capone, J.P., Sedivy, J.M., Sharp, P.A. and RajBhandary, U.L. (1986)
Mol. Cell. Biol., 6, 3059-3067.
10. Brown, C.M., Dalphin, M.E., Stockwell, P.A. and Tate, W.P. (1993)
Nucleic Acids Res., 21, 3119-3123.
11. Brown, C.M., Stockwell, P.A., Trotman, C.N. and Tate, W.P. (1990)
Nucleic Acids Res., 18, 6339-6345.
12. Cavener, D.R. and Ray, S.C. (1991) Nucleic Acids Res., 19, 3185-3192.
13. Youssoufian, H., Kazazian, H.H.Jr., Phillips, D.G., Aronis, S., Tsiftis,
G., Brown, V.A. and Antonarakis, S.E. (1986) Nature, 324, 380-382.
14. Cooper, D.N. and Krawczak, M. (1993) Human Gene Mutation. Bios
Scientific Publishers, Oxford.
15. Pallanck, L. and Schulman, L.H. tRNA discrimination in aminoacylation.
In: Transfer RNA in Protein Synthesis, edited by Hatfield, D.L., Lee, B.Y.
and Pirtle, R.M. CRC Press, 1992, p. 279-318.
16. Raftery, L.A., Egan, B.J., Cline, S.W. and Yarus, M. (1984) J. Bacteriol.,
158, 849-859.
17. Kleina, L.G., Masson, J.M., Normanly, J., Abelson, J. and Miller, J.H.
(1990) J. Mol. Biol., 213, 705-717.
18. Normanly, J., Kleina, L.G., Masson, J.M., Abelson, J. and Miller, J.H.
(1990) J. Mol. Biol., 213, 719-726.
19. Yarus, M. (1982) Science, 218, 646-652.
20. Li, G. and Rice, C.M. (1993) J. Virol., 67, 5062-5067.
21. Sharp, P.M. and Bulmer, M. (1988) Gene, 63, 141-145.
22. Brown, C.M., Stockwell, P.A., Trotman, C.N.A. and Tate, W.P. (1990)
Nucleic Acids Res., 18, 2079-2086.
23. Martin, R. (1994) Nucleic Acids Res., 21, 15-19.
24. Sharp, P.M., Burgess, C.J., Lloyd, A.T. and Mitchell, K.J. Selective use
of termination codons and variations in codon choice. In: Transfer RNA
in Protein Synthesis, edited by Hatfield, D.L., Lee, B.Y. and Pirtle, R.M.
Boca Raton: CRC Press, 1992, p. 397-425.
25. Bernardi, G. (1993) Mol. Biol. Evol., 10, 186-204.
26. Bienz, M., Kubli, E., Kohli, J., deHenau, S., Huez, G., Marbaix, G. and
Grosjean, H. (1981) Nucleic Acids Res., 9, 3835-3850.
27. Kohli, J. and Grosjean, H. (1981) Mol. Gen. Genet., 182, 430-439.
28. Mottagui-Tabar, S., Björnsson, A. and Isaksson, L.A. (1994) EMBO J.,
13, 249-257.
29. Arkov, A.L., Korolev, S.V. and Kisselev, L.L. (1993) Nucleic Acids Res.,
21, 2891-2897.
30. Gutman, G.A. and Hatfield, G.W. (1989) Proc. Natl. Acad. Sci. USA, 86,
3699-3703.
31. Eyre-Walker, A.C. (1991) J. Mol. Evol., 33, 442-449.