EVIDENCE INDICATING INDEPENDENT ASSORTMENT OF FRAMEWORK AND COMPLEMENTARITY-DETERMINING
SEGMENTS OF THE VARIABLE REGIONS OF RABBIT LIGHT CHAINS
Delineation of a Possible J Minigene*
BY ELVIN A. KABAT,‡ TAI TE WU,§ AND HOWARD BILOFSKY
From the National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20205; the
Departments of Microbiology, Human Genetics and Development, and Neurology, and the Cancer Center,
College of Physicians & Surgeons, Columbia University, New York 10032; the Departments of
Biochemistry and Molecular Biology, Engineering Sciences and Applied Mathematics, Northwestern
University, Evanston, Illinois 60201; and Bolt Beranek and Newman Inc., Cambridge, Massachusetts
02238
The variable (V)¹ regions of immunoglobulin light chains when aligned for
maximum homology can be divided into four framework regions (FR) separated by
three complementarity-determining (CDR) (hypervariable [1]) regions or segments
(2, 3). The latter, as predicted (1), together with the corresponding three CDR of the
heavy chain (4), form the antibody-combining sites (3-11). Light chains FR1, FR2,
FR3, and FR4 comprise residues 1-23, 35-49, 57-88, and 98-107 and CDR1, CDR2,
and CDR3 residues 24-34, 50-56, and 89-97. If the FR segments were grouped into
sets of identical sequence and the members of each set were traced, it was shown (12)
that members of a given FR1 set could be associated with different FR2, FR3, and
FR4 sets. This independent assortment suggested that the FR sets, and by implication
the CDR sets, were under different genetic control, and the hypothesis was put
forward that the individual FR and CDR sets were controlled by minigenes assembled
somatically by recombination at the DNA level (12). A minigene is defined as a
segment of DNA coding for a portion of a domain and which shows evidence of
segregation as a functional unit independent of the rest of the DNA coding for the V
region (13). Because we only assorted FR segments, the findings would be independent
of whether one or two residues of a given CDR assorted with any FR segment. Studies
by Tonegawa et al. with cloned mouse Vλ (14, 15) and Vκ (16) genes and by Seidman
et al. (17, 18) with mouse Vκ genes showed that in 12-d-old embryo DNA, genes
* Supported by the National Cancer Institute, National Institute of Allergy and Infectious Diseases,
National Institute of Arthritis, Metabolism, and Digestive Diseases, National Institute of General Medical
Sciences, and the Division of Research Resources (Contracts NOI-RR-2147 and NOI-RR-8-2158), of the
National Institutes of Health; and by a Cancer Center Support grant to Columbia University, from the
National Cancer Institute, CA 13696.
‡ Recipient of grant BMS-76-81029 from the National Science Foundation.
§ Recipient of grant 5-RO1-GM21482-04 from the National Institutes of Health.
¹ Abbreviations used in this paper: CDR, complementarity-determining regions; FR, framework region(s);
V, variable.
THE JOURNAL OF EXPERIMENTAL MEDICINE • VOLUME 152, 1980
coding for residues 1-95 of the V region were followed by an intervening sequence.
Other genes for the mouse λ-region coded for residues 96-108, termed the J segment,
and were followed by an intervening sequence of about 1.2 Kb and then by the Cλ
coding segment (15). In mouse Vκ chains (19, 20), a clone coding for Cκ contained five
J segments, each separated by an intervening sequence that varied in length from 246
to 310 base pairs and with an intervening sequence of ~2.5 Kb between the J closest
to C and the C region. Thus, as defined, the J segments are clearly minigenes (12, 13).
The amino acid (21) and nucleic acid (22, 23) sequence data on the hinge region
also show that these nucleotides code only for a portion of a domain and therefore
constitute a minigene. It has been suggested (22, 23) that it evolved from a complete
domain by a shift of a splice site followed by mutational divergence of the left end of
the domain to become a portion of an intervening sequence with preservation of
considerable homology in nucleotide sequence with the 5' flanking end of the CH1
domain.
The assortment principle was also used by Weigert et al. (24) to define the J
segment from amino acid sequences with 96 as the site of the recombination, i.e.,
including two residues of CDR3 in J. They proposed that some diversity of
antibody-combining sites could be generated by V-J joining because residue 96 is highly
hypervariable (1, 2, 25). Because adult myeloma VL DNA clones code for a contiguous
stretch of amino acids 1-107 (or 108) (26), the joining of the J minigene to the DNA
of the region coding for amino acids 1-95 has taken place during embryogenesis.
Because it was subsequently shown (27) that the five clones coding for Vκ and the
three coding for Vλ all had segments matching at from 6/7 to 9/10 nucleotides plus
the Trp codon at position 35, which could serve as recognition sites for recombination
or insertion of CDR nucleotides between the nucleotides coding for the FR segments
and for the CDR3-J junction, it was considered of interest to examine the amino acid
sequences of rabbit Vκ chains for evidence of independent assortment of FR and CDR
segments and to see if a J segment in the rabbit could be defined by assorting residues
96 and 97 of CDR3 with FR4 as has been found for the mouse.
Assorting rabbit Vκ sequences may have certain potential advantages in that the
rabbit populations are not inbred and in that most of the sequencing has been done
on antibodies with specificities largely limited to several pneumococcal type-specific
and to streptococcal group-specific polysaccharides. It thus might be possible to
recognize assortments associated with or not associated with certain antibody
specificities in FR and CDR segments. There are also very many complete rabbit Vκ
sequences, and yet the degree of sequence variation even among antibodies to the
same antigen has not made it possible to define individual amino acid residues as
contacting amino acids. Moreover, the sequence data included one pair of rabbits,
K9-335 and K9-338, littermates (28, 29) producing monoclonal antibody to
streptococcal group A variant carbohydrate whose light chains were identical in sequence
from amino acids 1 to 108 and a second pair of completely unrelated rabbits (Ch.
Brandt and J. C. Jaton. Personal communication.), 311 and 4422, one bred in the
United States and the other in Switzerland; both had produced monoclonal antibodies
to type II pneumococcal polysaccharide whose light chains also were identical in
sequence from amino acids 1-108 (30).
The data provide considerable evidence for independent assortment of FR and
CDR segments and, as also concluded by Braun et al. (31), indicate that the J segment
in the rabbit as in the mouse may include two residues of CDR3 and may contribute
to the generation of antibody diversity.
Materials and Methods
All rabbit Vκ sequences were contained in Sequences of Immunoglobulin Chains (25); they
represent the work of numerous groups of investigators and original references may be found in
(25). Sequences that have not been determined for an entire FR segment were generally not
used except when an invariant Cys 23 was lacking, and the sequences fell into an FR or CDR
set with other members. All chains with the identical sequence for an FR or CDR segment
were grouped together as a set. Members of a set are enclosed in a box. The prototype sequence
was K4820 and was selected from the set with the most members of FR2, the most commonly
occurring preserved segment found to date in 1 human VκIV, 20 mice (6 BALB/c and 14
NZB), and 13 rabbit sequences (12, 13). Above the prototype set are listed all amino acid
positions and residues at which the other sets differed from the prototype (32). Above all other
sets are listed only the positions and amino acid residues at which the set differed from the
prototype. It is thus easy to see which residues in each set differed from the prototype and also
by comparing any two sets to see at which positions they differ from one another (32). If the
amino acid in any set differed from the prototype at a given position and if this involved a two
base change, this is indicated by an asterisk (*). If the differences involved deletions or insertions
or when a three base change was involved this is indicated by a dagger (†). To define a potential
J segment and because of length variations in CDR3, the last two residues of CDR3 in each
sequence were renumbered as 97E and 97F and assorted with FR4 as a J segment (97E,
97F-107) comparable with what had been observed in the mouse by assortment (24) and by nucleic
acid sequencing of clones (26). In rabbit 2717 residue 106A had been misaligned and was made
residue 107. This homologized all subsequent residues. It should be noted that FR4 is identical
for all rabbit light chains except for K16-167 in which Thr replaces Gly at position 100. Chains
that have not been sequenced completely are given in parentheses with the number of
unsequenced positions. Above the sets, unsequenced positions are indicated by a question mark.
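The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the figures: the chain names and the short sequences below are invented, and each chain is assumed to have already been cut into its FR and CDR segments.

```python
from collections import defaultdict

# Hypothetical toy data: chain name -> {segment name: amino acid sequence}.
# Names and sequences are invented for illustration only.
chains = {
    "chainA": {"FR1": "AAVLTQ", "FR2": "WYQQKP"},
    "chainB": {"FR1": "AAVLTQ", "FR2": "WFQQKP"},
    "chainC": {"FR1": "ADVVMTQ", "FR2": "WYQQKP"},
    "chainD": {"FR1": "AAVLTQ", "FR2": "WYQQKP"},
}

def group_into_sets(chains, segment):
    """Group chain names into sets that share an identical sequence for one segment."""
    sets = defaultdict(list)
    for name, segments in chains.items():
        sets[segments[segment]].append(name)
    return dict(sets)

def assortment(chains, seg_a, seg_b):
    """For each seg_a set, list the distinct seg_b sequences carried by its members."""
    result = {}
    for seq_a, members in group_into_sets(chains, seg_a).items():
        result[seq_a] = sorted({chains[m][seg_b] for m in members})
    return result

fr1_sets = group_into_sets(chains, "FR1")
fr1_to_fr2 = assortment(chains, "FR1", "FR2")
# An FR1 set whose members carry more than one FR2 sequence shows assortment.
assorting = {seq: partners for seq, partners in fr1_to_fr2.items() if len(partners) > 1}
print(fr1_sets)
print(assorting)
```

In this toy case the three chains sharing one FR1 sequence carry two different FR2 sequences, which is the pattern the text calls independent assortment. Real data would also need the handling of unsequenced positions and incomplete chains described above.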
In assembling Fig. 1, the individual sets as typed were cut out and arranged to produce a
reasonable minimum number of crossing lines in connecting the members of the sets. If this is
not done (24, 33) patterns of greater complexity are generally seen that do not give a true
representation of the extent of the assortment. This procedure not only makes for clarity of the
figures, but is also essential conceptually when there is no reason to arrange the sets in any
specific order. Indeed, when one attempts to rearrange the sets in some defined order to test an
hypothesis, the finding of increased complexity of the assortment may provide data consistent
with or conflicting with the hypothesis. Thus our original simplest assortment patterns of FR
segments of human, mouse, and rabbit Vκ chains (12) became extremely complex when the sets
were ordered in terms of the likelihood of the amino acid differences in the FR segments being
ascribable to somatic mutation (32). This increase in complexity of the assortment patterns
indicated that somatic mutation was not making a significant contribution in generating the
different FR sets (32).
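The arrangement of sets to reduce crossing lines was done by hand with cut-out typed sets. As a hedged illustration of the same idea, the sketch below counts line crossings between two adjacent columns of sets and searches by brute force for the ordering of the right-hand column with the fewest crossings. The set labels (L1, R1, and so on) and the links are hypothetical, and exhaustive search is an assumption that is practical only for a handful of sets.

```python
from itertools import combinations, permutations

def count_crossings(links, left_order, right_order):
    """Count pairs of connecting lines that cross between two columns of sets."""
    left_pos = {s: i for i, s in enumerate(left_order)}
    right_pos = {s: i for i, s in enumerate(right_order)}
    crossings = 0
    for (a1, b1), (a2, b2) in combinations(links, 2):
        d_left = left_pos[a1] - left_pos[a2]
        d_right = right_pos[b1] - right_pos[b2]
        # Two lines cross when their end points are in opposite vertical order.
        if d_left * d_right < 0:
            crossings += 1
    return crossings

def best_right_order(links, left_order):
    """Try every ordering of the right column and keep one with the fewest crossings."""
    right_sets = sorted({b for _, b in links})
    return min(permutations(right_sets),
               key=lambda order: count_crossings(links, left_order, order))

# Hypothetical links: one (left set, right set) pair per chain.
links = [("L1", "R2"), ("L1", "R3"), ("L2", "R1"), ("L3", "R2")]
left_order = ["L1", "L2", "L3"]

naive = count_crossings(links, left_order, ["R1", "R2", "R3"])
best = best_right_order(links, left_order)
best_count = count_crossings(links, left_order, list(best))
print(naive, best, best_count)
```

With more columns the same count can be summed over adjacent column pairs. Reordering the sets by some external criterion and recounting is one way to express the test of a hypothesis described above.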
All rabbit chains with the prefix K were from breeding colonies at The Rockefeller University
(New York) and the Basel Institute of Immunology (Basel) and are related. Detailed genealogies
may be found in Braun et al. (29, 31); the prototype rabbit K4820 was a progenitor of the
Basel colony. It and K4878 and K4872 came from the State Serum Institute, Copenhagen,
whereas K19, K20, K23 and K31 came from the Rockefeller University colony. Thus a portion
of the data comes from families produced by matings of brother-sister and of other close
relatives, whereas the rest of the data derives from outbred and randomly selected animals. The
inbred animals were immunized largely with streptococcal group C and group A variant
streptococci, whereas the noninbred animals received pneumococcal polysaccharides, Micrococcus
lysodeikticus, etc.

[Fig. 1: diagram of boxed sets in seven columns, FR1 (1-23), CDR1 (24-34), FR2 (35-49), CDR2 (50-56),
FR3 (57-88), CDR3 (89-97D), and J (97E-107), with lines connecting the segments of each chain.]
FIG. 1. Assortment of FR, CDR regions, and a proposed J segment of rabbit Vκ chains. Antibody
specificities indicated by symbols in the figure: antitype III pneumococcal polysaccharide; antitype
VIII pneumococcal polysaccharide; antistreptococcal group A variant carbohydrate; antistreptococcal
group C carbohydrate; anti-Micrococcus lysodeikticus; anti-p-azophenylarsonate; antitype II
pneumococcal polysaccharide; antidigoxin; anti-p-azobenzoate. *, two base changes; †, three base
changes; ‡, Gln Gln Asp inserted between residues 58 and 59 and Val between 85 and 86; ?, residue
not known; -, deletion; ¶, residues 43 and 44 missing; #, an unidentified residue is present at
position 97F. K9-335 and K9-338 were littermates. 311 and 4422 are unrelated. One was purchased
from a dealer in the U. S. and the other from a dealer in Switzerland.
Results and Discussion
Table I presents the findings indicating independent assortment of FR and CDR
segments and defining a possible J segment that contains two residues of CDR3 as
established earlier in the mouse (19, 20, 24) and as proposed for the rabbit (31).
Among the important findings are the following:
(a) The FR sets that contain more than a single member generally show multiple
antibody specificities. This is true for all FR1 sets, and for the largest FR2 set. One
FR2 set with four members and the two FR3 sets containing the pairs of identical
sequences are limited to a single antibody specificity.
(b) The CDR sets tend to vary. Sets with multiple members in CDR1 have the
same antibody specificity, whereas the sets in CDR2, the shortest CDR, excluding the
pairs with identical V-region sequences, have members with different antibody
specificities. This is perhaps not surprising because the antibody specificities are
restricted to but a few antigens and because many members of these sets are related.
The one CDR3 set of two related members, K16-167 and K27-306 (cousins), which
differed in sequence in the other FR and CDR segments, had the same antibody
specificity; CDR2 of K27-306 was not sequenced.

TABLE I
Frequency Distribution of Members of Multiple Sets in Relation to Amount of Sequence Data

Segment                          FR1    CDR1   FR2       CDR2      FR3       CDR3    J
Amino acid residue numbers       1-23   24-34  35-49     50-56     57-88     89-97D  97E, 97F-107
Number of sequences              51     43     28 (+1*)  26        27 (+1*)  25      26
Number of sets                   24     34     12 (+3*)  13 (+4*)  26        23      18
Sets/sequences, %                47     79     43        50        96        92      62
Sets with multiple members       11     4      3         4         2         3       4
Sets with single members         13     30     9         9         24        20      14
Number of sequences in sets
  with multiple members          38     14     21        17        4         4       12
Number of sequences in sets
  with multiple members/total
  number of sequences, %         75     32     75        65        15        16      46
Sets with single members/total
  number of sets, %              54     85     75        70        88        87      77
Number of sets/number of amino
  acid residues in segment       1.05   3.1    0.8       1.8       0.8       1.8     1.5

The J segment is made up by taking the last two residues of CDR3, called 97E, 97F, plus FR4 by analogy
with the mouse J sequence.
* No sequence data or sequence incomplete.
(c) The most striking evidence for independent assortment is the finding that the
monoclonal antitype II pneumococcal light chains with identical V-region sequence
from unrelated outbred rabbits 311 and 4422, one from Switzerland, the other from
the United States (30), had an FR1 identical with two other antistreptococcal group
A variant antibodies from two other rabbits K16-167 and K27-489. Unlike K16-167
and K27-489, 311 and 4422 had the preserved FR2 set as did the prototype K4820
originating in Denmark; this set contained segments of chains of antibodies of three
specificities different from type II, one chain of no known specificity, as well as two
other chains with type II specificity.
(d) The pair of littermates with an identical sequence K9-335 and K9-338 assorted
in the same set in CDR2 with three related rabbits of the same specificity and with
three unrelated outbred rabbits 3381, BS-1, and K25, these having another
specificity. It is of interest that a second population of antibodies of the same specificity from
one member of the pair, rabbit K9-335I, had different FR1, CDR1, shared the
preserved FR2 set, and yet was in a CDR2 set that contained antibodies of four
specificities.
(e) The J segment contained 18 sets. This is a large number, especially because the
FR4 portion, residues 98-107, was identical in all but one sequence, K16-167, which
had Thr at position 100 instead of Gly. All members of the prototype set had the
same specificity as did the set with K16-167, and, indeed, except for this substitution,
it would have fallen into the prototype J set because it also has Ile-Val at positions
97E and 97F. Of the three sets with two members, one set had a chain 3547 not
known to have antibody activity (34) although it had been immunized with
streptococcal vaccine. The other had antistreptococcal group A variant activity like the
prototype set; another set had one chain with antipneumococcal type III activity and
the other with antistreptococcal group A variant activity; the third set had the two
identical chains from unrelated rabbits with type II antipneumococcal activity.
(f) Of the remaining 13 sets of J segments that differed from each other and from
the prototype at one or both of the two positions 97E, 97F which could contribute to
the generation of diversity, 8 were from light chains of antibodies to type III
pneumococcal polysaccharide, 2 to p-azobenzoate, and 1 each to type VIII
pneumococcal polysaccharide, to group C streptococcal polysaccharide, and to M. lysodeikticus.
The variations in sequence at positions 97E and 97F for the J segments from chains
with type III antipneumococcal antibody give no indications of how they could
influence site specificity, but there is no evidence that the different type III sites are
necessarily the same.
(g) In the mouse, the first three nucleotides in the intervening sequence following
the codon for amino acid 95 are CCC, and there appears to be considerable
preservation of nucleotide sequences at the junction of J and CDR3 (26). Sakano et
al. (19), Max et al. (20), and Weigert et al. (24) have proposed that some diversity in
J could have been generated by intracodon recombination when the J nucleotides are
joined to the rest of the V region. If one attempts such intracodon recombinations
assuming CCC to be preserved at the beginning of the intervening sequence in the
rabbit as in the mouse, the prototype set Ile Val would generate Leu or Pro at position
97E, neither of which has been found. Indeed, with CCC by the proposed intracodon
recombination it would be possible to produce, in addition to Leu and Pro, only His,
Gln, and Arg, and only Arg has been found at position 97E in one J segment. Thus
the amount of diversity that may be introduced by VJ joining in the rabbit is not
very great, and 11 different amino acids have been found at position 97E in the 18 J
segments. The outbred rabbit may prove, therefore, to have more Jκ minigenes than
the inbred mouse² or a more complex mechanism to generate diversity (35) as also
proposed for the variable portion of Ig heavy chain (33). The precise location of the
N-terminal end of J segments cannot be identified from amino acid sequences alone
(24), and nucleotide sequence data will be needed to determine whether the two last
amino acid residues in CDR3 should indeed be included with FR4 to form the J
segments of rabbit kappa light chains, as has been found for mouse light chains.
Indeed much of the diversity generated by intracodon recombination in the mouse
derives from an extra Pro between residues 95 and 96 in a single NZB light chain
PC7132 (24) which was assigned to J, thus permitting intracodon recombination
involving six nucleotides of the intervening sequence following residue 95. It is
possible that such recombination between the other FR and CDR minigene segments
could also contribute somatically to diversity.
(h) An important parameter in relation to any mechanism for the generation of
diversity is the number of CDR and J sets in relation to FR sets. Table I is an attempt
to estimate this. If there are many more CDR sets than FR sets, this would have
substantial implications for the generation of diversity. Although the rabbit amino
acid sequence data are the most extensive, they nevertheless represent a very highly
selected group of sequences because of the limited number of antibody specificities
that have been examined. The data are also skewed in that many more FR1 and
CDR1 sequences have been determined as compared with the rest of the chain. The
tendency to have multiple sets will also be a function of the length of each segment
comprising the set. Table I attempts to obtain an estimate of the relative numbers of
FR and CDR sets.
If the CDR are compared with the FR, it is clear, as seen from the last row in
Table I, that the number of sets is a function of the number of amino acid residues in
the FR or CDR segment, and this must be taken into account. Even without this,
however, it is clear comparing FR1 and FR2 with CDR1 that there are almost twice
as many sets in CDR1 as in FR1 and FR2 and that a much smaller proportion of sets
in CDR1 has multiple members despite the selection for only a few antibody
specificities. If these values are corrected for length of the segment, the difference in
number of sets for CDR1 relative to FR1 and FR2 increases two to three times.
The same results are seen if CDR3 and the proposed J segment are compared with
FR1 and FR2; in each instance, the percentage of sequences in sets with multiple
members is much lower for CDR3 and J. These differences in number of sets would
be further increased if adjusted for length differences with CDR2, which differs
strikingly in length from the other segments and which shows much less variability in
rabbit Vκ chains than is seen in other species (36). The proportion of sequences in sets
with multiple members is less than in FR1 and FR2. The number of sets would also
almost be doubled if adjusted for length differences.
² Rudikoff, S., D. M. Rao, C. P. J. Glaudemans, and M. Potter (personal communication) have suggested
for mouse antigalactan Vκ light chains that an additional gene containing an A as the first nucleotide of
the intervening sequence following the codon for amino acid 95 could also account for the additional
diversity.
(i) The data on FR3, the segment of longest length, do not appear to fit with the
other segments. The 26 sets each have but a single sequence except the 2 sets each
with the completely identical Vκ chains. However, there is some indication that FR3
can be divided into two segments which show some suggestion of assortment or
recombination. Such assortment can be seen if FR3 is separated into two segments
57-68 and 69-88; the data are generally similar for 57-69 and 70-88 or 57-70 and
71-88. Fig. 2 shows the data for 57-68 and 69-88; it is clear that many of the
individual chains in Fig. 1 can be grouped into sets and that these sets show evidence
of assortment. It may be of significance that the three-dimensional structure of
immunoglobulin light chains shows residues 68 and 69 to be at a bend connecting
two β-strands in the mouse Vκ light chain Fv dimer REI (5, 9).
The two sets of 57-68 contain 13 and 5 chains and assort with six sets of 69-88
containing two or three members each. Thus the separation of FR3 into two segments
has substantially reduced the 26 distinct FR3 sets.
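The length correction discussed in point (h) is simple arithmetic, and a short sketch may make it concrete. The counts below are those given in Table I for FR1, CDR1, and FR2; the segment lengths are assumed to be the nominal residue ranges (1-23, 24-34, 35-49) without allowance for inserted positions such as 27A and 27B, and the dictionary layout and function names are illustrative only.

```python
# segment: (number of residues, number of sequences, number of sets)
# Counts are taken from Table I; lengths are the nominal residue ranges (an assumption).
table = {
    "FR1": (23, 51, 24),
    "CDR1": (11, 43, 34),
    "FR2": (15, 28, 12),
}

def sets_per_100_sequences(segment):
    """Percentage of sets relative to the number of sequences examined."""
    _, n_sequences, n_sets = table[segment]
    return 100 * n_sets / n_sequences

def sets_per_residue(segment):
    """Number of sets divided by the segment length in residues."""
    n_residues, _, n_sets = table[segment]
    return n_sets / n_residues

for name in table:
    print(name, round(sets_per_100_sequences(name)), round(sets_per_residue(name), 1))

# Ratio of CDR1 to FR1 before and after correcting for segment length.
raw_ratio = sets_per_100_sequences("CDR1") / sets_per_100_sequences("FR1")
length_corrected_ratio = sets_per_residue("CDR1") / sets_per_residue("FR1")
print(round(raw_ratio, 2), round(length_corrected_ratio, 2))
```

On these numbers CDR1 has somewhat under twice the sets per sequence of FR1 and roughly three times as many sets per residue, in line with the statement that the difference grows once segment length is taken into account.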
Although the clones from 12-d-old mouse embryo DNA indicate that nucleotides
coding for amino acids 1-95 occur as a contiguous segment (14, 17, 18, 26), the
evidence for assortment, by recombination with or insertion into nucleotides coding
for FR segments, of nucleotides coding for CDR segments suggests that this minigene
mechanism in addition to diversity generated by the joining of residues 1-95 to the J
minigene is fundamental to the generation of diversity.
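The intracodon recombination argument of point (g) can be checked mechanically. The sketch below assumes a simple model that is not spelled out in the text: the recombinant codon takes its first k nucleotides (k = 1, 2, or 3) from the CCC that follows codon 95 and the remaining nucleotides from the J codon. It uses the standard genetic code; the function and variable names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def intracodon_products(flank, j_codon):
    """Amino acids encoded when the first k bases (k = 1..3) come from `flank`
    and the remaining bases come from `j_codon` (assumed recombination model)."""
    return {CODON_TABLE[flank[:k] + j_codon[k:]] for k in (1, 2, 3)}

# Prototype J set: Ile at position 97E, so start from every Ile codon.
ile_codons = [c for c, aa in CODON_TABLE.items() if aa == "I"]
from_ile = set().union(*(intracodon_products("CCC", c) for c in ile_codons))

# Any possible J codon at that position.
from_any = set().union(*(intracodon_products("CCC", c) for c in CODON_TABLE))

print(sorted(from_ile))   # one-letter codes: L = Leu, P = Pro
print(sorted(from_any))   # H = His, L = Leu, P = Pro, Q = Gln, R = Arg
```

Under this model an Ile codon yields only Leu or Pro, and no J codon can yield anything beyond Leu, Pro, His, Gln, and Arg, because every recombinant codon begins with C.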
[Fig. 2: diagram connecting the sets of FR3 segment 57-68 with the sets of FR3 segment 69-88.]
FIG. 2. Assortment of FR3 into two segments.
The recent report of Schilling et al. (33) of a segment generating diversity in CDR3
N-terminal to the J segment of anti-α1→3 dextran hybridomas and bearing
individual idiotypic determinants provides additional evidence consistent with the
minigene hypothesis.
It is important to distinguish between the generation of diversity and the generation
of complementarity differences responsible for antibody specificity, e.g., for
noncovalent binding of an antigenic determinant in the antibody-combining site. Although
positions 96 and 97 of CDR3 of VL are coded for by the J minigenes (19, 20) and
although position 96 shows the highest variability (1, 2, 25), position 96 has been
identified only once as a contacting residue in the high resolution x-ray
crystallographic structures thus far reported (5-11) and in initial model building studies of
antibody-combining sites without (37, 38) or with (39) nuclear magnetic resonance
data in which the amino acid residues of the CDR sequences known to bind ligands
were introduced on to a framework established by x-ray crystallographic studies.
Padlan et al. (40) established residue 96 as a contacting residue for phosphorylcholine
in the mouse myeloma protein McPC603. In mouse Vκ chains position 97 is essentially
invariant (25). Rudikoff et al.² also consider that in the β1→6 galactans, the
alternative amino acids at position 96 do not contribute to complementarity. Thus
the intracodon recombination proposed (19, 20, 24) as a mechanism for V-J joining
and for reducing the number of J minigenes might create sequence diversity without
contributing significantly to those complementarity differences which would make
for various antibody specificities. Indeed, even if antibody site complementarity were
to be generated by position 96 as in McPC603 when an actual J sequence was
assembled somatically by joining to the rest of the V region, it would not necessarily
follow that the other amino acids if created at position 96 by intracodon recombination
would necessarily function as CDR residues in CDR3 although conceivably they
might influence other residues in CDR3 conformationally so that they become
contacting. Indeed, although the discovery of the J minigene and of somatic assembly
at the DNA level between the 12th d of embryonic life and the adult (12, 15, 17, 18,
26) is a seminal development in molecular genetics, it has tended to focus attention
on position 96 and away from the other CDR residues that are crucial to the
generation of antibody diversity, of idiotypic specificity, and of their interrelations.
Summary
Amino acid sequences of rabbit light chains "show considerable evidence of independent assortment of framework (FR) and complementarity-determining (CDR)
segments. This suggests that they are coded for by independent genetic units (minigenes) and that individual light chains are assembled somatically by recombining
these units. Identical FR sets with multiple members generally comprise chains with
different specificities, whereas identical CDR sets tend to have chains of a single
specificity. A J segment, which, by analogy with mouse light chains, is made up of the
last two residues of CDR3 plus all of FR4, contained 18 different sets and could
contribute to diversity generated by CDR3.
The longest segment, FR3, had a very large number of sets. Evidence is presented
showing that the number of sets could be substantially reduced by permitting FR3 to
be formed by two independently assorting segments comprising residues 57-68 and
69-88.
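The counting argument for FR3 can be sketched as follows. This is a toy illustration only: the four chains and their sequences are invented placeholders, not data from this paper, and the helper name `identical_sets` is hypothetical. Chains are grouped into sets by exact identity over a residue range, first over the whole of FR3 (residues 57-88) and then over the two proposed sub-segments (57-68 and 69-88).

```python
from collections import defaultdict

FR3_FIRST = 57  # residue number of the first character in each FR3 string


def identical_sets(fr3_by_chain, first_res, last_res):
    """Group chain names by identical sequence over residues first_res..last_res."""
    sets = defaultdict(list)
    for name, seq in fr3_by_chain.items():
        segment = seq[first_res - FR3_FIRST : last_res - FR3_FIRST + 1]
        sets[segment].append(name)
    return dict(sets)


# Hypothetical placeholder sequences: two variants of residues 57-68
# (12 residues) and two variants of residues 69-88 (20 residues).
a, b = "A" * 12, "G" * 12
x, y = "S" * 20, "T" * 20
toy_chains = {
    "chain1": a + x,
    "chain2": a + y,
    "chain3": b + x,
    "chain4": b + y,
}

whole = identical_sets(toy_chains, 57, 88)
first_part = identical_sets(toy_chains, 57, 68)
second_part = identical_sets(toy_chains, 69, 88)
print(len(whole), len(first_part), len(second_part))  # 4 2 2
```

In this toy case every chain has a distinct FR3 when the segment is treated as a single unit, but only two sets are needed for each sub-segment if the two parts assort independently. The real reduction depends on the actual sequences tabulated in the paper.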
Received for publication 22 February 1980.
References
1. Wu, T. T., and E. A. Kabat. 1970. An analysis of the sequences of the variable regions of
Bence Jones proteins and myeloma light chains and their implications for antibody
complementarity. J. Exp. Med. 132:211.
2. Kabat, E. A., T. T. Wu, and H. Bilofsky. 1976. Variable Regions of Immunoglobulin
Chains, Tabulations and Analyses of Amino Acid Sequences. Bolt Beranek and Newman
Inc., Cambridge, Mass.
3. Kabat, E. A. 1978. The structural basis of antibody complementarity. Adv. Protein Chem. 32:
1.
4. Kabat, E. A., and T. T. Wu. 1971. Attempts to locate complementarity-determining
residues in the variable positions of light and heavy chains of immunoglobulins. Ann. N. Y.
Acad. Sci. 190:382.
5. Davies, D. R., E. A. Padlan, and D. Segal. 1975. Immunoglobulin structures at high
resolution. Contemp. Top. Mol. Immunol. 4:127.
6. Padlan, E. A. 1977. Structural basis for the specificity of antigen-antibody reactions and
structural mechanisms for the diversification of antigen-binding specificities. Q. Rev. Biophys.
10:35.
7. Saul, F. A., L. M. Amzel, and R. J. Poljak. 1978. Preliminary refinement and structural
analysis of the Fab' fragment from human immunoglobulin New at 2.0 Å resolution. J.
Biol. Chem. 253:585.
8. Fehlhammer, H., M. Schiffer, O. Epp, P. M. Colman, E. E. Lattman, and W. Steigemann.
1975. The structural determination of the variable portion of the Bence-Jones protein Au.
Biophys. Struct. Mech. 1:139.
9. Epp, O., E. E. Lattman, M. Schiffer, R. Huber, and W. Palm. 1975. The molecular
structure of a dimer composed of the variable portions of the Bence-Jones protein REI
refined at 2.0 Å resolution. Biochemistry. 14:4943.
10. Edmundson, A. B., K. R. Ely, R. L. Girling, E. E. Abola, M. Schiffer, F. A. Westholm, M.
D. Fausch, and H. F. Deutsch. 1974. Binding of 2,4-dinitrophenyl compounds and other
small molecules to a crystalline λ-type Bence Jones dimer. Biochemistry. 13:3816.
11. Wang, B. C., C. S. Yoo, and M. Sax. 1979. Crystal structure of Bence Jones protein Rhe
(3 Å) and its unique domain-domain association. J. Mol. Biol. 129:657.
12. Kabat, E. A., T. T. Wu, and H. Bilofsky. 1978. Variable region genes for the immunoglobulin framework are assembled from small segments of DNA-a hypothesis. Proc. Natl. Acad.
Sci. U. S. A. 75:2429.
13. Kabat, E. A., T. T. Wu, and H. Bilofsky. 1979. Evidence supporting somatic assembly of
the DNA segments (minigenes), coding for the framework, and complementarity-determining segments of immunoglobulin variable regions. J. Exp. Med. 149:1299.
14. Tonegawa, S., A. M. Maxam, R. Tizard, O. Bernard, and W. Gilbert. 1978. Sequence of
a mouse germ-line gene for a variable region of an immunoglobulin light chain. Proc. Natl.
Acad. Sci. U. S. A. 75:1485.
15. Brack, C., M. Hirama, R. Lenhard-Schuller, and S. Tonegawa. 1978. A complete immunoglobulin gene is created by somatic recombination. Cell. 15:1.
16. Lenhard-Schuller, R., B. Hohn, C. Brack, M. Hirama, and S. Tonegawa. 1978. DNA
clones containing immunoglobulin κ chain genes isolated by in vitro packaging into phage
λ coats. Proc. Natl. Acad. Sci. U. S. A. 75:4709.
17. Seidman, J. G., and P. Leder. 1978. The arrangement and rearrangement of antibody
genes. Nature (Lond.). 276:790.
18. Seidman, J. G., E. E. Max, and P. Leder. 1979. A κ-immunoglobulin gene is formed by
site-specific recombination without further somatic mutation. Nature (Lond.). 280:370.
19. Sakano, H., K. Hüppi, G. Heinrich, and S. Tonegawa. 1979. Sequences at the somatic
recombination sites of immunoglobulin light-chain gene. Nature (Lond.). 280:288.
20. Max, E. E., J. G. Seidman, and P. Leder. 1979. Sequences of five potential recombination
sites encoded close to an immunoglobulin κ constant region gene. Proc. Natl. Acad. Sci. U. S.
A. 76:3450.
21. Frangione, B., and E. C. Franklin. 1979. Split immunoglobulin genes and human heavy
chain deletion mutants. J. Immunol. 122:1177.
22. Tucker, P. W., K. B. Marcu, N. Newell, J. Richards, and F. R. Blattner. 1979. Sequence
of the cloned gene for the constant region of murine γ2b immunoglobulin heavy chain.
Science (Wash. D. C.). 206:1303.
23. Honjo, T., M. Obata, Y. Yamawaki-Kataoka, T. Kataoka, T. Kawakami, N. Takahashi,
and Y. Mano. 1979. Cloning and complete nucleotide sequence of mouse immunoglobulin
γ1 chain gene. Cell. 18:559.
24. Weigert, M., L. Gatmaitan, E. Loh, J. Schilling, and L. Hood. 1978. Rearrangement of
genetic information may produce immunoglobulin diversity. Nature (Lond.). 276:785.
25. Kabat, E. A., T. T. Wu, and H. Bilofsky. 1979. Sequences of Immunoglobulin Chains.
National Institutes of Health Publication 80-2008.
26. Bernard, O., N. Hozumi, and S. Tonegawa. 1978. Sequences of mouse immunoglobulin
light chain genes before and after somatic changes. Cell. 15:1133.
27. Wu, T. T., E. A. Kabat, and H. Bilofsky. 1979. Some sequence similarities among cloned
mouse DNA segments that code for λ and κ light chains of immunoglobulins. Proc. Natl.
Acad. Sci. U. S. A. 76:4617.
28. Braun, D. G., and H. Huser. 1977. Rabbit antipolysaccharide antibodies: structure and
genetics. Progr. Immunol. 3:255.
29. Braun, D. G., H. Huser, and W. F. Riesen. 1976. Rabbit antibody light chains: selective
breeding narrows variability in framework and complementarity determining residues. Eur.
J. Immunol. 6:570.
30. Brandt, D. Ch., and J.-C. Jaton. 1978. Identical VL region sequences of two antibodies
from two outbred rabbits exhibiting complete idiotypic cross-reactivity and probably the
same antigen binding site fine structure. J. Immunol. 121:1194.
31. Braun, D. G., H. Huser, R. Knecht, and J. K. Wright. 1979. Amino acid sequences of
antibody light chain variable regions of pedigreed rabbits: κ light chain K49-501 (allotype
b4) anti-streptococcal group A-variant polysaccharide antibody. Hoppe-Seyler's Z. Physiol.
Chem. 360:663.
32. Kabat, E. A. 1979. Implications of the assortment of framework segments for the assembly
of immunoglobulin VL and VH regions and the generation of diversity. In Cells of
Immunoglobulin Synthesis. B. Pernis, and H. J. Vogel, editors. Academic Press, Inc., New
York. 33.
33. Schilling, J., B. Clevinger, J. M. Davie, and L. Hood. 1980. Amino acid sequence of
homogeneous antibodies to dextran and DNA rearrangements in heavy chain V-region
gene segments. Nature (Lond.). 283:35.
34. Thunberg, A. L., and T. J. Kindt. 1976. Amino acid sequence of rabbit light chains:
variable region of a light chain from a homogeneous immunoglobulin raised by streptococcal immunization. Biochemistry. 15:1381.
35. Mage, R. G. The phenotypic expression of rabbit immunoglobulins: a model of complex
regulated gene expression and cellular differentiation. Contemp. Top. Mol. Immunol. In press.
36. Haber, E., M. N. Margolies, L. E. Cannon, and M. S. Rosemblatt. 1975. Restricted clonal
responses: a tool in understanding antibody specificity. Miami Winter Symp. 9:393.
37. Davies, D. R., and E. A. Padlan. 1976. Correlations between antigen binding specificity
and the three-dimensional structure of the antibody combining site. In Antibodies in
Human Diagnosis and Therapy. E. Haber and R. M. Krause, editors. Raven Press, New
York. 119.
38. Padlan, E. A., D. R. Davies, I. Pecht, D. Givol, and C. Wright. 1976. Model building
studies of antigen-binding sites: the hapten-binding site of MOPC 315. Cold Spring Harbor
Symp. Quant. Biol. 41:627.
39. Dwek, R. A., S. Wain-Hobson, S. Dower, P. Gettins, B. Sutton, and S. J. Perkins. 1977.
Structure of an antibody combining site by magnetic resonance. Nature (Lond.). 266:31.
40. Padlan, E. A., D. R. Davies, S. Rudikoff, and M. Potter. 1976. Structural basis for the
specificity of phosphorylcholine-binding immunoglobulins. Immunochemistry. 13:945.