Download Cell Progenitors B Charge That Are First Established in Early

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Polyclonal B cell response wikipedia , lookup

Molecular mimicry wikipedia , lookup

Immunomics wikipedia , lookup

Transcript
This information is current as
of June 12, 2017.
Development of the Expressed Ig CDR-H3
Repertoire Is Marked by Focusing of
Constraints in Length, Amino Acid Use, and
Charge That Are First Established in Early B
Cell Progenitors
Ivaylo I. Ivanov, Robert L. Schelonka, Yingxin Zhuang, G.
Larry Gartland, Michael Zemlin and Harry W. Schroeder, Jr.
Supplementary
Material
References
Subscription
Permissions
Email Alerts
http://www.jimmunol.org/content/suppl/2005/06/06/174.12.7773.DC1
This article cites 40 articles, 14 of which you can access for free at:
http://www.jimmunol.org/content/174/12/7773.full#ref-list-1
Information about subscribing to The Journal of Immunology is online at:
http://jimmunol.org/subscription
Submit copyright permission requests at:
http://www.aai.org/About/Publications/JI/copyright.html
Receive free email-alerts when new articles cite this article. Sign up at:
http://jimmunol.org/alerts
The Journal of Immunology is published twice each month by
The American Association of Immunologists, Inc.,
1451 Rockville Pike, Suite 650, Rockville, MD 20852
Copyright © 2005 by The American Association of
Immunologists All rights reserved.
Print ISSN: 0022-1767 Online ISSN: 1550-6606.
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
J Immunol 2005; 174:7773-7780; ;
doi: 10.4049/jimmunol.174.12.7773
http://www.jimmunol.org/content/174/12/7773
The Journal of Immunology
Development of the Expressed Ig CDR-H3 Repertoire Is
Marked by Focusing of Constraints in Length, Amino Acid
Use, and Charge That Are First Established in Early B
Cell Progenitors1
Ivaylo I. Ivanov,2* Robert L. Schelonka,2† Yingxin Zhuang,‡ G. Larry Gartland,*
Michael Zemlin,§ and Harry W. Schroeder, Jr.3*‡
I
n jawed vertebrates, the adaptive immune system is characterized by the exponential diversity of its Ag receptors (1–5).
In contrast to the receptors of the innate immune system that
bind relatively invariant pathogen-associated epitopes (6), diverse
Ag receptor repertoires allow recognition of novel or divergent
epitopes on pathogens or toxins.
The diversity of Ig, the BCR, is primarily the property of the V
domains of the H and L chains (1–5). Diversity is asymmetrically
distributed within each V domain (7, 8). In the primary sequence,
three intervals of hypervariability, termed CDRs, are separated
from each other by four relatively conserved intervals, termed
framework regions (FRs).4 In the native form of the Ab, the FRs
create a scaffold that supports the H and L chain CDRs. These
CDRs are juxtaposed to form the Ag binding site. CDR-H1, -H2,
-L1, and -L2 create the outside border; CDR-L3 forms the base;
and CDR-H3 lies at the center of this Ag binding site. CDR-H1,
-H2, -L1, and -L2 are entirely encoded by the V gene segment, and
Departments of *Microbiology, †Pediatrics, and ‡Medicine, University of Alabama at
Birmingham, Birmingham, AL 35294; and §Department of Pediatrics, Philipps Universität Marburg, Marburg, Germany
Received for publication December 22, 2004. Accepted for publication March
22, 2005.
The costs of publication of this article were defrayed in part by the payment of page
charges. This article must therefore be hereby marked advertisement in accordance
with 18 U.S.C. Section 1734 solely to indicate this fact.
1
This work was supported by National Institutes of Health Grants AI42732 (to
H.W.S.), AI48115 (to H.W.S.), and HD043327 (to R.L.S.), and P.E. Kempkes-Stiftung (to M.Z.).
2
I.I.I. and R.L.S. contributed equally to the preparation of this manuscript.
3
Address correspondence and reprint requests to Dr. Harry W. Schroeder, Jr., WTI
378, 1530 3rd Avenue South, Birmingham, AL 35294-3300. E-mail address:
[email protected]
4
Abbreviation used in this paper: FR, framework region.
Copyright © 2005 by The American Association of Immunologists, Inc.
are thus initially restricted to germline sequence, whereas CDR-L3
and -H3 are created de novo by VL3 JL and VH3 DH3 JH joining, respectively. The inclusion of a D gene segment and the addition of nongermline-encoded nucleotides (N regions) vastly enhance the potential for both combinatorial and somatic diversity of
CDR-H3. Enhanced diversity and a central position within the Ag
binding site allow CDR-H3 to often play a critical role in the
recognition of Ag (7, 8).
The composition of the functional CDR-H3 repertoire is biased
in length, amino acid composition, predicted loop and base structure, and charge (9). The distribution of lengths of both murine and
human CDR-H3 forms normal Gaussian curves with differing
means, suggesting that each species achieves its own preferred
CDR-H3 length (9, 10). The average hydrophobicity of the amino
acids within the CDR-H3 loop also forms a Gaussian distribution
centering on neutrality to mild hydrophilicity (11). This neutral,
hydrophilic preference reflects enrichment for tyrosine and glycine
residues in the CDR-H3 loop in excess of that which would be
predicted by random chance alone (9, 11, 12).
Construction of CDR-H3 begins early in B cell progenitors. The
various defined stages of B cell development can be viewed, in
part, as transitions through a series of checkpoints that test the
assembly and function of the V domains (13–15). A number of
studies have established that the CDR-H3 plays a crucial role in
these selection processes (16 –18). In humans, repertoire selection
during B cell development is associated with a reduction in the
distribution and mean length of the expressed CDR-H3 repertoire
(19), and loss of highly charged or hydrophobic sequences (20 –
22). It has been proposed that the loss of longer sequences as well
as those that are enriched for charged amino acids reflects a higher
likelihood of self-reactivity in the Igs that bear them (22).
To gain insight into the mechanisms used to regulate the Ab
repertoire, to determine when during development constraints on
0022-1767/05/$02.00
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
To gain insight into the mechanisms that regulate the development of the H chain CDR3 (CDR-H3), we used the scheme of Hardy
to sort mouse bone marrow B lineage cells into progenitor, immature, and mature B cell fractions, and then performed sequence
analysis on VH7183-containing C␮ transcripts. The essential architecture of the CDR-H3 repertoire observed in the mature B cell
fraction F was already established in the early pre-B cell fraction C. These architectural features include VH gene segment use
preference, DH family usage, JH rank order, predicted structures of the CDR-H3 base and loop, and the amino acid composition
and average hydrophobicity of the CDR-H3 loop. With development, the repertoire was focused by eliminating outliers to what
appears to be a preferred repertoire in terms of length, amino acid composition, and average hydrophobicity. Unlike humans, the
average length of CDR-H3 increased during development. The majority of this increase came from enhanced preservation of JH
sequence. This was associated with an increase in the prevalence of tyrosine. With an accompanying increase in glycine, a shift
in hydrophobicity was observed in the CDR-H3 loop from near neutral in fraction C (ⴚ0.08 ⴞ 0.03) to mild hydrophilic in fraction
F (ⴚ0.17 ⴞ 0.02). Fundamental constraints on the sequence and structure of CDR-H3 are thus established before surface IgM
expression. The Journal of Immunology, 2005, 174: 7773–7780.
7774
Materials and Methods
Mice
CD43 (FITC) (BD Harlingen), anti-IgD (PE) (Southern Biotechnology
Associates), and anti-IgM (Cy-5).
Sorting, RNA preparation, RT-PCR, and sequencing
From each bone marrow fraction, 2 ⫻ 104 cells were sorted directly into
RLT lysing buffer (Qiagen RNeasy minikit). Total RNA was prepared per
the manufacturer’s protocol. RNA was eluted in 30 ␮l of water. Subsequently, 10 ␮l of the RNA was used to generate avian myeloblastosis virus
reverse transcriptase (Roche Molecular Biochemicals) catalyzed firststrand cDNA with the primer C␮1 (5⬘-GACAGGGGGCTCTCG-3⬘). The
cDNA was diluted 4-fold and used as a template to amplify VH7183DJC␮
joins by PCR (Qiagen Taq Core PCR kit). PCR conditions were as follows:
hot start at 95°C for 3 min; 35 cycles of 1 min at 94°C, 1 min at 60°C, and
1 min at 72°C; and a final extension at 72°C for 10 min. The primers were
a FR1 VH7183-specific upstream primer, AF303 (5⬘-GGGGCTCGAG
GAGTCTGGGGGA-3⬘) (23), and the C␮ exon 1 primer C␮2 (5⬘-CAG
GATCCGAGGGGGAAGACATTTGG-3⬘). PCR products were subcloned
(TOPO-TA Cloning kit; Invitrogen Life Technologies), primed with the
AF303 primer, and sequenced at the University of Alabama at Birmingham
Heflin Center Sequencing Core. Sequences have been submitted to GenBank under the identifiers AY895210-AY895829.
Sequence analysis
CDR-H3 was identified as the region between (but not including) the 3⬘
VH-encoded conserved cysteine (TGT) at Kabat position 92 (IMGT 104)
and the 5⬘ JH-encoded conserved tryptophan (TGG) at Kabat position 103
(IMGT 118) (7, 28). CDR-H3 was separated into two components, the base
(Kabat aa 93 and 94 (IMGT 105 and 106), typically alanine and arginine;
and Kabat aa 100 –102 (IMGT 115–117), typically phenylalanine or methionine, aspartic acid or alanine, and tyrosine or valine) and the loop (the
intervening amino acids) (Fig. 1).
We sorted bone marrow B cell fractions from four separate mice on three
separate occasions. The mice analyzed represent the progeny of mice on a
mixed 129/C57BL6 background that had been backcrossed for 10 generations onto BALB/cJ (The Jackson Laboratory; stock 000651). All studies
were performed in accordance with Institutional Animal Care and Use
Committee regulations.
Statistical analysis
Flow cytometry and cell sorting
Results
Single cell suspensions were prepared by flushing the bone marrow from
two femurs with FACS buffer (1⫻ PBS ⫹ 2% heat-inactivated FCS). RBC
were lysed in 1 ml of erythrocyte lysing buffer (0.15 M NH4Cl, 1 mM
KHCO3, 0.1 mM EDTA; Comprehensive Cancer Center Media Prep; catalogue ACK) for 5 min at room temperature. Cells were washed and resuspended in an appropriate volume of FACS buffer for counting and staining. The total bone marrow from individual mice was incubated in 500 ␮l
of fluorescently labeled Abs in FACS buffer. Sorting was then performed
on a MoFlo instrument (DakoCytomation).
The following sets of mAbs were used: for Hardy fractions B and C,
anti-CD19 (streptavidin SpectralRed) (Southern Biotechnology Associates), anti-CD43 (FITC) (BD Harlingen), anti-BP-1 (PE) (gift from J. Kearney, University of Alabama at Birmingham, Birmingham, AL), and antiIgM (Cy5) (Jackson ImmunoResearch Laboratories); for fraction D-F, antiCD19 (streptavidin SpectralRed) (Southern Biotechnology Associates) anti-
Differences between populations were assessed, where appropriate, by Student’s t test, two tailed; Fisher’s exact test, two tailed; ␹2; or the Levene test
for the homogeneity of variance. Analysis was performed with JMP IN
version 5.1 (SAS Institute). Means are accompanied by the SEM.
Cells within the Hardy bone marrow fractions B-F were sorted
using the gates shown in Fig. 2. A total of 707 transcripts was
sequenced, of which 649 (92%) were unique. Of these, 619 (95%)
contained in-frame, open rearrangements. By fraction, there were
66 sequences from B (pro-B), 192 sequences from C (early pre-B),
131 sequences from D (late pre-B), 121 sequences from E (immature B), and 109 sequences from F (mature B).
Preferential use of VH7183.10 is established early in B cell
development
In accordance with previous studies by other investigators (23, 25,
26, 29), the prevalence of VH7183.1 (VH81X) declined with development (B3 F; p ⬍ 0.001). VH81X represented 38% of the
FIGURE 1. Analysis of CDR-H3. In this hypothetical sequence, the location of CDR-H3, the CDR-H3 loop, and frameworks 3 and 4 is shown. Each
VDJC␮ sequence was deconstructed and evaluated for VH, DH, and JH usage; P junctions and N addition; the length of CDR-H3 in codons; and the
distribution of individual amino acids and average Kyte-Doolittle hydrophobicity (31, 32) in the CDR-H3 loop. Kabat and IMGT (7, 28) number
designations for the TGT codon, which marks the terminus of framework 3, and the TGG, which marks the beginning of framework 4, are identified. Amino
acids at the extreme (arginine and isoleucine) have been included to demonstrate the range of the hydrophobicity index. A single palindromic nucleotide
(T) flanks VH sequence, and DFL16.1 sequence is accompanied by 3 nt of N addition on each flank. The normalized average hydrophobicity of the CDR-H3
loop is ⫺0.24.
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
CDR-H3 composition are imposed, and to establish the extent to
which murine development resembles that of humans, we sought
to establish the pattern of CDR-H3 repertoire development in mice
bearing an IgMa H chain repertoire. We used the scheme of Hardy
(14) to sort bone marrow B lineage cells into progenitor, immature,
and mature B cell fractions. We then cloned, sequenced, and deconstructed the CDR-H3 component of VH7183DJC␮ transcripts.
We chose to look at RNA message, as this is most representative
of the expressed, and thus functional, Ig repertoire. We focused on
the VH7183 family because its germline complement in IgHa alleles has been well defined (23); it represents a manageable 10% of
the active repertoire (24); patterns of VH7183 use during ontogeny
and development have been well established (23, 25, 26); and it
contributes to both self and nonself reactivities (reviewed in
Ref. 27).
We show in this study that the essential architecture of the
CDR-H3 repertoire, including patterns of gene segment use, amino
acid composition, charge, predicted base and loop structure, and
average length, is established very early in B cell development,
well before the expression of surface IgM. Development appears to
focus the repertoire by eliminating outliers to what appears to be a
preferred repertoire in terms of length, amino acid composition,
and average hydrophobicity.
CDR-H3 REPERTOIRE DEVELOPMENT IN BONE MARROW
The Journal of Immunology
7775
Patterns of DH use remain relatively unchanged with
development
Use of the various DH families did not undergo a significant
change with development (Fig. 3B). Using a minimum of 5 nt of
identity to assign germline DH origin, we identified members of the
DSP and DFL families in ⬃50 and ⬃30% of the transcripts, respectively. DQ52 was used in 4 –10% of the transcripts, and DST4
contributed to ⬍2% of the sequences. Due to exonucleolytic nibbling and N addition, we were unable to identify a DH progenitor
in the remaining transcripts. The DFL16.1 gene segment was the
single most commonly used DH gene segment at all stages of development, representing ⬃20% of sequences in all of the fractions.
Increased prevalence of reading frame 2 in fraction B sequences
fraction B sequences, 23% of the fraction C sequences (B3 C; p ⫽
0.03), 7% of the fraction D sequences (C3 D; p ⬍ 0.001), 10% of
the fraction E sequences (D3 E; p ⫽ 0.52), and 2% of the fraction
F sequences (E3 F; p ⫽ 0.02) (Fig. 3A).
VH7183.10 was the most commonly used VH7183 gene segment
in fractions C, D, E, and F. VH7183.10 increased from 5% in
fraction B to 21% in fraction C ( p ⬍ 0.01), and then remained
relatively unchanged in fractions D, E, and F (31, 20, and 22%;
p ⫽ 1.0). Changes in the prevalence of VH gene segments other
than VH7183.10 were also observed, but none of these individual
changes achieved statistical significance.
FIGURE 3. Use of VH, DH, and
JH gene segments and DH reading
frame as a function of B cell development through Hardy fractions B
through F (Fig. 1). A, Distribution of
VH7183 family gene segment usage.
The VH segments are arranged in
germline order, with the most DH
proximal sequences to the right. VH,
DH, and JH use is reported as the percentage of the sequenced population
of unique, in-frame, open transcripts
from each B lineage fraction. B, Distribution of DH families. C, DH reading frame use in CDR-H3 intervals
containing DFL or DSP family gene
segments. Use is reported as the percentage of the sequenced population
of DFL- and DSP-containing transcripts from each B lineage fraction.
D, Distribution of JH gene segments.
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
FIGURE 2. Representative gates used to identify and sort Hardy fractions B through F from bone marrow.
A shift in reading frame prevalence was observed during B cell
development (Fig. 3C). We identified 53 fraction B, 152 fraction
C, 88 fraction D, 92 fraction E, and 88 fraction F sequences that
contained identifiable DFL or DSP gene segments. Reading frame
1 was the predominant reading frame at all stages of B cell development. However, use of reading frame 1 increased from 57% in
fraction B to 70% in fraction C, 68% in fraction D, 78% in fraction
E, and 78% in fraction F. Use of reading frame 3 decreased from
17% in fraction B to 12% in fraction C, 18% in fraction D, 11% in
fraction E, and 9% in fraction F. Use of reading frame 2 began at
26% in fraction B, and then decreased to 18% in fraction C, 14%
in fraction D, 11% in fraction E, and 13% in fraction F. The change
in distribution of reading frames between fractions B and F was
significant at p ⫽ 0.02.
Reading frame 3 typically encodes one or more termination
codons. Functional sequences containing RF3 were significantly
shorter (11.7 ⫾ 0.3 codons) than those using RF1 (12.7 ⫾ 0.1; p ⫽
0.004) and RF2 (12.5 ⫾ 0.3; p ⫽ 0.05). No significant differences
were observed in the average length of RF1- and RF2-containing
sequences ( p ⫽ 0.60).
7776
Increased use of JH1 in the transition to the immature B cell
stage
A shift in rank order of JH use was observed in the fraction B3 C
transition (Fig. 3D). In fraction B, JH2 (31%) and JH3 (31%) were
the most frequently used JH, followed by JH4 (27%) and JH1 (4%).
In fractions C through F, JH4 was the most commonly used sequence (35– 40%), followed by relatively equivalent use of JH3
(26 –27%) and JH2 (21–28%), and then JH1 (7–16%). The rise in
the use of JH1 from fractions B (4%), through C (7%), to D (16%)
reached statistical significance ( p ⬍ 0.03; ␹2). Use of JH1 then
remained relatively stable (14 and 12% in fractions E and F,
respectively).
CDR-H3 REPERTOIRE DEVELOPMENT IN BONE MARROW
containing sequences were observed. By fraction F, the differences
in DFL16.1- and DSP-containing sequences no longer achieved
statistical significance (13.6 ⫾ 0.6 vs 12.9 ⫾ 0.2, respectively; p ⫽
0.37). However, DFL16.1 sequences retained a length advantage
over those with DQ52 (11.0 ⫾ 0.5; p ⫽ 0.01).
The increase in length from fraction B to fraction F reflected, in
part, a reduction in the prevalence of sequences whose CDR-H3
length was ⬍9 aa (Fig. 5A). Due to the larger number of sequences, this was best observed in a comparison between fractions
C and F. Of the 192 sequences in fraction C, 24 were 8 aa or less,
whereas only 3 of 109 sequences were 8 aa or less in fraction F
( p ⬍ 0.01). This also led to a significant narrowing in the variance
of the distribution of lengths ( p ⫽ 0.01; Levene).
An increase in average CDR-H3 length with development
The increase in CDR-H3 length reflected increased preservation
of terminal JH sequence
Sequences containing identifiable DH gene segments were deconstructed to assess the contribution of VH, DH, and JH sequence, and
of N addition and P junctions to the change in CDR-H3 length
(Fig. 6). From fractions B to F, the average length increased by 3.4
nt ( p ⫽ 0.007), or 1.1 codons. Minor increases in the contribution
of the VH sequence (⫹0.2 nt) and N addition at the 5⬘ and 3⬘
junctions (⫹0.4 nt each), which reflected one-third of the increase,
were observed. However, none of these rather subtle increases in
length achieved statistical significance. In contrast, the contribution of JH germline sequence increased by 2.6 nt, or two-thirds of
the total increase. On average, JH sequence contributed 10.7 ⫾ 0.6
nt in fraction B and 13.3 ⫾ 0.4 nt in fraction F ( p ⬍ 0.001;
Student’s t test).
The increase in average JH component length reflected both the
increased use of JH1 and JH4 and enhanced preservation of 5⬘
terminal nucleotides among sequences that used JH2 or JH3. The
four JH sequences differ in length, with JH2 and JH3 contributing
up to 14 nt each to CDR-H3, JH1 19 nt, and JH4 20 nt (Fig. 6). We
examined the complete database of 619 unique sequences for the
contribution of JH. The average length of the 22 sequences that
contain JH1 and JH4 in fraction B was 12.8 ⫾ 0.6 vs 12.9 ⫾ 0.4 in
51 sequences from fraction F ( p ⫽ 0.82). In these CDR-H3 intervals, JH1 and JH4 contributed 14.2 ⫾ 0.9 nt in fraction B vs 15.4 ⫾
0.6 nt in fraction F ( p ⫽ 0.25; Student’s t test). The average length
of the 44 sequences containing JH2 and JH3 in fractions B was
10.7 ⫾ 0.4 vs 12.1 ⫾ 0.3 in 58 sequences from fraction F ( p ⫽
FIGURE 4. Change in average CDR-H3 length and hydrophobicity in VH7183DJC␮ transcripts isolated from Hardy fractions B through F (14) (Fig.
1). Error bars depict the SE of each mean. A, Change in the average length of CDR-H3 intervals containing DFL16.1, DSP family gene segments, DQ52,
or no identifiable D compared with the change in the population as a whole. Sequences containing DFL16.2 or DST4 are not included due to paucity of
numbers. B, Change in the average hydrophobicity index value of the CDR-H3 loops.
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
The average length of CDR-H3 increased during development
from an average of 11.4 ⫾ 0.3 in fraction B to 12.5 ⫾ 0.2 in
fraction F ( p ⫽ 0.01) (Fig. 4A). Mouse DH sequences differ in
length. To assess the contribution of the identity and length of the
DH on the length of CDR-H3, we compared the average lengths of
sequences that contained DFL16.1 with those with DSP gene family members, DQ52, or no identifiable D gene segment, respectively (Fig. 4A). DFL16.1 contains 23 nt, DFL16.2 and the DSP
gene segments are two codons shorter with 17 nt, DST4 contains
16 nt, and DQ52 is 4 codons shorter with only 11 nt. For sequences
containing DFL16.1, the average length increased from 12.6 ⫾ 0.6
in fraction B to 13.6 ⫾ 0.6 in fraction F ( p ⫽ 0.29). For sequences
containing DSP family members, the length increased from 11.7 ⫾
0.5 in fraction B to 12.9 ⫾ 0.2 in fraction F ( p ⫽ 0.01; Student’s
t test). For sequences containing DQ52, the length increased from
9.0 ⫾ 0.7 to 11.0 ⫾ 0.5 ( p ⫽ 0.05).
DFL16.1-containing sequences remained statistically similar in
length distribution throughout development (Fig. 4A). With development, DSP-containing sequences converged in length to the
DFL16.1 standard, whereas DQ52-containing sequences retained a
length disadvantage. In fraction B, sequences containing DFL16.1
were ⬃0.9 codons longer than those that contained DSP gene segments ( p ⫽ 0.31) and 3.5 codons longer than those that contained
DQ52 ( p ⫽ 0.002). In fraction C, the length distribution of
DFL16.1- and DSP-containing sequences significantly diverged
(13.1 ⫾ 0.4 vs 11.8 ⫾ 0.3, respectively, p ⫽ 0.001). In fractions
D and E, DSP-containing sequences increased in length to 12.5 ⫾
0.3 codons, while no significant changes in the length of DFL16.1-
The Journal of Immunology
7777
0.001). In these CDR-H3 intervals, JH2 and JH3 contributed 9.2 ⫾
0.6 nt in fraction B vs 10.9 ⫾ 0.3 nt in fraction F ( p ⫽ 0.0003).
The convergence in length distribution between DFL16.1- and
DSP-containing sequences reflected a balance between an increase
in the contribution of JH and loss of 5⬘ terminal DFL16.1 sequence.
At the point of greatest divergence in fraction C, the 44 sequences
that contain DFL16.1 lost an average of 3.5 ⫾ 0.4 5⬘ terminal
nucleotides vs 4.7 ⫾ 0.3 for the 94 sequences that contain DSP
FIGURE 6. Deconstruction of the
components contributing to CDR-H3
length in sequences containing identifiable DH gene segments (DFL,
DSP, DST, and DQ52). The potential
contribution of the germline sequence of the VH gene segment, P
junctions, N region addition, the DH
gene segment, and the JH gene segment to CDR-H3 length is illustrated.
The length of DFL16.2 is identical
with that of DSP family members.
The actual contribution of these components to 57 fraction B, 164 fraction
C, 101 fraction D, 98 fraction E, and
96 fraction F sequences is shown. All
components are shown to scale. The
major contributor to the increase in
average CDR-H3 length with development is observed to be an increase
in the contribution of JH sequence
(p ⫽ 0.0007).
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
FIGURE 5. Distribution of CDR-H3
lengths and hydrophobicity in
VH7183DJC␮ transcripts isolated
from Hardy fractions B through F (14)
(Fig. 1). Prevalence is reported as the
percentage of the sequenced population of unique, in-frame, open transcripts from each B lineage fraction.
To facilitate visualization of the
change in variance of the distribution,
the lines mark the preferred range of
lengths or average hydrophobicity observed in fraction F. To facilitate visualization of the change in the mean of
the lengths or average hydrophobicity,
arrows have been placed to mark the
position of the average length or average hydrophobicity in fraction F. A,
Distribution of CDR-H3 lengths as a
function of B cell development. B,
Distribution of the average hydrophobicity of individual CDR-H3 loops as
assessed by reference to a normalized
Kyte-Doolittle scale (31, 32).
7778
CDR-H3 REPERTOIRE DEVELOPMENT IN BONE MARROW
FIGURE 7. Distribution of amino acids in the CDR-H3
loops of the VH7183DJC␮ transcripts isolated from Hardy
fractions B through F (14) (Fig. 1). The amino acids are
arranged by relative hydrophobicity, as assessed by a normalized Kyte-Doolittle scale (31, 32). Use is reported as the
percentage of the sequenced population from each B lineage
fraction.
Increased prevalence of tyrosine and glycine in fraction F
The general bias for tyrosine and glycine in the CDR-H3 loop was
first apparent in fraction B and intensified during B cell development (Fig. 7). Of the 423 predicted aa in the CDR-H3 loops from
fraction B, 135 (32%) were either tyrosine or glycine, whereas of
the 814 predicted aa in the loops from fraction F, 324 (40%) were
tyrosine or glycine ( p ⫽ 0.025). Overall, the amino acid composition differed significantly between fraction B and fraction F ( p ⫽
0.01, ␹2, 19 degrees of freedom). Fraction B loops, for example,
contained more hydrophobic amino acids (11% valine, isoleucine,
or leucine) than fraction F (9%). However, these and other changes
in the prevalence of individual amino acids did not achieve statistical significance.
Shifts in the prevalence of tyrosine and glycine, in length, and in
N addition have been associated with changes in the distribution of
the predicted structures of the CDR-H3 loop and base (9, 30).
These types of changes have been observed as a function of ontogeny as well as of species origin. However, in the adult mouse
sequences analyzed in this work, the stability in the relative prevalence of amino acid sequence, length, and N addition was accompanied by stability in the distribution of predicted base and loop
structures (data not shown). No significant changes in predicted
structure were observed from fractions B to F.
A shift in average hydrophobicity from near neutrality to
hydrophilicity
To determine whether there was a global change in the distribution
of hydrophobicity with development, we used a normalized KyteDoolittle scale (31, 32) to calculate the relative average hydrophobicity of the CDR-H3 loops (Fig. 4B). We observed a shift to
neutrality from fraction B (⫺0.15 ⫾ 0.04) to fraction C (⫺0.08 ⫾
0.03) that was followed by shift toward hydrophilicity in fractions
D, E, and F (⫺0.15 ⫾ 0.03, ⫺0.14 ⫾ 0.03, and ⫺0.17 ⫾ 0.02,
respectively). The shift in average hydrophobicity from fraction C
to D is significant at p ⫽ 0.05, and the shift from fraction C to F
is significant at p ⫽ 0.02.
As in the case of length, the variance in average hydrophobicity
decreased with development. This shift in variance was significant
at p ⫽ 0.03 between fractions B and F, and at p ⬍ 0.01 between
fractions C and F. The change in variance is due, in part, to the loss
of sequences at the extremes (Fig. 5B). In fraction B, there was one
sequence whose average hydrophobicity score was greater than
0.6, there were 13 in fraction C, 3 in fraction D, 4 in fraction E, and
none in fraction F. Similarly, there were 3 sequences in fraction B
with an average hydrophobicity score of less than ⫺0.6, there were
6 in fraction C, 6 in fraction D, two in fraction E, and none in
fraction F. The difference in the prevalence of sequences at the
extreme in fraction C (19 of 211) vs fraction F (0 of 110) is significant at p ⬍ 0.01. A comparison of highly charged sequences
between surface IgM⫺ pre/pro-B cells (fractions B, C, and D) and
surface IgM⫹ B cells (fractions E and F) is also significant at
p ⬍ 0.05.
Discussion
We have shown in this study that the major patterns of VH, DH,
and JH use in the expressed repertoire are already established in
progenitor B cells, and thus before the expression of membranebound IgM. The relative prevalence of the various DH families in
fraction B remained relatively unchanged from fraction B through
fraction F. The rank order of JH prevalence that was first established in fraction C was maintained through fraction F. Consistent
with previous reports that focused on rearrangement preference
(23, 25, 26, 29), we found a high frequency of transcripts using the
VH81X (VH7183.1) gene segment in fraction B. The prevalence of
VH81X then steadily diminished in the progression from fraction C
to fraction F. A preference for VH7183.10 was established in fraction C, and its prevalence remained remarkably stable from fraction C to fraction F. Thus, the dominant pattern of VH, DH, and JH
use remained essentially unchanged from fraction C to F.
Fraction C includes cells that are still at the intermediate DJ
rearrangement stage, early pre-B cells that have rearranged their
VDJ locus, and pre-B cells that express the pre-BCR (15, 33).
Active translation of mRNA is associated with increased transcript
abundance due to stabilization of the mRNA by polysomes, a process that is enhanced by B cell activation (34 –36). If the successful
assembly of the pre-BCR activates early pre-B cells, it is possible
that assembly may similarly enhance ␮ mRNA abundance and
thereby stabilize VH and JH preference. Testing of this hypothesis
will require detailed sequence analysis of transcripts and mRNA
message abundance from fraction C cells that have been separated
on the basis of pre-BCR expression.
Our work confirms and extends a previous observation regarding the enhanced use of DH reading frame 2 at the earliest stages
of B cell development (25). The primary mechanism postulated to
limit use of reading frame 2 in the expressed repertoire is of the
ability of D␮ protein to create a pre-BCR complex (37, 38), which
can then activate the allelic exclusion signal transduction pathway
to prevent further V3 DJ rearrangement. We speculate that all of
the components of this pathway may not be fully active in fraction
B, allowing reading frame 2 DJ rearrangements to undergo V3 DJ
recombination. Rearrangement at stages lacking an intact pre-BCR
signal transduction complex may represent a mechanism by which
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
( p ⫽ 0.04), giving DFL16.1 an average net gain of 1.2 germline nt
relative to DSP. By fraction F, 5⬘ terminal loss among the 20
sequences that contain DFL16.1 had increased to an average of
5.5 ⫾ 0.6 nt vs 4.1 ⫾ 0.4 for the 57 sequences that contain DSP
( p ⫽ 0.05), yielding an average net loss of 1.4 nt relative to DSP.
This is a net flip of 2.6 nt, or almost 1 codon.
The Journal of Immunology
containing charged amino acids have been reported to be sequentially purged from the repertoire (22). As with length and amino
acid use, we observed an adjustment and focusing of the charge
distribution of CDR-H3 during B cell development. This reflected
not only a decrease in the average charge, but also a significant
reduction in the contribution of highly charged or highly hydrophobic sequences.
Together, these data show that the essential architecture of the
CDR-H3 repertoire, including gene segment use, length, structure,
amino acid composition, and average hydrophobicity, is established early in B cell development before the surface expression of
membrane-bound IgM. With development, many of these features
are fine-tuned and focused into an apparent optimal range of
lengths, charge, and amino acid composition.
These observations raise a series of open questions. DH reading
frame 1 sequences encode the same amino acids that dominate
CDR-H3 throughout bone marrow development. Does germline
conservation of DH sequence serve as the deciding factor in dictating the composition of CDR-H3? If germline DH sequence plays
a critical role in regulating CDR-H3, what role do individual DH
sequences play in the development of the repertoire and the ability
to mount humoral immune responses? What are the specific selective pressures and the mechanisms that serve to fine-tune the
composition of CDR-H3 as the B cell population matures? And
finally, what might be the functional consequences of violating
these apparent constraints on CDR-H3 composition? The answer
to these questions will require targeted manipulation of the germline DH locus, which may shed new light on the role of the DH
gene segment in navigating B cell developmental checkpoints and
optimizing immune function.
Acknowledgments
We thank P. Burrows, T. Carvalho, G. Ippolito, and J. Link for their invaluable advice and support.
Disclosures
The authors have no financial conflict of interest.
References
1. Tonegawa, S. 1983. Somatic generation of antibody diversity. Nature 302: 575–581.
2. Alt, F. W., and D. Baltimore. 1982. Joining of immunoglobulin heavy chain gene
segments: implications from a chromosome with evidence of three D-J heavy
fusions. Proc. Natl. Acad. Sci. USA 79: 4118 – 4122.
3. Rajewsky, K. 1996. Clonal selection and learning in the antibody system. Nature
381: 751–758.
4. Hood, L., and D. Galas. 2003. The digital code of DNA. Nature 421: 444 – 448.
5. Nossal, G. J. V. 2003. The double helix and immunology. Nature 421: 440 – 444.
6. Janeway, C. A., Jr., and R. Medzhitov. 2000. Innate immune recognition. Annu.
Rev. Immunol. 20: 197–216.
7. Kabat, E. A., T. T. Wu, H. M. Perry, K. S. Gottesman, and C. Foeller. Sequences
of Proteins of Immunological Interest. U.S. Department of Health and Human
Services, Bethesda, pp. 1–2387.
8. Padlan, E. A. 1994. Anatomy of the antibody molecule. Mol. Immunol. 31:
169 –217.
9. Zemlin, M., M. Klinger, J. Link, C. Zemlin, K. Bauer, J. A. Engler,
H. W. Schroeder, Jr., and P. M. Kirkham. 2003. Expressed murine and human
CDR-H3 intervals of equal length exhibit distinct repertoires that differ in their
amino acid composition and predicted range of structures. J. Mol. Biol. 334:
733–749.
10. Wu, T. T., G. Johnson, and E. A. Kabat. 1993. Length distribution of CDRH3 in
antibodies. Proteins Struct. Funct. Genet. 16: 1–7.
11. Ivanov, I. I., J. M. Link, G. C. Ippolito, and H. W. Schroeder, Jr. 2002. Constraints on hydropathicity and sequence composition of HCDR3 are conserved
across evolution. In The Antibodies, Vol. 7. M. Zanetti and J. D. Capra, eds.
Taylor and Francis Group, London, pp. 43– 67.
12. Golub, R., J. S. Fellah, and J. Charlemagne. 1997. Structure and diversity of the
heavy chain VDJ junctions in the developing Mexican axolotl. Immunogenetics
46: 402– 409.
13. Meffre, E., R. Casellas, and M. C. Nussenzweig. 2000. Antibody regulation of B
cell development. Nat. Immunol. 1: 379 –385.
14. Hardy, R. R., and K. Hayakawa. 2001. B cell development pathways. Annu. Rev.
Immunol. 19: 595– 621.
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
suboptimal H chain V domains may enter the immature B cell
repertoire.
Unlike humans, in which the average length of CDR-H3 decreases with development (37, 38), we observed an increase in the
average length of CDR-H3 with development in mice. On average,
however, mouse sequences are significantly shorter than human
ones (9). Thus, even in fraction F, the mice lacked the longer
CDR-H3 sequences that have been associated with enhanced selfreactivity in humans.
As with humans (19), the changes in sequence that contribute to
the increase in length can be subtle. Although the only significant
changes affecting overall CDR-H3 length were the enhanced retention of 5⬘ JH terminal sequence and the increased loss of 5⬘
DFL16.1 sequence, given the number of sequences analyzed it is
possible that our analysis was underpowered to confirm the statistical significance of other more delicate differences. Adjustments
to JH and DH length with development and with ontogeny are also
observed in humans, macaques, and chimpanzees (19, 30). They
appear to represent common mechanisms used to respond to potentially common selective forces.
The increase in average length of CDR-H3 with development
occurred in association with decreased representation of outlier
lengths. Although our analysis cannot distinguish whether this
phenomenon results from the loss of outliers or positive selection
for sequences with lengths closer to the final average length of
CDR-H3 in the mature B cell fraction F, the net effect is to decrease the range of diversity, thereby focusing the mature B cell
repertoire into what appears to be a preferred range.
As with length, the broad outline of amino acid preference was
also established early in B lineage development. Preference for
tyrosine and glycine in the CDR-H3 loop was already established
in fraction B. With development, the representation of these two
amino acids was enhanced. The adjustments to CDR-H3 length,
the loss of 5⬘ terminal DFL16.1 and the gain of 5⬘ terminal JH
sequence appear to play a role in this process. The sequences of the
CDR-H3 loop portion of the four JH gene segments encode YWY,
Y, W, and YYA, respectively. Thus, tyrosine represents 50% of
the JH amino acids that can contribute to the CDR-H3 loop. Enhanced preservation of 5⬘ JH sequence will enrich for tyrosine at
the C terminus of the loop. In reading frame 1, the 5⬘ terminus of
DFL16.1 encodes tyrosine. The increased loss of 5⬘ nucleotides
with development has the effect of decreasing representation of
tyrosine at the amino terminus of the DFL16.1-containing loops. In
humans, there is an asymmetric distribution of tyrosine in the
CDR-H3 loop, with the C terminus enriched for tyrosine (9). The
tyrosine gradient increases with increasing loop length. A similar
tyrosine gradient is not obvious in mice. However, DFL16.1 gene
segment-containing sequences are among the longest CDR-H3
loops. The combination of loss of 5⬘ DFL16.1-encoded tyrosine
and gain of 3⬘ JH-encoded tyrosine may act to prevent a relative
enrichment for tyrosine at the amino terminus of the loop. An
excess of tyrosine at the beginning of the CDR-H3 loop may be as
undesirable in mice as in humans.
The average length of sequences that contain DQ52 converges
toward the average for sequences in which the contribution of DH
cannot be ascertained. Both of these types of short sequences are
less likely to contain tyrosine in the CDR-H3 loop, although glycine is common (data not shown). The role of short, glycine-enriched, tyrosine-depleted CDR-H3 structures is unknown, but they
appear clearly distinct from CDR-H3 structures that contain extensive sequence derived from the DFL and DSP gene segments,
which are longer and enriched for tyrosine as well as glycine.
The presence of excess charged amino acids has been correlated
with self-reactivity (22, 39 – 41), especially to dsDNA. Sequences
7779
7780
27. Kirkham, P. M., and H. W. Schroeder, Jr. 1994. Antibody structure and the
evolution of immunoglobulin V gene segments. Semin. Immunol. 6: 347–360.
28. Lefranc, M.-P. 2003. IMGT, the international ImMunoGeneTics database. Nucleic Acids Res. 31: 307–310.
29. Yancopoulos, G. D., S. V. Desiderio, M. Paskind, J. F. Kearney, D. Baltimore,
and F. W. Alt. 1984. Preferential utilization of the most JH-proximal VH gene
segments in pre-B cell lines. Nature 311: 727–733.
30. Link, J. M., J. E. Larson, and H. W. Schroeder, Jr. 2005. Despite extensive
similarity in germline DH and JH sequence, the adult Rhesus macaque CDR-H3
repertoire differs from human. Mol. Immunol. 42: 943–55.
31. Kyte, J., and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105–132.
32. Eisenberg, D. 1984. Three-dimensional structure of membrane and surface proteins. Annu. Rev. Biochem. 53: 595– 623.
33. Martensson, I. L., A. Rolink, F. Melchers, C. Mundt, S. Licence, and T. Shimizu.
2002. The pre-B cell receptor and its role in proliferation and Ig heavy chain
allelic exclusion. Semin. Immunol. 14: 335–342.
34. Kelley, D. E., and R. P. Perry. 1986. Transcriptional and posttranscriptional control of immunoglobulin mRNA production during B lymphocyte development.
Nucleic Acids Res. 14: 5431–5447.
35. Jack, H. M., J. Berg, and M. Wabl. 1989. Translation affects immunoglobulin
mRNA stability. Eur. J. Immunol. 19: 843– 847.
36. Maquat, L. E. 2004. Nonsense-mediated mRNA decay: splicing, translation and
mRNP dynamics. Nat. Rev. Mol. Cell Biol. 5: 89 –99.
37. Gu, H., D. Kitamura, and K. Rajewsky. 1991. B cell development regulated by
gene rearrangement: arrest of maturation by membrane-bound D␮ protein and
selection of DH element reading frames. Cell 65: 47–54.
38. Horne, M. C., P. E. Roth, and A. L. DeFranco. 1996. Assembly of the truncated
immunoglobulin heavy chain D␮ into antigen receptor-like complexes in pre-B
cells but not in B cells. Immunity 4: 145–158.
39. Shlomchik, M. J., M. A. Mascelli, H. Shan, M. Z. Radic, D. S. Pisetsky,
A. Marshak-Rothstein, and M. G. Weigert. 1990. Anti-DNA antibodies from
autoimmune mice arise by clonal expansion and somatic mutation. J. Exp. Med.
171: 265–292.
40. Barbas, S. M., H. J. Ditzel, E. M. Salonen, W. P. Yang, G. J. Silverman, and
D. R. Burton. 1995. Human autoantibody recognition of DNA. Proc. Natl. Acad.
Sci. USA 92: 2529 –2533.
41. Zemlin, M., R. L. Schelonka, K. Bauer, and H. W. Schroeder, Jr. 2002. Regulation and chance in the ontogeny of B and T cell antigen receptor repertoires.
Immunol. Res. 26: 265–278.
Downloaded from http://www.jimmunol.org/ by guest on June 12, 2017
15. Burrows, P. D., R. P. Stephan, Y. H. Wang, K. Lassoued, Z. Zhang, and
M. D. Cooper. 2002. The transient expression of pre-B cell receptors governs B
cell development. Semin. Immunol. 14: 343–349.
16. Keyna, U., G. B. Beck-Engeser, J. Jongstra, S. E. Applequist, and H. M. Jack.
1995. Surrogate light chain-dependent selection of Ig heavy chain V regions.
J. Immunol. 155: 5536 –5542.
17. Kline, G. H., L. Hartwell, G. B. Beck-Engeser, U. Keyna, S. Zaharevitz,
N. R. Klinman, and H.-M. Jack. 1998. Pre-B cell receptor-mediated selection of
pre-B cells synthesizing functional ␮ heavy chains. J. Immunol. 161: 1608 –1618.
18. Martin, D. A., H. Bradl, T. J. Collins, E. Roth, H. M. Jack, and G. E. Wu. 2003.
Selection of Ig ␮ heavy chains by complementarity-determining region 3 length
and amino acid composition. J. Immunol. 171: 4663– 4671.
19. Shiokawa, S., F. Mortari, J. O. Lima, C. Nunez, F. E. I. Bertrand, P. M. Kirkham,
S. Zhu, A. P. Dasanayake, and H. W. Schroeder, Jr. 1999. IgM HCDR3 diversity
is constrained by genetic and somatic mechanisms until two months after birth.
J. Immunol. 162: 6060 – 6070.
20. Raaphorst, F. M., C. S. Raman, J. Tami, M. Fischbach, and I. Sanz. 1997. Human
Ig heavy chain CDR3 regions in adult bone marrow pre-B cells display an adult
phenotype of diversity: evidence for structural selection of DH amino acid sequences. Int. Immunol. 9: 1503–1515.
21. Schroeder, H. W., Jr., G. C. Ippolito, and S. Shiokawa. 1998. Regulation of the
antibody repertoire through control of HCDR3 diversity. Vaccine 16: 1383–1390.
22. Wardemann, H., S. Yurasov, A. Schaefer, J. W. Young, E. Meffre, and
M. C. Nussenzweig. 2003. Predominant autoantibody production by early human
B cell precursors. Science 301: 1374 –1377.
23. Williams, G. S., A. Martinez, A. Montalbano, A. Tang, A. Mauhar,
K. M. Ogwaro, D. Merz, C. Chevillard, R. Riblet, and A. J. Feeney. 2001. Unequal VH gene rearrangement frequency within the large VH7183 gene family is
not due to recombination signal sequence variation, and mapping of the genes
shows a bias of rearrangement based on chromosomal location. J. Immunol. 167:
257–263.
24. Viale, A. C., A. Coutinho, and A. A. Freitas. 1992. Differential expression of VH
gene families in peripheral B cell repertoires of newborn or adult immunoglobulin H chain congenic mice. J. Exp. Med. 175: 1449 –1456.
25. Huetz, F., L. Carlsson, U.-C. Tornberg, and D. Holmberg. 1993. V-region directed selection in differentiating B lymphocytes. EMBO J. 12: 1819 –1826.
26. Marshall, A. J., G. E. Wu, and G. J. Paige. 1996. Frequency of VH81x usage
during B cell development: initial decline in usage is independent of Ig heavy
chain cell surface expression. J. Immunol. 156: 2077–2084.
CDR-H3 REPERTOIRE DEVELOPMENT IN BONE MARROW