* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Genomic Structure of the Human IgX1 Gene Suggests That It May
Zinc finger nuclease wikipedia , lookup
Copy-number variation wikipedia , lookup
Transposable element wikipedia , lookup
Real-time polymerase chain reaction wikipedia , lookup
Genomic imprinting wikipedia , lookup
Genetic engineering wikipedia , lookup
Two-hybrid screening wikipedia , lookup
Secreted frizzled-related protein 1 wikipedia , lookup
Expression vector wikipedia , lookup
Gene desert wikipedia , lookup
Transcriptional regulation wikipedia , lookup
Gene therapy wikipedia , lookup
Gene nomenclature wikipedia , lookup
Promoter (genetics) wikipedia , lookup
Gene therapy of the human retina wikipedia , lookup
Community fingerprinting wikipedia , lookup
Point mutation wikipedia , lookup
Gene expression wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Gene expression profiling wikipedia , lookup
Gene regulatory network wikipedia , lookup
Endogenous retrovirus wikipedia , lookup
Genomic Structure of the Human IgX1 Gene Suggests That It May Be Expressed as an IgX14.1-like Protein or as a Canonical B Cell IgX Light Chain : Implications for IgX Gene Evolution By Robert J. Evans and Gregory F. Hollis From the Department of Biological Sciences, Monsanto Company, Saint Louis, Missouri 63198 Summary In pre-B cells, immunoglobulin ju (Igh) is associated with pre-B cell-specific proteins to form a multimeric complex that is found on the cell surface. One of these proteins is encoded by the three exon IgX-like gene 14 .1, whose expression is restricted to pre-B cells and occurs from an unrearranged gene . A comparison of the 14 .1 gene structure to the seven-gene human IgX locus revealed that the most 5' gene, IgX1, is organized in a three-exon structure very similar to the 14 .1 gene. Transcription and splicing of these three-exon sequences would lead to an mRNA with an open reading frame which could encode a light (L) chain-like protein with a molecular weight of 23,045 . Our analysis suggests that two transcripts may be produced from the IgX1 gene that share the same IgX1 constant region-containing third exon . One transcript would include all three 14 .1-related exons and be expressed from the germline gene, and the second transcript would be produced after variablejoining (VJ) recombination has occurred to IgXJ1 and would encode a classic IgX L chain protein. The conservation of the genomic organization of the human 14 .1 and IgX1 genes and the mouse homolog, X5, relative to the classic IgX L chain genes provides insight into the evolution of Ig genes. T he development of B cells from stem cells to mature Ig-secreting B cells can be classified by the rearrangement and expression of Ig H and L chain genes . In this progression, Igiz H chain undergoes VDJ rearrangement, is expressed first, and in part defines the pre-B cell stage of development . Subsequently, the Ig L chain genes, K and X, undergo rearrangement and expression . Expression of both Igp, H and L chain proteins leads to the formation of the Ig tetramer /A2L2 (L : tc or X L chain) and marks the transition of pre-B cells to the B cell stage of development (1-6) . During our studies of the human IgX locus, we identified and isolated two clones, 14 .1 and 16 .1, based on their shared homology with the IgX C region (7, 8). More recently, we have shown that at least one of these clones, 14 .1, is expressed as a 1-kb transcript exclusively in pre-B cells (9) . This transcript contains both IgX J and C region homologies as well as 5' sequence that is not derived from an Ig V region. Sequence analysis of a cDNA clone, Hom-1, derived from the 14 .1 gene revealed a long open reading frame capable of encoding a protein ofan unprocessed mol wt of 22,944 . Polyclonal antisera, generated to a peptide predicted by the nucleotide sequence, identified a 22-kD protein as the product of the 14 .1 gene. Immunoprecipitation experiments using antiIgIA antisera demonstrated that at least two proteins, of 16 and 22 kD, are associated with IgIA in human pre-B cells. 305 Taken together, the immunoprecipitation and Western analysis suggested that the protein product encoded by the 14 .1 gene is complexed to Igi. in pre-B cells (9) . Recently, these results have been confirmed and extended by Kerr et al. (10) who have shown that cell surface Igli in pre-B cells is covalently linked to two protein chains, 16 and 22 kD in size, that crossreact with antisera directed against human IgX protein . Similar results have been described for mouse pre-B cells where a surrogate L chain, termed co, has been shown to be complexed with Igu to form a u2co2 tetrater (11). In separate experiments, a cDNA encoding a pre-B cell-specific transcript, M, was isolated that contains homology to Ig J and C regions (12-14). The mouse X5 gene contains the putative coding sequence for the mouse co protein, and based on sequence similarity, is likely to be the mouse homologue of the human 14 .1 gene (9, 15) . Analysis of the 14 .1 gene indicates that it is expressed without gene rearrangement and is encoded in three exons, in which exons 2 and 3 contain Ig J and C region homology, respectively. Characterization of the genomic structure of the 14 .1 gene led to the discovery that the human Iga1 gene, the most 5' gene of the seven gene X locus, is organized in a similar three-exon structure. This germline IgX1 gene contains appropriate RNA processing signals that could lead to J . Exp. Med . ® The Rockefeller University Press - 0022-1007/91/02/0305/07 $2 .00 Volume 173 February 1991 305-311 the production of a transcript with one long open reading frame with an in frame AUG start site . This analysis suggests that the IgX1 gene may be expressed in two ways . One transcript would be expressed from the germline gene and include all three 14 .1-related exons and the second transcript would be produced from a rearranged IgX1 gene after V-j recombination and would encode a classic IgX L chain protein. Materials and Methods DNA and RNA Analysis. Plasmid DNAs (1 Fcg/lane) were digested with the indicated restriction enzyme according to manufacturers recommendations, fractionated on a 0.8% agarose gel, and transferred to nitrocellulose paper (16) . RNA extraction and blots were performed as described (9, 17). DNA and RNA blots were hybridized to the following "P-labeled human DNA probes : (a) 847-bp fragment including all of the 14 .1 cDNA clone Hom-1 (9); (b) 300-bp SstI-BamHI fragment from clone 14 .1, which contains homology to human IgXC (probe A) ; (c) a 319-bp fragment containing 14 .1 exon 1 sequences generated by PCR amplification using the oligonucleotides 5'-GGCCACATGGACTGGGGTGC-3' and 5'-CCACCGGCTCCTCAGGCTGG-3' (probe B) (as recommended by Perkin-Elmer Corp., Norwalk, CT); (d) 200-bp Smal fragment from clone 16 .1, which shares >95% sequence similarity with exon 2 of 14.1; (e) a 60-bp fragment from nucleotide position 78-137 of GA1 exon 2 (probe D) . DNA and RNA blots were hybridized overnight at 42°C in a 10% dextran sulfate, 4 x SSC, 40% formamide, 0.8% Denhardt's Tris buffered solution. After hybridization, filters were washed with 2x SSC, 0.1% SDS three times at room temperature, and with 0.1x SSC, 0.1% SDS two times for 20 min at 55 °C before autoradiography. Genomic Cloning. A human placental DNA bacteriophage library (106 bacteriophage) or an MboI partial digest of DNA from a human pre-B cell line HPB Null inserted into Stratagene's A DashII vector (5 x 105 bacteriophage) were screened with probes from 14 .1 exon 1 or exon 3 as described previously (8, 17) . Positive bacteriophage were plaque purified, and mini-lysate DNAs were prepared. Insert fragments from these clones were subcloned into Bluescript (Stratagene) for restriction mapping and DNA sequence analysis . Restriction digests were performed with indicated enzymes under conditions recommended by the manufacturer (Bethesda Research Laboratories, Gaithersburg, MD). Nucleotide Sequence Analysis. The DNA sequence analysis was determined by the dideoxy chain termination method using the USB Sequenase DNA sequencing kit. Synthetic oligonucleotides (Research Genetics) were used as sequencing primers. Nucleic acid alignments and translations were done using the University of Wisconsin (Madison, WI) Sequence analysis software package (18) . Results Genomic Structure of the IgA-like Gene 14 .1 . Our original identification and isolation of the pre-B cell-specific 14.1 gene included a partial genomic characterization (7). With the isolation of the 14 .1 cDNA, it became apparent that upstream sequences, not found on our original clone, were part of the transcription unit . A human placental DNA library was screened with a Hom-1 cDNA probe and one positive clone f--i 2000 by Meg J1 C1 KeOz E E E E KeOz W W _J` C2 KeOz W J5 C5 J6 C6 J7 0 S Exon 2 Exon 3 r 1000 by I Exon 1 C7 I I I I S H J1 C1 E Exon 2 Exon 3 14 .1 B H Figure 1. Restriction map of human IgX locus, GX1, and 14 .1 genes. A partial restriction map of the seven-gene IgX C region locus on human chromosome 22 is depicted. J and C region sequences are depicted by solid boxes and numbered. An expanded restriction map of the 14 kb EcoRl fragment containing IgX C region 1 is shown. Three coding exons of GX1 gene are depicted by solid boxes and are numbered . E-EcoRl, Bg=Bg1II, H=HindIII, S-Stul (note: not all Stul sites are shown) . A partial restriction map of the 14 .1 gene is shown. The coding exons of the 14 .1 gene are depicted by solid boxes and are numbered . E=EcoP I, B=BamHI, H=HindIII, and X-Xbal . 306 Genomic Structure and Evolution of the Human IgX1 Gene was isolated. A partial restriction map of this clone is shown in Fig. 1 . Experiments using the Hom-1 cDNA as a probe identified three areas of hybridization that define the three coding exons of the 14 .1 gene . Exon 1 is positioned N5 kb 5' of exon 2, which is in turn located -1.2 kb 5' of exon 3 . Sequence analysis of the three areas of homology to the Hom-1 cDNA proved that the clone contained the coding sequence for 14 .1 . These results are consistent with our prior partial characterization of the 14 .1 gene as well as the recently described genomic characterization of 14 .1 by Schiff et al. (15). This analysis identified two dominant bands at 7 .6 and 3 .0 kb in the BamHI digest and a 1 .3- and a 1.1-kb band in the SstI digest (data not shown) . Restriction mapping indicates that exon 1 of 14.1 is present on a 7 .6-kb BamH1 fragment and a 1 .3-kb Sstl fragment (Fig. 1) . To clone the exon 1 sequence contained on the 3 .0-kb BamHI fragment, an Mbol partial genomic DNA library was made from the human pre-B cell line HPB Null . Genomic Southern blot analysis showed that this gene was identical in structure in both pre-B cell and nonhematopoietic DNA (data not shown) . The primary screen of the library was done with the 14 .1 exon 1 probe and positive clones were counter screened with an exon 3 probe. Three clones hybridized with both probes and were plaque-purified . One of these clones contained a 7.6-kb BamHI Genomic Southern Analysis with the 14.1 Exon 1 Probe and Isolation of a 14.1-related Gene. Hybridization experiments using an exon 1 probe from 14 .1 on a BamHl and Sstl genomic Southern blot were performed using the 14 .1 exon 1 probe. xon I -240 -220 -200 -180 -160 GGATTGTGACTGCTTCAGGGCAGTTGGTAGATGCCCCTCTGGGAGAGATCCCCAGGGGTGACAGCCATGGACCCTGGAAGGGCCTGGGCTAGGOACAGGG lilt 1 III111111111 11111 IIII11111 I 1111111 III 1111111111 It 11111111 lilt III 11111 GGATCCTGACCACAGCAGGGCAGTTGGCAGATGTGCCTCTGGGACAGAACCCCAGGGGTAACGGTCATGGACTCTOGNIGGGGCCTGGCTOGGGGCAGGG till 11 -140 -120 -100 -80 GAI 14 .1 -60 ACCAGAGCCAGTCCAGGGAGAGGACAGAGCCAATGGACTG=TGTACTGTAACAGC .CCTGCTGGCGAGAGGGACCAGGGCACCGTCCTCCAGGGAGCC IIIIIII II I I 1111111111111 I II VIII lilt IIIII111111111111111 III IIIIIIII ACCAGAGGATGAGGATGG . . . . . . . .GGCCACATGGACTGGGGTGCAATGGGACAGCTGCTGCCAGCGAGAGOGACCAGGGCACCACTCTCTAWGAGCC -40 -20 1 20 40 N R P K T G Q V G C S T P E 6 L G P G CATGCTGCAAGTCGGGCCAGACMGCCCCTGAACCTGAAGGCCAATOAGACCCAAGACWGCCIAWCOGTTGTCAOACCCCTOAGGACC7QGGCCCTGG II IIIIIIIII VIII I I 11111 IIII VIII II 111111111 1111 111111 11 IIIII CACACTGCAAGTCAGGCCACAAGGACCTCTGACCCTGAGGGCCGATOAGGCCAGGGACAGGCCAGGGGCCCCTTGAGGCCCCTGGTGAGCCAGGCCCCAA It lilt 1 111 11 M R P G T G Q G G L 6 A P 0 S P G P N 60 SO 100 120 GXI 14 .1 140 P Q R W P A P P "0 'R L L* L L G "L A M V A X G' L L R *P M V 0 S' G 0 P TCCCAGGCAGCGCTGGCCCCTGCTGCTGCTGGCTCTGGCCATGGTCGCCCATGGCCTGCTGCGCCCAATGGTTGCACCGCAAAMGGGGACCCAGACCCT I IIIIIIIIIII111111111111IIIIlilt IIII11 IIII IIIIIIIIIIIIII1111111 I CCTCAGGCAGCGCTGGCCCCTGCTGCT'GCTGGGTCTGGCCOTOGTAACCCATGGCCTQCTGCGCCCARCAGCTGCATCGCAGAOCAGGCCCCTGGOCCCT 1111 till III 111 11 1 1111 L R 0 R W P L L L L G L A V V T N C L L R P T A A E Q S 11 A L G P 160 S Exon 2 G A 160 3 200 220 V S ! ! L W' G R G* It ' ll L R GGAGCCTCAGTTGGAAOCAOCCGATCCAGCCTGCCGAOCCTCTOGCGCAGgtaa9999cu9a9atttxa 111111 1 1 IIlilt IIIII IIIIIIIII 111111 III IIIIlilt 111111 I I I ------GGAGCCCCTGGAGGAAGCAGCCGGTCCAGCCTGIIGGAGCCGGTGGGGCAGgtu9ggqtCtgqgtqgcag G A P G G S S R S 3 L R S R W G R 30 30 50 70 Intron - 5 KD-------- GLI 14 .1 90 -L L L Q P 5 P' 0 R. A 'D P R C W P x* G P W ' E 9 P Q ! L C* tgcecgtcatgcceaqcpGCTCCTGCTCCAGCCCAGCCCCCAGAGAGCIGACC .TGCTGGCCCCOGG.G=T .CT .CTCAG7CACT .T. IIII I III IIIIIIIII II11lilt 1111 I II II II IIIIIIIIIIIIII11111111111 II III I I III 11 eqcccctcacteccaga9GTTCCTGCTCCA000CGGCTCCTGGACT . . .GGCCCCAGGTGCTCCCCCCCCGGGTTTCAATCCAAGCATAACTCAGTGAC V F L L Q R G E W T G P R C W P R G F Q ! K X N 5 T 110 130 Y V F 0 T G T K V T' V L G TTATGTCTTCGGAACTGGGACCAAGG7CACCGTCCTAGgtaa9tg IIII II II I 111111 II 1111111 II 1111111---------------GCATGTGTTT000AGCGCCACCCACCTCACCGTTTTAAgtaaqtq X V F 0 S G T L T V L S Q Exon 3 Intron - 1 .2 Kb --------------------- GAI 14 .1 10 Q P R A N P T V T L 'F P P S 3 S Z* L Q A N K A 7 L V C cettecct9tceacaca9GTCA000CAAGGCCAACCCCACGGTCACTCTGMCCCGCCCTCCTCTGAGGAGCTCCARGCCAACAAGGCCACACTAGTGTG III 1111111111111111111111111111 IIII III1111111111111111 III11111111111111111111111111 11111 IIIII ceteecetgteueaCaqG7C1W000AACGCCTCCCCCTCGOTCACTCTGFTCCCGCCGTCCSCTGAGGACCTCCAAACCAACAAGGCTACACTGGTOTG Q P K A 7 P 3 V 7 L F P P 3 ! L S L 0 A N R A 7 L V C 110 130 150 170 230 250 270 14 .1 GAI 14 .1 330 c V A .P T I 3 V e X* T CcGTGGAGAAGACAGTGGCCeCTACAGAATGTTCATAGGTCCCCAACTCTAAGGCCCA=ACGGGAGCCTGGAGr?GCAGGATCCCAGGCGRGGCgTCT 1111111111111 111111111 IIIIIIIIIIIII111 lilt I I I 11 II I II IIIIIIIIIII1111111111111111111111 CCGTGGAGAAGACGGTGGCCCC7GCAGAATGTTCATAGGTTCCCAGCCCCGACCCCACCCAAAGG . .GCCTGGAGCTGCAGGATCCCAGGGGAGGGGTCT V S K T V A P A E C ! 420 GAI 290 S Y L E L T P 3 Q 41 K S N R S Y S C 0 V T X K G 3 T 5 M N K Y A A S AGCAACAACAAGTACGCGGCCAGCAGCTACC7GAGCCTGACGCCCGAGCAGTGGAAGTCCCACAOAAGCTACAGCTGCCAGGTCACGCATGAAGGGAGCA IIIIIIIIIIIIIIIIIII111111111111111111111111111111111111 VIII 11111111111111111111111 III IIIIIII111 AGCI1CMCAAGTACA000CC1GCGCYACC7CAGCCTGACGCCCOAGCAGTGORGGTCCCGCAGAAGCTACAGCTGCCAGGTCATGCACOINIGGOAGCI 3 K Y A A ! E Y L ! L T P C Q W P ! R R E Y E C 0 V M X E G 5 7 S N .X 310 14 .1 190 F Y P G A V* T V A "W R A D G E P " V K A 'G V K T T R P * 5 K Q L I 5 *D TCTGATCAGTGACTTCTACCCGGGA=TGTGACAGTGGCTTGGAAGOCAGATGCCAGCCCCGTCAAGGCGGGAGTGCAGACGACCAAACCCTCCAAACAG III II 1111111 II 111111 lilt 111 I IIIIIIIIIIIIII I lilt III III 1111111 IIIII IIIIIIIII111 TCTCATGAATGACTTTTATCCGGGAATCTTGACGGTGACCTGGAAGGCAGATGGTACCCCCATCACCCAGGGCGTGGAGATGACCACGCCCTCCAAACAG N D F Y P G I L T V T W K A D G T P I T 0 G V K M T T P ! K Q L M 210 GXI 630 450 CTCTCCCCATCCCAAGTCATCCAGCCCTTCTCCCTGCACTCATGAAACCC .C 'AMTATCCICATT 111 111111 11111111 IIII 11 IIIII 111 111111111 it II CTCTCTCCATCCCAAGCCATTCIGCCCTTCTCCCTGTRCCCAGTAAACCCTC"'VRTGCCCTCTTT 111111 111111 307 Evans and Hollis G1y1 14 .1 GAI 14 .1 Figure 2 . Nucleotide sequence of the 3 exons of the GX1 and 14 .1 genes. The nucleotide sequence of the GX1 gene is compared to the 14.1 gene. All three exons and the predicted amino acid sequence of both genes are shown with the approximate size of introns. Intron sequences are shown in lower case letters. Numbering of exon 1 begins with the start methionine +1. Consensus polyA addition signal is indicated by underline . fragment diagnostic of 14 .1 and was not characterized further. The remaining two clones contained a 3 .0-kb BamHI fragment that hybridized to the 14.1 exon 1 probe and restriction map analysis indicated that the two clones shared common sequence . One of these clones, GA1, was analyzed in more detail and shown to contain a 14-kb EcoRI fragment . A partial restriction map of this clone was determined (Fig . 1). Comparison of the GX1 restriction map to clones containing the seven human IgX C region genes indicated that the GA1 clone was similar to the most 5' gene lgX1 (8, 19, 20) . Sequence of the region homologous to IgXC of clone GA1 was identical to the previously determined sequence for IgX1 C region confirming the nature of the clone (Fig . 2). Sequence of the exon 1 homologous region identified a potential coding exon similar in structure to exon 1 of 14 .1 . This exon had one long open reading frame beginning with a start ATG capable of encoding 68 complete amino acids . This exon has a splice donor site splitting amino acid 69 that is identical to the splice donor site found at the same position of 14.1 and is a good match for a consensus splice donor site (21). This exon shows 81% and 71% homology with 14.1 exon 1 at the nucleotide and amino acid sequence level, respectively. This high degree of similarity to 14.1 exon 1 explains its identication on Southern blot and its subsequent cloning using a 14.1 exon 1 probe . Analysis of the sequence upstream of the cDNA start failed to identify a characteristic TATA or CAT box. Search of the 5' sequence failed to identify the octamer motif ATGCAAAT that has been implicated in tissue-specific expression of Ig genes (22-27) . To determine if the GX1 clone contained a sequence analogous to exon 2 of 14.1, we sequenced the area surrounding the IgX1 J region. This analysis revealed an open reading frame capable o£ encoding 40 amino acids, one amino acid longer than exon 2 of 14.1, with a consensus splice acceptor site at its 5' end and a consensus splice donor site at its 3' end (Fig . Table 1 . Nucleotide and Amino Acid Comparison of 14.1, GA1, and J15 Nucleotide Exon 1 2 3 Amino acid 14 .1 GXI A5 14 .1 - 81 61 GXI 71 - 58 X5 48 50 - 14 .1 - 72 66 GX1 55 - 75 X5 46 52 - 14 .1 - 89 72 GXI 84 - 77 X5 61 62 - Comparison of nucleotide and amino acid sequence similarity of 14 .1, GX1, and Mouse X5 ; Comparison of nucleotide sequence and amino acid sequence is shown in percent identity for 14 .1, GX1, and mouse X5 . This sequence includes the heptamer and nonamer recombination signal sequences found upstream of all J regions and in this exon would be used as coding sequence . This putative exon shares 72% and 55% similarity with the 14 .1 exon 2 sequence at the nucleotide and protein level, respectively. The structure of this putative exon is such that it can join this sequence to the exon 1 and 3 sequences in an open translational reading frame. This newly identified gene, GX1 (for germline IgX1), is capable of encoding a protein of 214 amino acids with a mol 2) . Figure 3. Northern analysis of preB cell RNA with GX1 specific probe. (A) 1 tLg of plasmid DNA containing 14.1, 16 .1, and GXI gene sequences were digested with BamHI and EcoRI (14 .1 and 16 .1) or EcoRl and HindIII (GX1) and Southern blotted. (B) Total RNA from five pre-B cell lines (Reh, RS4 ;11, Nalml, Nalm6, and HPB Null), two B cell lines (BL33 and DHIJ6), and Hela cells were analyzed by Northern analysis . DNA and RNA blots were hybridized with the GXI exon 2-specific probe D. 308 Genomic Structure and Evolution of the Human IgX1 Gene wt of 23,045 . This protein would contain one extra amino acid and have a structure similar to the 14.1 gene product . Interestingly, the structure of this gene suggests that it can be expressed in an unrearranged form to produce a protein derived from exons 1-3 of GA1 or, after V-J rearrangement, can produce a functional IgA L chain protein . Northern Analysis with GA1 Speck Probe. We have previously shown that expression of the 14 .1 gene is limited to pre-B cells. We wished to determine if the GM gene showed a similar pattern of expression . Because the 14 .1 and GM genes share sequence similarity, we prepared a GX1 exon 2 probe from a region where the two genes were most dissimilar (63%). This probe does not crosshybridize with exon 2 sequences of 14 .1 or the related IgMike gene 16 .1 (Fig. 3 A) . We used the GM exon 2 fragment to probe RNA prepared from 5 pre-B cell lines, 2 B cell lines, and HeLa cells . Unlike a similar probe from 14 .1, the GM exon 2 probe did not hybridize to RNA derived from any of the pre-B cells (Fig. 3 B). Only the Iga-producing B cell DHL6 expressed RNA homologous to exon 2 of GX1 . A similar size transcript was identified when RNA from DHL6 was probed with a fragment containing sequence homologous to exon 3 of GM (data not shown) . Expression analysis of the GM gene is being expanded to examine a larger number of cell lines and to look for GM protein . The Genomic Organization of the Murine Immunoglobulin Related Gene, A5, Is Similar To 14.1 and GAL Like 14.1 and GM, mouse A5 is encoded by three exons in which exon 3 contains Ig C region homology, exon 2 contain Ig J region homology as well as 79 nucleotides of 5' sequence, and exon 1 contains sequence unrelated to Ig V regions (28) . Sequence comparisons at both the nucleotide and amino acid level reveal striking homology between 14 .1, GM, and mouse X5 through all three exons (Table 1) . For instance, exons 1, 2, and 3 of mouse A5 are 58, 75, and 77% similar, respectively, to GM in their nucleotide sequence. We and others have identified mouse A5 as the murine homolog of 14 .1 (9, 15) . Based on genomic structure and a high degree of sequence similarity it appears that mouse X5, human 14 .1, and GM are all derived from a common ancestor. Discussion Specific DNA rearrangement thatjuxtaposes V and j regions (and in the case of Ig H chain, diversity regions) is a prerequisite for expression of functional Ig H and L chain proteins . These events, which are readily detected by Southern analysis on clonal B cell populations, have been used with great success to mark B cell development (1-6) . These studies have shown that IgA. rearrangement and expression occurs at the pre-B cell stage and precedes Ig L chain rearrangement . Recently, the pre-B cell Igp, has been found to be expressed as part of a protein complex on the cell surface (10, 11) . This complex appears to include the product of the 14 .1 gene and another pre-B cell-specific gene product, VpreB, which encodes an Ig V gene-related sequence (12, 29, 30) . In this complex, the VpreB and 14 .1 gene products together encode V+C region sequences, the components found in an Ig L chain 309 Evans and Hollis protein . Their presence on the cell surface suggests that this IgA-surrogate L chain complex may mark pre-B cells that have undergone functional IgH gene rearrangement . These cells may then proceed on to rearrange and express Ig L chains. The signals required for Ig L chain gene rearrangement are unknown, but it will be interesting to determine if the formation of an IgA-surrogate L chain complex on the pre-B cell plasma membrane is a necessary step for B cell differentiation . The human 14 .1 gene is encoded by 3 exons of which exons 2 and 3 share a high degree of similarity with Ig J and C regions, respectively. Unlike classical Ig L chains, the 14 .1 gene is expressed from a germline gene. The high degree of conservation of the Ig C region domain in the third exon of the 14 .1 gene presumably affords the necessary contacts for IgH-14 .1 interactions. The NHZ-terminal half of the 14 .1 protein, encoded by exon 1 and the 5' half of exon 2, does not share homology with Ig V region genes (7, 9, 15) . Surprisingly, like 14 .1, the most 5' gene of the human X locus, IgX1, was shown to be organized in a similar threeexon structure in which exon 3 corresponds to the IgX1 C region and exon 2 includes IgX1 J region sequence. While this gene could be an evolutionary remnant, analysis of these three-exons indicate that this germline gene, termed GM, contains several hallmarks of a functional gene. First, all three exons are open reading frames. Second, an initiating ATG is found in frame in the putative exon 1 sequence. Third, consensus splice donor and acceptor sites are found at the intron-exon junctions. Finally, a poly A addition site is found 110 by downstream of exon 3 . Transcription and processing of these three exons would produce a transcript containing an open reading frame capable of encoding a protein with a mol wt of 23,045 . This protein would use germline sequences found upstream of exon 2 as coding sequence and the transcript would include the recombining nonamer and heptamer signal sequences that are normally deleted by a VJ rearrangement event . These results suggest that the IgX1 gene can be expressed in two different manners . First, as we had previously described, as a classic IgX L chain after Ig VJ gene rearrangement (8) . Second, the IgX1 gene may be expressed in an unrearranged form (GX1) utilizing the three exons related to the 14 .1 gene. Because the exon 1 and 5' portion of exon 2 are in the region deleted by VJ rearrangement, this gene must be expressed before IgA VJ joining in B cell development and/or be expressed in other cell types. Based on the high degree of similarity between the pre-B cell-specific gene 14 .1 and the GA1 clone, we examined the expression of GX1 in human pre-B cells . Our Northern analysis indicated that unlike 14 .1, the GA1 clone was not expressed in pre-B cells . Of the cells examined, only an Igrc-producing B cell, DHL 6, showed any GM transcript . This transcript was -800 by in size and therefore slightly smaller than the 14 .1 transcripts seen in pre-B cells. The biological role of this germline transcript is unknown at this time. Transcription of germline Ig genes is well documented in both IgH and K (for review see reference 31) . Studies of these germline RNAs have shown that they can undergo RNA processing (32) . While most of these transcripts are believed to be "sterile" (incapable of producing a functional protein), at least in one case a truncated Ig H chain protein is produced . Several studies have demonstrated that germline transcription of these genes precedes gene rearrangement and may be necessary to open the chromatin structure to the recombination machinery. The expression of GX1 indicates that transcription can occur from the germline human IgA locus . This germline IgA transcript could be a forerunner of V-j gene rearrangement . Alternatively, because we have shown that there are three exons that can be spliced to produce an open reading frame, this transcript may lead to the expression of the GX1 sequences at the level of protein . A more extensive analysis of GX1 expression is being pursued to determine if transcription of the GX1 gene leads to production of GX1 protein . Gene duplication from a single A J and C region segments has given rise to the tandemly repeated J-C region structure of the present day seven-gene human IgA locus. This gene duplication appears to have occurred after mouse and man diverged (7, 8, 19, 20) . The shared three-exon structure of the 14 .1 and GX1 genes indicates that they arose by gene duplication from a common ancestor. Sequence comparisons of the 14 .1, GAl, and the mouse A genes indicate that 14 .1 and GX1 are more similar to each other (exon 3 : >89%) than to mouse A genes (exon 3 : 74% 14 .1 and 75% GX1 to mouse A1) . This suggests that the gene duplication that gave rise to 14.1 and GX1 followed divergence of mouse and man . Similarly, GX1 is more similar to the remaining genes of the seven-gene A locus than to 14 .1 (exon 3 : >95% vs . 89%), suggesting that gene duplication of the seven-gene cluster occurred after GA1 and 14 .1 duplicated . This comparison suggests that GX1 may have been the original human IgA gene whose exon 2 and 3 sequences, which contain J and C regions, were duplicated to give rise to the present day human X locus . The conserved three-exon motif of 14.1 and GX1 is unlikely to have arisen recently, because the murine homologue of 14 .1, mouse X5, is organized in a similar fashion (28). This suggests that the three-exon structure found in 14 .1, GA1, and A5 existed in an ancestral gene before mouse-man divergence . Like man, the mouse has multiple copies of X genes that appear to have arisen by gene duplication . A comparison of 14 .1 to -the four mouse X genes and A5 reveals that 14 .1 is as similar to mouse X 1-4 (exon 3 : 72-74%) as it is to mouse A5 (exon 3: 72%) (34) . This suggests that the duplication that produced mouse X genes 1-5 occurred from a single gene after mouse and man diverged . Based on these comparisons, we feel that the ancestral gene to 14 .1/GAl/A5 and the present day Ig X genes of mouse and man is likely to have been organized in the three-exon structure seen in 14 .1, GA1, and mouse A5. This primordial IgA gene may have been expressed as an IgA 14 .1-like protein from the germline gene or, after VJ recombination, as a canonical IgA L chain . The results described above lead to the question, is the Vpre-B-14 .1 surrogate L chain complex found associated with IgA. in pre-B cells adapted from the V-j recombining IgA genes or does it represent a primordial two-peptide Ig L chain that was later adapted to produce the single-peptide IgA L chains? Examination of this system in more primitive immune systems may help answer this question . We thank J . Stafford-Hollis, I. R. Kirsch, and W. M. Kuehl for critically reading the manuscript and S. Sammons for help with computer analysis . Address correspondence to Gregory F. Hollis, Monsanto Company, AA4C 700 Chesterfield Village Parkway, St. Louis, MO 63195. Received for publication 24 August 1990 and in revised form 19 October 1990. References Alt, F., N. Rosenberg, S. Lewis, E. Thomas, and D . Baltimore. 1981 . Organization and reorganization of immunoglobulin genes in A-MuLV transformed cells : rearrangement of heavy but not light chain genes. Cell. 27 :381 . 2. Korsmeyer, S .J., P.A . Hieter, J .V. Ravetch, D.G. Poplack, T.A . Waldmann, and P. Leder. 1981 . Developmental hierarchy of immunoglobulin gene rearrangement in human leukemic preB cells . Proc Natl. Acad. Sci. USA. 78:7096. 3 . Maki, R., J . Kearney, C. Paige, and S. Tonegawa. 1982 . Immunoglobulin gene rearrangements in immature B cells. Science (Wash. DC). 209:1366. 4. Perry, R ., D. Kelley, C. Coleclough, and J . Kearney. 1981. Or- ganization and expression of immunoglobulin genes in fetal liver hybridomas. Proc. Natl. Acad. Sci. USA. 78 :247 . 5. Reth, M ., P. Ammirati, S . Jackson, and F. Alt . 1985 . Regu310 lated progression of a cultured pre-B cell stage . Nature (Lond.). 317 :353 . 6. Tonegawa, S. 1983 . Somatic generation of antibody diversity. Nature (Lond.). 302:575 . 7. Chang, H ., E. Dmitrovsky, P. Hieter, K. Mitchell, P. Leder, L. Turoczi, I . Kirsch, and G. Hollis. 1986 . Identification of three new Ig Mike genes in man . -j. Exp Med. 163:425 . 8 . Hieter, P., G. Hollis, S . Korsmeyer, T Waldmann, and P. Leder. 1981 . Clustered arrangements of immunoglobulin X constant region genes in man. Nature (Lond.). 294:536 . 9 . Hollis, G.F., R .J . Evans, J .M. Stafford-Hollis, S .J. Korsmeyer, and J.P. McKearn. 1989 . Immunoglobuli n X light chain-related genes 14 .1 and 16 .1 are expressed in pre-B cells and may encode the human immunoglobulin w light chain protein . Proc. Natl. Acad. Sci. USA. 86 :5552. Genomic Structure and Evolution of the Human IgX1 Gene 10 . Kerr, WG ., M.D. Cooper, L. Feng, P.D. Burrows, and L.M . Hendershot .1989. Mu heavychains can associate with a pseudolight chain complex (~L) in human pre-B cell lines. International Immunology. 1:355 . 11 . Pillai, S., and D. Baltimore. 1987 . Formation of disulphidelinked P.2w2 tetramers in pre-B cells by the 18K w-immunoglobulin light chain. Nature (Lond.). 329:172 . 12 . Kudo, A., and F. Melchers . 1987 . A second gene, Vpre-B in the X5 locus of the mouse, which appears to be selectively expressed in pre-B lymphocytes . EMBO (Eur. Mol. Biol. Organ) J. 6:2267. 13 . Sakaguchi, N., C. Berger, and F. Melchers . 1986 . Isolation of a cDNA copy of an RNA species expressed in murine pre-B lymphocytes. EMBO (Eur. Mol. Biol. Organ .) J. 5:2139. 14 . Sakaguchi, N., and F. Melchers . 1986 . X5, a new light chainrelated locus selectively expressed in pre-B lymphocytes . Nature (Loud.). 324:579 . 15 . Schiff, C., M. Bensmana, P Gughelmi, M. Milili, M.P. Lefranc, and M . Fougereau. 1990 . The immunoglobulin X-like gene cluster (14.1, 16 .1, and FX1) contains genes selectively expressed in pre-B cells and is the human counterpart of the mouse X5 gene . International Immunology. 2:201 . 16 . Southern, E. 1975 . Detection of specific sequences amongDNA fragments separated by gel electrophoresis. J. Mol. Biol. 98 :503 . 17 . Gazdar, A., H. Oie, 1. Kirsch, and G. Hollis. 1986 . Establishment and characterization of a human plasma cell myeloma culture having a rearranged cellular myc proto-oncogene. Blood. 67 :1542 . 18 . Devereux, J., P. Haeverli, and O. Smithies. 1984 . A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12 :387. 19 . Dariavach, P., G. Lefranc, and M. Lefranc. 1987 . Human immunoglobulin CX6 gene encodes the Kern+Oz - X chain and CX4 and CX5 are pseudogenes. Proc Natl. Acad. Sci. USA . 84 :9074. 20 . Vasiceck, TJ ., and P. Leder. 1990 . Structure and expression of the human immunoglobulin X genes. J. Exa Med. 172:609 . 21 . Shapiro, M.B., and P. Senapathy. 1987 . RNA splice junctions of different classes of eukaryotes : sequence statistics and functional implications in gene expression . Nucleic Acids Res. 15 : 7155 . 31 1 Evans and Hollis 22 . Calame, K.L . 1985 . Mechanism that regulates immunoglobulin gene expression . Annu. Rev. Immunol. 3:159. 23 . Falkner, F.G ., and H.G. Zachau . 1984 . Correct transcription of an immunoglobulin K gene requires an upstream fragment containing conserved sequence elements. Nature (Lond.). 310:71. 24 . Landolfi, N.F., J.D. Capra, and P.W. Tucker. 1986 . Isolatio n of cell-type specific nuclear proteins with immunoglobulin V promoter region sequence. Nature (Loud.). 323:548 . 25 . Mitchell, P.J ., and R. Tjian. 1989 . Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science (Wash. DC). 245:371 . 26 . Parslow, TG., D.L . Blair, W.J . Murphy, and D.K. Granner. 1984. Structure of the 5' ends of immunoglobulin genes: a novel conserved sequence. Proc Nad. Acad. Sci. USA . 74 :560. 27 . Wirth, T, L. Staudt, and D. Baltimore. 1987 . An octamer oligonucleotide upstream of a TATA motif is sufficient for lymphoid-specific promoter activity. Nature (Lond.). 329:174 . 28 . Kudo, A., N. Sakaguchi, and F. Melchers . 1987 . Organization of the murine Ig-related X5 gene transcribed selectively in pre-B lymphocytes. EMBO (Eur. Mol. Biol. Organ) J. 6:103 . 29 . Bauer, S., K. Huebner, M. Budarf, J. Finan, J. Erikson, B. Emanuel, P. Nowell, C. Croce, and F. Melchers . 1988 . The human Vpre-B gene is located on chromosome 22 near a cluster of VX1 gene segments. Immunogenetics. 28 :328 . 30 . Bauer, S., A. Kudo, and F. Melchers . 1988 . Structure and pre-B lymphocyte restricted expression of the Vpre-B gene in humans and conservation of its structure in other mammalian species. EMBO (Eur. Mol. Biol. Organ.) J. 7:111 . 31 . Blackwell, TK., and F.W. Alt. 1989 . Mechanis m and developmental program of immunoglobulin gene rearrangement in mammals. Annu . Rev. Genet. 23 :605 . 32 . Martin, DJ ., and B.G. Van Ness. 1990 . Initiation and processing of two K immunoglobulin germ line transcripts in mouse B cells. Mol. Cell. Biol. 10 :1950. 33 . Reth, M.G., and F.W. Alt. 1984 . Nove l immunoglobulin heavy chains are produced by DJH gene segment rearrangements in lymphoid cells. Nature (Loud.). 317:418 . 34 . Selsing, E., J. Miller, R. Wilson, and U. Storb. 1982 . Evolution of mouse immunoglobulin X genes. Proc Nad. Acad. Sci. USA . 79 :4681.