Microbiology (1997), 143, 2223-2236. Printed in Great Britain

A Pneumocystis carinii multi-gene family with homology to subtilisin-like serine proteases

Elena B. Lugli, Andrew G. Allen† and Ann E. Wakefield

Author for correspondence: Ann E. Wakefield. Tel: +44 1865 222344. Fax: +44 1865 222626. e-mail: [email protected]

Molecular Infectious Diseases Group, Department of Paediatrics, Institute of Molecular Medicine, John Radcliffe Hospital, Oxford OX3 9DU, UK

Copies of a multi-gene family, named PRT1 (protease 1), encoding a subtilisin-like serine protease were cloned from the opportunistic fungal pathogen Pneumocystis carinii. Comparison of the nucleotide sequence of a genomic clone and a cDNA clone of PRT1 from P. carinii f. sp. carinii revealed the presence of seven short introns. Several different domains were predicted from the deduced amino acid sequence: an N-terminal hydrophobic signal sequence, a pro-domain, a subtilisin-like catalytic domain, a P-domain (essential for proteolytic activity), a proline-rich domain, a serine/threonine-rich domain and a C-terminal hydrophobic domain. The catalytic domain showed high homology to other eukaryotic subtilisin-like serine proteases and possessed the three essential residues of the catalytic active site. Karyotypic analysis showed that PRT1 was a multi-gene family, copies of which were present on all but one of the P. carinii f. sp. carinii chromosomes. The different copies of the PRT1 genes showed nucleotide sequence heterogeneity, the highest level of divergence being in the proline-rich domain, which varied in both length and composition. Some copies of PRT1 were contiguous with genes encoding the P. carinii major surface glycoprotein.
Keywords: Pneumocystis carinii, subtilisin-like serine protease, PRT1, multi-gene family, major surface glycoprotein

INTRODUCTION

The fungal pathogen Pneumocystis carinii causes potentially fatal pneumonia in the immunocompromised, including those receiving immunosuppressive therapy for organ transplantation, those with advanced malignancy and in particular those with HIV infection. The lack of an effective in vitro culture system still remains a major obstacle in the understanding of the biology of P. carinii and its interactions with its host. Molecular techniques have been employed in the study of the organism, and a number of genes have now been cloned. Among these is the multi-gene family encoding the major surface glycoprotein (MSG or gpA) of the parasite. In this paper we describe the cloning and characterization of a second P. carinii multi-gene family, PRT1 (protease 1), some copies of which are contiguous with MSG.

† Present address: Department of Clinical Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK.

The GenBank accession numbers for the PRT1(73j) and PRT1(Paga) sequences reported in this paper are AF001304 and AF001305 respectively.

0002-1687 © 1997 SGM

The P. carinii major surface glycoprotein is highly mannosylated and is antigenically distinct in organisms isolated from different mammalian host species (Lundgren et al., 1991; Gigliotti, 1992). The MSG multi-gene family has been identified in the genome of P. carinii f. sp. carinii (rat-derived P. carinii) (Kovacs et al., 1993; Wada et al., 1993; Sunkin et al., 1994), P. carinii f. sp. mustelae (ferret-derived P. carinii) (Haidaris et al., 1992; Wright et al., 1995), P. carinii f. sp. hominis (human-derived P. carinii) (Stringer et al., 1993; Garbe & Stringer, 1994) and P. carinii f. sp. muris (mouse-derived P. carinii) (Wright et al., 1994). The different copies of the P. carinii f. sp. carinii MSG genes are of similar size but heterogeneous in sequence.
They have been found on multiple chromosomes and are often organized in tandem arrays. The majority of MSG genes are located in the subtelomeric regions of the P. carinii f. sp. carinii chromosomes (Underwood et al., 1996; Sunkin & Stringer, 1996). The expression of MSG genes has been shown to be mediated by the upstream conserved sequence (UCS), which is found on a single chromosome situated in the subtelomeric region (Wada et al., 1995). Different copies of the MSG multi-gene family have been shown to be linked to the UCS. It has been postulated that this differential expression of MSG may represent a strategy to evade the immune response of the host by antigenic variation (Sunkin & Stringer, 1996).

The genes comprising the novel P. carinii PRT1 multi-gene family described in this paper show high levels of homology with subtilisin-like serine proteases. These are a group of endoproteases which have been characterized from a wide variety of organisms including bacteria, fungi and higher eukaryotes. Some have been found to function in the specific endoproteolytic processing of pro-proteins at cleavage sites of paired basic amino acid residues, to generate regulatory proteins in a mature and biologically active form. The pro-hormone processing enzyme kexin, encoded by the KEX2 gene of Saccharomyces cerevisiae, has been characterized and found to cleave the precursors of the α-mating-factor and the killer toxin (Fuller et al., 1989). Genes encoding a similar processing endoprotease have been identified in a number of other fungi: the KEX1 gene from the yeast Kluyveromyces lactis (Tanguy-Rougeau et al., 1988), the gene encoding the KEX2-related protease (krp) from Schizosaccharomyces pombe (Davey et al., 1994) and the XPR6 gene from Yarrowia lipolytica (Enderlin & Ogrydziak, 1994).
Mammalian homologues have also been identified, including the human fur gene (fes upstream region) in the region upstream of the fes proto-oncogene, encoding the enzyme furin (van den Ouweland et al., 1990). The genes Dfur1 and Dfur2 from the insect Drosophila melanogaster encoding furin-like proteins (Roebroek et al., 1992) and the bli-4 gene from the nematode Caenorhabditis elegans have also been studied. Many other members of the subtilisin-like serine protease family have been identified and the specific endoproteolytic activity of some of them has been elucidated. However, for many others, the precise biological function has not yet been determined.

In this paper we report the identification and characterization of the P. carinii f. sp. carinii PRT1 multi-gene family. We demonstrate high levels of homology of the PRT1 sequences to subtilisin-like serine proteases. We also show that different copies of the PRT1 genes display DNA sequence heterogeneity and some copies are contiguous with MSG genes.

METHODS

P. carinii DNA extraction. P. carinii infection was induced in Sprague-Dawley rats by steroid immunosuppression. The organisms were isolated and enriched from infected rat lung tissue by the method described by Peters et al. (1992). Total DNA was extracted from the enriched parasite preparation by digestion with proteinase K (1 mg ml⁻¹) in the presence of 0.5% SDS and 10 mM EDTA (pH 8.0) at 50 °C for 16 h, followed by phenol:chloroform extraction and ethanol precipitation. Samples of DNA for use in PFGE experiments were prepared in SeaPlaque GTG agarose as described by Banerji et al. (1993).

Isolation of copies of the PRT1 gene from P. carinii f. sp. carinii genomic and cDNA libraries. A copy of the PRT1 gene was isolated from an unamplified genomic library from P. carinii f. sp. carinii constructed in λEMBL3 (Banerji et al., 1993). The library was screened with a cDNA clone containing a region of a P. carinii f. sp.
carinii MSG gene (EMBL accession number 20870, donated by C. J. Delves and F. Volpe), as part of a study examining subtelomeric sequences in P. carinii. A relatively high number of recombinant plaques gave positive hybridization signals compared to the number obtained when the library was screened with a probe derived from the single-copy arom locus (Banerji et al., 1993). Five recombinant phages were isolated from a tertiary screen; the recombinant DNA was subcloned into the plasmid vector pBluescript II prior to sequencing.

To isolate a full cDNA clone, a P. carinii f. sp. carinii cDNA library constructed in λZAPII (donated by C. J. Delves and F. Volpe, see Dyer et al., 1992) was screened with PCR products derived from amplification of the 5' end of the gene with oligonucleotide primer pair pcprot9 and prp4r (9/4r product), and of the 3' end of the gene with pcprot13/RI and pcprot12/RI (13/12 product) (Fig. 1). A primary screening was carried out using both probes, and secondary and tertiary screens were carried out using only the 9/4r product. The number of positive clones when screening the cDNA library with the two probes appeared to be relatively high when compared to the number obtained using a single-copy gene. Four recombinant phages isolated from the cDNA library were partially characterized. The recombinant DNA was recovered from the λ phage by in vivo excision as pBluescript plasmid DNA. The size of the recombinant DNA ranged from 2.7 kb to 2.9 kb, and sequence analysis revealed that all four clones contained a poly(A) tail. One recombinant, 73j, was selected for further analysis and the recombinant DNA was sequenced in full from both strands.

DNA amplification. Oligonucleotide primers were designed to hybridize to various regions of the P. carinii PRT1 nucleotide sequences (Fig. 1, Table 1).
Some oligonucleotides had an EcoRI restriction endonuclease site incorporated at the 5' end to facilitate cloning of the amplification products into EcoRI-digested plasmid vectors pBluescript SK(-) (Stratagene) or pUC18 (Pharmacia). The final concentration of the amplification reaction mix was 50 mM KCl, 10 mM Tris (pH 8.0), 0.1% Triton X-100, 3 mM MgCl2, 400 µM (each) deoxynucleoside triphosphate, 1 µM oligonucleotide primer and 0.025 U Taq polymerase ml⁻¹ (Promega). With primer pair pcprot9 and pcprot10, 40 cycles of amplification were performed at 94 °C for 1.5 min, 53 °C for 1.5 min and 72 °C for 2.0 min. With primer pair pcprot9 and pcprot4r the same conditions were used, except that the annealing temperature was 50 °C. With all other primer pairs, ten cycles of amplification were carried out at 94 °C for 1.5 min, 55 °C for 1.5 min and 72 °C for 2.0 min, followed by 30 cycles of 94 °C for 1.5 min, 63 °C for 1.5 min and 72 °C for 2.0 min. Negative controls were included in each experiment.

The entire putative gene was amplified as three overlapping fragments, Prp5e (1626 bp), M14 (1279 bp) and Prp2g (251 bp) (Fig. 1). Oligonucleotide primer pairs pcprot9 with pcprot10, followed by pcprot6/RI with pcprot4/RI, were used in a nested PCR to amplify the 5' fragment, designated Prp5e, of length 1626 bp. The second portion, called M14, spanning 1279 bp of the central region of PRT1, was amplified using a nested PCR with primer pairs pcprot2/RI with pcprot14/RI, followed by pcprot7/RI with pcprot12/RI. The third fragment, Prp2g, encompassing the 3' end of the sequence (251 bp), was amplified using oligonucleotide primers pcprot13/RI and pcprot14/RI (Fig. 1, Table 1). Five different overlapping regions of the PRT1 gene were also amplified and cloned, and the DNA sequences determined.
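The EcoRI tagging of primers described above can be checked mechanically. The following is a minimal sketch in Python, not part of the published method. It uses three primer sequences as listed in Table 1, and the helper name ecori_offset is hypothetical. It reports where the EcoRI recognition sequence GAATTC first occurs in each primer.

```python
# Sketch only: locate the EcoRI recognition site (GAATTC) in primer sequences.
# Primer sequences are copied from Table 1; the helper name is illustrative.

ECORI_SITE = "GAATTC"

PRIMERS = {
    "pcprot1/RI": "GGGAATTCTTATTCTTGTAGCTGGGGAC",
    "36ex/RI": "GCGAATTCTAACCTTCAGCCAGATTCA",
    "pcprot9": "AGAATTTCTAATTAAAAAGTTAAG",
}


def ecori_offset(primer: str) -> int:
    """Return the 0-based offset of the first EcoRI site, or -1 if absent."""
    return primer.upper().find(ECORI_SITE)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        offset = ecori_offset(seq)
        status = f"EcoRI site at offset {offset}" if offset >= 0 else "no EcoRI site"
        print(f"{name}: {status}")
```

In this sketch the two primers carrying the /RI suffix have the site two bases from the 5' end, whereas pcprot9, which has no suffix, lacks it.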
Fig. 1. Schematic representation of a P. carinii PRT1 gene. (a) PRT1 domains: HR, hydrophobic region; PRO-, pro-domain; CATALYTIC, catalytic domain; P-, P-domain; PROLINE-RICH, proline-rich region (the box indicates length and sequence variation in different copies of PRT1); STR, serine/threonine-rich region; 0, catalytic active site residues D,,,, H,,, and S423; *, potential glycosylation sites; I, conserved cysteine residues. (b) A genomic copy of PRT1(Paga) showing the positions of the seven introns (I-VII), a cDNA copy of PRT1(73j) and the products of amplification of different regions of PRT1 genes. (c) Position of oligonucleotide primers used in the amplification of different regions of PRT1 genes.

The first region, amplified with primer pair pcprot1/RI and pcprot3/RI, spanned approximately half of the subtilisin-like catalytic domain; the second region, amplified with primer pair pcprot2/RI and pcprot4/RI, spanned the end of the subtilisin-like catalytic domain and the start of the P-domain; the third region, amplified with primer pair pcprot7/RI and pcprot8/RI, spanned the P-domain; the fourth region, amplified with primer pair 36ex/RI and Pt3/RI, spanned the proline-rich domain; and the fifth region, amplified with primer pair pcprot13/RI and pcprot14/RI, spanned the C-terminal hydrophobic domain (Fig.
1, Table 1). The sequences Prp1a, Prp3a, Prp7a, Prp2c, Prp3c, Prp4c, Prptaf2, Prpf4, Prp5f, Prpg3 and Prp5g were amplified from the P. carinii cDNA library, and sequences Pcr-19, Pcr-14, Pcr-5, Pcr-3, Pcr-1, Lam-1 and Prpg4 from the P. carinii genomic DNA (Fig. 2).

DNA sequence analysis. DNA sequence analysis was performed using the dideoxy chain-termination method (Sanger et al., 1977). Sequence data were obtained in full from both strands for all sequences. Analysis of the sequence data was carried out using the University of Wisconsin Genetics Computing Group (UWGCG) Sequence Analysis Software Package, version 8 (Genetics Computer Group, Madison, WI, USA).

PFGE. P. carinii f. sp. carinii was isolated from an infected rat lung and the chromosomes were separated by PFGE using a Contour Clamped Homogeneous Electric Field (CHEF) DRII apparatus (Bio-Rad) operated at 4 °C. Electrophoretic separation was achieved using a 0.9% Seakem agarose gel with an initial switching time of 10 s, increasing to a final switching time of 60 s, at 180 V for 48 h. A karyotype corresponding to P. carinii f. sp. carinii form 1 was observed (Cushion et al., 1993).

Southern hybridization. Southern blotting and hybridization were carried out using standard techniques (Sambrook et al., 1989). PFGE blots were hybridized with three probes derived from different domains of the PRT1 gene. The product 9/4r was derived from amplification of the 5' end of the PRT1 gene with primer pair pcprot9 and pcprot4r/RI, product 2/4 from amplification of the central catalytic region with primer pair pcprot2/RI and pcprot4/RI, and product 13/12 from amplification of the 3' end of the gene with primer pair pcprot13/RI and pcprot12/RI (Fig. 1). The amplification products were gel-purified (Geneclean II, Bio101) and labelled with [α-32P]dCTP by random priming (Megaprime, Amersham). Hybridization was carried out at 45 °C and stringency washing at 60 °C in 0.2 × SSC, 0.1% SDS. Southern blots of genomic P. carinii DNA digested with restriction endonuclease PstI or BamHI were probed with oligonucleotide probes pcprot5/RI, pcprot3/RI, pctel2 and msgterm (Table 1) labelled with [γ-32P]dATP using polynucleotide kinase. Hybridization was carried out at 46 °C and stringency washing at 52 °C in 5 × SSC, 0.5% SDS.

Table 1. Oligonucleotide primer sequences

Primer          Sequence
prp4r           5'-GCTTGTCACTATTAAACC-3'
Pt3/RI          5'-GGGAATTCTGAAGCTTTTCGAGTGGTTG-3'
36ex/RI         5'-GCGAATTCTAACCTTCAGCCAGATTCA-3'
pctel2          5'-AAGTCACGTGCTCTCTTGGTCA-3'
msgterm1        5'-AATGTTGTTGGGAGTGATTG-3'
pcprot1/RI      5'-GGGAATTCTTATTCTTGTAGCTGGGGAC-3'
pcprot2/RI      5'-GGGAATTCTTCTACACCTCTTGCTGCG-3'
pcprot3/RI      5'-GGGAATTCCACGCCATGTAAGATTAGGA-3'
pcprot4/RI      5'-GGGAATTCTAATGCTTAGGATATCCGCG-3'
pcprot5/RI      5'-GGGAATTCATGTGAAATGGTGCCAGTAG-3'
pcprot6/RI      5'-GGGAATTCGTTTTTTTTTAACATTTCATCATG-3'
pcprot7/RI      5'-GGGAATTCTCGGTTATGGAAAACTAGATG-3'
pcprot8/RI      5'-GGGAATTCAAGGTTAGCATCCAGATCTG-3'
pcprot9         5'-AGAATTTCTAATTAAAAAGTTAAG-3'
pcprot10        5'-AAACACCAACATACTCTAAAC-3'
pcprot12/RI     5'-GGGAATTCTTATAGTACATGAAGCTTTTCG-3'
pcprot13/RI     5'-GGGAATTCTTCATCTACATCTACGACTTC-3'
pcprot14/RI     5'-GGGAATTCTATAGGTTAAAAAGAGTAACCC-3'

RESULTS

Analysis of DNA and derived amino acid sequence of copies of the PRT1 gene

We have identified a family of genes in the P. carinii f. sp. carinii genome which shows homology to the subtilisin-like serine proteases. We have named this gene family PRT1 (protease 1). A copy of the PRT1 gene (Paga) was isolated from a P. carinii genomic library, the ORF (3069 bp) containing seven short putative intervening sequences. A copy of the PRT1 gene (73j) of length 2370 bp was also isolated from a cDNA library. Portions of the gene were amplified by PCR from the cDNA library as three overlapping fragments at the 5' end (Prp5e), the central region (M14) and the 3' end (Prp2g). Five other regions of the gene were also amplified, from either the P. carinii cDNA or genomic libraries, to determine the extent of diversity among different copies of the multi-gene family at different domains within the gene sequence.

Analysis of the DNA sequence of the copy of the PRT1 gene from the genomic library, PRT1(Paga), and of the copy from the cDNA library, PRT1(73j), confirmed the presence of seven short introns in the genomic DNA sequence. The introns ranged in length from 38 bp to 45 bp, with a base composition ranging from 71 to 84 mol% A+T. In all seven introns, the dinucleotide GT was present at the 5' splice donor site and AG at the 3' splice acceptor site. The sequence YTRAT, which has been identified as the putative lariat-forming motif in other P. carinii f. sp. carinii introns (Zhang & Stringer, 1993), was present in the first, second, fourth, fifth and seventh introns. The eukaryotic lariat consensus sequence, YYRAY, was identified in the third and sixth introns.

The sequence of the cDNA clone, PRT1(73j), contained an ORF of 2370 bp, which on translation resulted in a peptide of 790 amino acids (Fig. 2). The deduced amino acid sequence was compared to sequences in the GenBank and EMBL databases and showed homology to fungal and other eukaryotic subtilisin-like serine proteases. The A+T content of the ORF was 64 mol%, with a high A+T content at the third base position of the codons. The base composition of the 5' upstream sequence was 74 mol% A+T and of the 3' downstream sequence 75 mol% A+T. A consensus polyadenylation signal, AATAAA, was observed 68 bp downstream of the stop codon.

The deduced amino acid sequences of the genomic clone PRT1(Paga), the cDNA clone PRT1(73j), the three fragments obtained by PCR amplification of the cDNA library and the other recombinant clones generated by DNA amplification were compared (Fig. 2). Several regions of homology were found and also a number of regions in which significant divergence was observed. These data suggested that the sequences were derived from different copies of the PRT1 multi-gene family.

Comparison of P. carinii PRT1 with other subtilisin-like serine proteases

The deduced amino acid sequence of the cDNA clone PRT1(73j) was aligned with nine other subtilisin-like serine proteases including fungal, mammalian, insect and nematode sequences (Fig. 3). The PRT1 sequences showed homology with all the other sequences, with a high level of homology in the subtilisin-like catalytic domain. The three essential residues of the catalytic active site, aspartic acid (Asp,,,), histidine (His,,,) and serine (Ser,,,) [residues are numbered with reference to PRT1(73j)], were conserved in all the PRT1 sequences. The highest levels of homology between all the sequences were around these residues.

The structural organization of the fungal sequences showed domains characteristic of this class of endoproteases: a hydrophobic signal sequence, a pro-domain that may be cleaved by autoproteolysis, a subtilisin-like catalytic domain, a P-domain (known as such because it is essential for proteolytic activity), a serine/threonine-rich domain which may potentially be modified by O-linked glycosylation, a C-terminal hydrophobic transmembrane domain and a C-terminal tail with acidic residues (Van de Ven & Roebroek, 1993). The P.
carinii PRT1 sequences showed a similar putative structural organization but, unlike the nine other subtilisin-like serine proteases, they also had a proline-rich domain preceding the serine/threonine-rich domain and the C-terminal hydrophobic domain (Fig. 1). The P. carinii PRT1(73j) sequence had a hydrophobic signal sequence at the N-terminus, followed by a putative pro-domain, a subtilisin-like catalytic domain from Ser,,, to His,,,, a P-domain from residue Tyr,,, to Ser,,,, a proline-rich domain from residue Pro,,, to a serine/threonine-rich domain from residues Thr,,, to Ser,,,, and a C-terminal hydrophobic domain from residues His,,, to Phe,,,.

Analysis of subtilisin-like catalytic domain

The three-dimensional structures of four subtilisin-like serine proteases have been determined: subtilisin BPN'/Novo from Bacillus amyloliquefaciens (Hirono et al., 1984; Bott et al., 1988), subtilisin Carlsberg from Bacillus licheniformis (McPhalen & James, 1988), thermitase from Thermoactinomyces vulgaris (Gros et al., 1989; Teplyakov et al., 1990) and proteinase K from Tritirachium album (Betzel et al., 1988). The amino acid sequences of these four proteases have been compared to those of 31 other subtilisin-like serine proteases isolated from bacteria, fungi and higher eukaryotes, and the essential core structure of the catalytic domain of this group of molecules has been identified (Siezen et al., 1991). We have compared the deduced amino acid sequence of the P. carinii PRT1(73j) gene with the multiple sequence alignment of the other subtilisin-like serine proteases and have identified, by homology, the three essential residues of the catalytic active site, aspartic acid, histidine and serine, in the PRT1 sequence (Asp,,,, His,,, and Ser,,,). On the basis of the sequence alignment, the P.
carinii PRT1 sequence could be assigned to the class I subtilases, within the subgroup I-E which contains the pro-hormone processing proteases from yeasts and higher eukaryotes (Siezen et al., 1991). Eight α-helical domains and nine β-sheet regions have been defined as the structurally conserved regions within the essential core structure. The variable regions which connect the core segments have been found to differ both in length and in amino acid sequence (Siezen et al., 1991). High levels of homology were observed between the PRT1 sequences and the other sequences in the regions of the two conserved internal helices, helix C (residues 252-262) and helix F (residues 422-438) [Fig. 3; residues are numbered with reference to the PRT1(73j) sequence]. Eleven amino acid residues have previously been found to be totally conserved in all the characterized subtilisin-like serine proteases, and most but not all are conserved in the PRT1 sequences. These amino acid residues are at the active site, Asp,,,, His,,, and Ser,,, [found in all the PRT1 sequences except PRT1(Prp7a)], and in the internal helices at residues Gly253, Gly,,, and Pro,,,. The residues Ser,,,, Gly,,,, Gly,,,, Gly,,, and Thr,,,, involved in substrate binding, were conserved in all the PRT1 sequences except Thr,,,, which was found only in two sequences generated by PCR, PRT1(Prp1a) and PRT1(Prp7a). In addition to the totally conserved residues, seven other amino acid residues have been identified which are highly conserved. Of these, six were conserved in the P. carinii PRT1 sequences and included the oxyanion hole residue (Asn,,,), residues Gly216 and Thr,,, near the active site, and also residues Gly205, Gly,,, and Gly,,,. Seven conserved cysteine residues were found in all the P. carinii PRT1 sequences: Cys256, Cys268, Cys,,,, Cys,,,, Cys389, Cys,,, and Cys,,,.
Nineteen variable regions, generally located in loops on the surface of the molecule, have been identified in the subtilase family, of which 14 were found in the P. carinii PRT1 sequences. Three positions have been identified at which charge is totally conserved in all the subtilisin-like proteases examined, and these were also conserved in the P. carinii PRT1 sequences: the positive charge on Arg,,, and the negative charges on residue Asp,,, (active site) and Asp,,,.

Fig. 2. Alignment of the P. carinii PRT1 deduced amino acid sequences from the genomic clone Paga, the cDNA clone 73j and the three overlapping PCR products amplified from a cDNA library corresponding to the 5' region (Prp5e), the central region (M14) and the 3' region (Prp2g). The deduced amino acid sequences of PCR products amplified from five different regions of the PRT1 gene family were also aligned: the catalytic domain (Prp1a, Prp3a, Prp7a); the boundary of the catalytic domain and the P-domain (Prp2c, Prp3c, Prp4c); the P-domain (Prptaf2, Prpf4, Prp5f); the proline-rich region (Pcr-19, Pcr-14, Pcr-5, Pcr-3, Pcr-1, Lam-1); and the C-terminal region (Prpg4, Prpg3, Prp5g). Gaps were introduced to maximize homology; identical amino acids are boxed.

It has been proposed that the high specificity of the class I-E subtilisin-like serine proteases for paired basic residues Lys-Arg or Arg-Arg may be facilitated by a high density of negative charge at the substrate-binding face, provided by nine highly conserved Asp residues and one Glu residue (Siezen et al., 1991). Two of the Asp residues, Asp,,, and Asp,,,, were found in all the P. carinii PRT1 sequences and also the Glu,,, residue. In addition, four other Asp residues were found in some but not all of the copies of PRT1.

Analysis of the domains flanking the subtilisin-like catalytic domain

The putative domains of the PRT1(73j) polypeptide are summarized in Fig. 1. A hydrophobicity plot of the PRT1(73j) sequence revealed a hydrophobic region at the N-terminus, suggesting that this may be a signal sequence. Residues 1-23 of the N-terminus of the sequence showed a high level of homology to the N-terminus of the P. carinii f. sp. carinii multifunctional folic acid synthesis fas gene, which encodes dihydroneopterin aldolase, hydroxymethyldihydropterin pyrophosphokinase and dihydropteroate synthase (Volpe et al., 1992, 1993).
This region was followed by the presumptive pro-domain, which may be cleaved by autocatalysis. Potential autocatalytic sites of paired basic residues were identified in the PRT1(Paga) and PRT1(Prp5e) sequences at Lys,,,-Arg,,, and Arg13,-Arg13,, but were absent in the PRT1(73j) sequence. Five other semi-conserved autocatalytic sites were found in some, but not all, copies of the P. carinii PRT1 sequences, two in the catalytic domain (Lys,,,-Arg,,,, Arg4,,-Arg,,,) and three in the P-domain (Arg521-Arg522, Arg555 or Lys555-Arg556, Arg576-Arg577). One potential autocatalytic site at the start of the C-terminal hydrophobic region (Lys,,,-Arg,,,) was found in all the sequences. The PRT1(73j) sequence contained two of the potential autocatalytic sites, Arg5,,-Arg5,, and Lys769-Arg770.

The PRT1 sequences showed homology with the other subtilisin-like serine proteases in the region of the P-domain, the highest homology being with the derived amino acid sequence of the Schiz. pombe krp gene. Four potential sites for N-linked glycosylation were observed in all the PRT1 sequences, three in the subtilisin-like catalytic domain (Asn,,,, Asn,,,, Asn,,,) and one in the P-domain (Asn,,,). A serine/threonine-rich region was also identified in the PRT1(73j) sequence from residue Thr,,, to Ser765, and the hydrophobicity plot of the PRT1(73j) sequence revealed a hydrophobic region at the C-terminal end, residues His,,,-Phe,,,, suggesting a membrane-associated domain. Unlike most other serine protease sequences, however, all the copies of the PRT1 polypeptide contained a proline-rich region downstream of the P-domain.
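The two kinds of sequence feature discussed in this section, paired basic residues (Lys-Arg or Arg-Arg) and potential N-linked glycosylation sites, can both be located by simple pattern scans. The following Python sketch is illustrative only and is not the authors' procedure. The function names and the toy peptide are hypothetical, and the glycosylation rule used (Asn-X-Ser/Thr, where X is any residue except Pro) is the standard consensus sequon, assumed here because the text does not state which criterion was applied.

```python
import re

# Sketch only: scan a protein sequence for paired basic residues and for
# N-glycosylation sequons. Lookahead groups allow overlapping matches.

DIBASIC = re.compile(r"(?=([KR]R))")     # Lys-Arg or Arg-Arg
SEQUON = re.compile(r"(?=(N[^P][ST]))")  # Asn-X-Ser/Thr, X not Pro (assumed rule)


def dibasic_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, dipeptide) for each Lys-Arg or Arg-Arg pair."""
    return [(m.start() + 1, m.group(1)) for m in DIBASIC.finditer(protein)]


def sequon_positions(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


if __name__ == "__main__":
    toy = "MKRANGSRRNPTW"  # hypothetical peptide, not a PRT1 sequence
    print(dibasic_sites(toy))
    print(sequon_positions(toy))
```

A scan of this kind flags only candidate sites; whether a given pair of basic residues is actually cleaved, or a given Asn glycosylated, cannot be decided from the sequence alone.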
Genetic organization of the PRT1 multi-gene family

Analysis of the alignments of the DNA and the deduced amino acid sequences of copies of the PRT1 gene from genomic DNA, the cDNA sequence and the three fragments obtained by PCR of the cDNA library revealed domains in the PRT1 gene which were highly conserved and also regions where significant divergence was observed, again suggesting that PRT1 comprises a multi-gene family (Fig. 2). The subtilisin-like catalytic domain and the P-domain appeared to be conserved, whereas high levels of heterogeneity were observed in the proline-rich domain and the C-terminal domain. The variation in this region was both in length and in sequence. A number of repeated DNA sequence motifs were found in the proline-rich region. Nucleotide sequences encoding polyproline were found in all the sequences, as were sequences encoding the dipeptides Pro-Glu and Pro-Gln and the tetrapeptides Pro-Glu-Pro-Gln and Pro-Glu-Thr-Gln. The order and number of tandem repeats varied in each sequence. The overall length of this region varied from 67 amino acid residues in the shortest sequence, PRT1(73j), to 233 residues in the longest sequence, PRT1(M14).

To further substantiate the presence within the P. carinii genome of multiple copies of the PRT1 gene, P. carinii f. sp. carinii chromosomes were separated by PFGE and a karyotype corresponding to P. carinii f. sp. carinii form 1 was observed (Cushion et al., 1993). Under the conditions used, the rat chromosomes were too large to be resolved and remained at the band of limiting mobility. The separated chromosomes were analysed by hybridization with three probes derived from different domains of PRT1. All three probes showed similar patterns of hybridization, annealing at high stringency to all the chromosome bands except for one, the third smallest in size, approximately 350 kb (Fig. 4). This provided further evidence that the P. carinii f. sp.
carinii genome contained many copies of the PRT1 gene, which were present on most of the P. carinii f. sp. carinii chromosomes.

The sequences of the PRT1 gene family showed high levels of homology with ORF3, an ORF which was reported to encode a P. carinii protein of unknown function and was demonstrated to be contiguous with a copy of the gene encoding the major surface glycoprotein MSG100 (Wada & Nakamura, 1994). This gene arrangement was reported in 15 other λ clones, in which a gene showing high homology to ORF3 was located downstream of a copy of MSG (Wada & Nakamura, 1994). Most copies of the MSG genes have been demonstrated to be located in the P. carinii f. sp. carinii subtelomeric regions (Underwood et al., 1996; Sunkin & Stringer, 1996). The copy of the PRT1 gene encoded by the PRT1(Paga) sequence was cloned from a λEMBL3 genomic library as a single 14 kb fragment and was approximately 1150 bp downstream of a copy of MSG. Four other λ clones isolated from the same library contained a copy of PRT1 contiguous with a copy of MSG.

P. carinii f. sp. carinii genomic DNA was digested with either restriction endonuclease PstI or BamHI and probed sequentially with four oligonucleotide probes: one derived from the 5' end of the PRT1 gene (pcprotS/RI), one from the catalytic domain of the gene (pcprot3/RI), an MSG probe (msgterm) and a subtelomeric probe (Pctel2). All probes hybridized to multiple bands. The hybridization patterns of some of the bands, ranging in size from 7 kb to greater than 12 kb, were the same for all four probes. However, hybridization to other fragments was not coincident, with the PRT1 probes alone hybridizing to some high-molecular-mass fragments and also to low-molecular-mass fragments of less than 7 kb (Fig. 5).

DISCUSSION

We describe the cloning and characterization of copies of the PRT1 multi-gene family from P.
carinii f. sp. carinii. A copy of the PRT1 gene was isolated from a P. carinii f. sp. carinii genomic library. A different copy was isolated from a cDNA library, indicating that this copy of the gene was transcribed, and also identifying the presence of seven short introns in the genomic sequence. Consistent with many other P. carinii genes, the coding region and the flanking sequences of the PRT1 sequences showed a strong bias for adenine or thymine, in particular at the third base position of the codons. Similarly, the presence of short A+T-rich introns has been reported in other P. carinii genes. In the PRT1 sequences, the introns were not distributed throughout the gene: six of the seven introns were found in the subtilisin-like catalytic domain and the seventh in the P-domain. It is possible that the introns may play a role in restricting the variation in this region of the gene, whereas no introns were observed in the highly heterogeneous proline-rich region.

The high level of homology of the P. carinii PRT1 sequences to subtilisin-like serine proteases, in particular in the region of the catalytic domain, strongly suggested that this gene encoded a protease of this type. The predicted P. carinii PRT1 polypeptide sequences possessed the three essential residues of the catalytic active site as well as many other highly conserved motifs. The domain organization of the PRT1 gene strongly resembled that of the fungal prohormone-processing proteases, with the exception of the proline-rich domain. This proline-rich region is very uncommon in the subtilisin-like serine protease superfamily, although the XPR6 gene from Y. lipolytica is reported to contain a short region of a tetrapeptide repeat, the consensus sequence of the four amino acids being Glu-(Asp/Glu)-Lys-Pro (Enderlin & Ogrydziak, 1994).
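The A+T bias at the third base position of codons noted above can be quantified by tallying A+T content separately for each codon position of an in-frame coding sequence. The sketch below illustrates that calculation under stated assumptions: the function name is hypothetical, the input is assumed to be an intron-free coding sequence starting in frame, and the toy sequence is invented rather than taken from PRT1.

```python
def at_fraction_by_codon_position(cds):
    """Return the A+T fraction at codon positions 1, 2 and 3 of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this codon position
        at_count = sum(base in "AT" for base in bases)
        fractions.append(at_count / len(bases) if bases else 0.0)
    return fractions

# Invented toy sequence (six codons), for illustration only.
toy_cds = "ATGAAATTTGCAGGTCCA"
for position, fraction in enumerate(at_fraction_by_codon_position(toy_cds), start=1):
    print(f"codon position {position}: A+T = {fraction:.2f}")
```

A third-position value well above those of positions 1 and 2 would be consistent with the kind of bias described, since the third position is the least constrained by the encoded amino acid.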
A proline-rich region has also been found in the C-terminal tail domain of the mammalian serine protease acrosin, a proteolytic enzyme of sperm cells, located in the acrosome at the apical end of the spermatozoan (Klemm et al., 1991). In the African trypanosome, Trypanosoma brucei, a proline-rich domain has been identified in the procyclic acidic repetitive proteins (PARPs). These proteins are found on the cell surface of the insect form of the parasite and are encoded by a family of polymorphic genes which contain a variable region with heterogeneity in both length and sequence. The variable region contains the proline-rich domain and is primarily composed of the dipeptide Glu-Pro (Roditi et al., 1989).

Unlike any of the other fungal prohormone-processing proteases, which appear to be single-copy genes, the data reported in this study suggested that the PRT1 sequence is present in many copies, which are similar but not identical, in the genome of P. carinii f. sp. carinii. The relatively large number of recombinants present in both the genomic and the cDNA libraries suggested a multi-copy gene, and this was substantiated by PFGE data revealing that at least one copy of a PRT1 gene was present on all but one of the P. carinii chromosomes. Southern hybridization of restriction endonucleolytic digests of P. carinii f. sp. carinii DNA probed with PRT1 sequences also confirmed the presence of many copies of the gene. Analysis of sequence data generated by the amplification of the locus showed heterogeneity, suggesting that a variety of different copies of the gene were present in the P. carinii genome. Some domains, including the subtilisin-like catalytic domain and the P-domain, were highly conserved between gene copies, whereas the highest levels of divergence were observed in the proline-rich domain, which varied both in length and in sequence. Of five genomic clones analysed in this study, all possessed a copy of PRT1 contiguous with an MSG gene.
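The length and sequence variation of the proline-rich domain described above can be summarized per gene copy by counting the repeat motifs named in the Results (polyproline, Pro-Glu, Pro-Gln and Pro-Glu-Pro-Gln). The following is a minimal sketch with a hypothetical function name and an invented toy fragment; it assumes the proline-rich region has already been excised from the deduced amino acid sequence and written in one-letter code.

```python
import re

def summarize_proline_rich(region):
    """Summarize length, proline content and repeat-motif counts of a proline-rich region."""
    pro_runs = re.findall(r"P+", region)  # every uninterrupted run of proline residues
    return {
        "length": len(region),
        "proline_fraction": region.count("P") / len(region) if region else 0.0,
        "longest_pro_run": max((len(run) for run in pro_runs), default=0),
        "PE": region.count("PE"),      # Pro-Glu dipeptides
        "PQ": region.count("PQ"),      # Pro-Gln dipeptides
        "PEPQ": region.count("PEPQ"),  # Pro-Glu-Pro-Gln tetrapeptides (non-overlapping)
    }

# Invented toy fragment, for illustration only.
print(summarize_proline_rich("PPPPEPQPEPQPETQPP"))
```

Comparing such summaries across copies gives a compact view of how the order and number of repeats differ, without requiring the regions to be aligned.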
It has been reported that 15 independent genomic clones which encoded MSG were contiguous with the ORF3 sequence, which from our analysis appears to encode the proline-rich domain of PRT1 (Wada & Nakamura, 1994). It has been demonstrated that most copies of MSG are subtelomeric (Underwood et al., 1996; Sunkin & Stringer, 1996). It is therefore highly likely that many copies of the PRT1 multi-gene family are located in the subtelomeric regions of the P. carinii f. sp. carinii genome. However, PFGE analysis has shown that not every P. carinii f. sp. carinii chromosome contains a copy of PRT1, and the preliminary characterization of a clone of one of the subtelomeric regions of P. carinii f. sp. carinii has not revealed a copy of PRT1 (Underwood & Wakefield, unpublished results). Hybridization of MSG and subtelomeric probes to endonuclease-digested P. carinii f. sp. carinii DNA resulted in positive hybridization to fragments greater than approximately 7 kb in size. Probes derived from the PRT1 sequence hybridized to these bands but also to low-molecular-mass fragments, again suggesting that not all copies of PRT1 are subtelomeric.

Fig. 3. Alignment of the P. carinii PRT1(73j) deduced amino acid sequence with subtilisin-like serine proteases from Schiz. pombe (spkrp), Y. lipolytica (ylipo), Kluyveromyces lactis (klkex1), Sacch. cerevisiae (sckex2), C. elegans (celeg), D. melanogaster (furs-d), rat (ratkex) and human (furin and nec2-h). GenBank accession numbers are X82435, 540697, X07038, M24201, L29438, P26016, M83745, X17094 and P16519, respectively. Residues are numbered with reference to the P. carinii PRT1(73j) sequence. Gaps were introduced to maximize homology. The catalytic domain, the P-domain and the predicted conserved secondary structure elements, α-helix (hA-hH) and β-sheet (β1-β9), are marked on the alignment. +, conserved catalytic active site residues, Asp,,, (D), His,,, (H) and Ser,,, (S); *, conserved residues in the catalytic domain. [Sequence alignment not reproduced.]

Fig. 4. Southern hybridization analysis of P. carinii chromosomes separated by PFGE and probed with three different regions of the PRT1(Paga) gene. Lanes: 1, 3, 5 and 7, Sacch. cerevisiae DNA; 2, 4, 6 and 8, P. carinii DNA; 1 and 2, ethidium-bromide-stained gel; 3 and 4, probed with the 5' end of the PRT1(Paga) gene (9/4r product); 5 and 6, probed with part of the catalytic domain (U4 product); 7 and 8, probed with the 3' end of the gene (13112 product). The arrow indicates the P. carinii chromosome which did not hybridize to the PRT1 probes.

The P.
carinii PRT1 gene family shows some striking similarities to that of MSG. Both are composed of many genes, copies of which are found on most P. carinii chromosomes and show sequence heterogeneity. Some copies of PRT1 are contiguous with MSG and are located in the subtelomeric regions of the P. carinii chromosomes.

Fig. 5. Southern hybridization analysis of P. carinii genomic DNA digested with restriction endonucleases and probed with oligonucleotides designed to hybridize to the PRT1 sequences. Lanes: 1 and 3, P. carinii DNA digested with PstI; 2 and 4, P. carinii DNA digested with BamHI; 1 and 2, probed with an oligonucleotide designed to hybridize to the 5' end (pcprotS/RI); 3 and 4, probed with an oligonucleotide designed to hybridize to the catalytic domain (pcprot3/RI).

It is interesting to note that one of the major components of the cell surface of Leishmania has proteolytic activity. The Leishmania major surface protease (msp or gp63), a zinc endoprotease, is found in all species of Leishmania and is encoded by a family of genes, some of which are tandemly arrayed (Bouvier et al., 1989; Webb et al., 1991). Expression of different copies of the gene is regulated during the development of the parasite, and different isoforms of the protein are found in the promastigote stage in the gut of the sand fly and in the amastigote stage in the phagolysosomes of the macrophages (Frommel et al., 1990; Roberts et al., 1995; Ramamoorthy et al., 1995). The major surface protease is thought to play an important role in the virulence of Leishmania by involvement in the degradation of components of the extracellular matrix and by facilitating promastigote attachment to host macrophages (McMaster et al., 1994). Immunization with MSP protein confers partial protection of mice against Leishmania infection (Abdelhak et al., 1995).

The proteins encoded by the P. carinii PRT1 gene family show highest homology to subtilisin-like serine proteases.
A wide diversity of different types of precursor proteins are processed by this family of proteases to mature and active regulatory proteins, but the precise function of many of these proteases has not yet been determined. Some of the fungal homologues have been shown to function in the processing of several proteins, such as the Sacch. cerevisiae K E X 2 gene product which processes both the pheromone a-factor and the killer toxin (Fuller et al., 1989). The krp gene product from Schiz. pombe, which cleaves the pheromone precursor pro-P-factor to its active form, is thought to also Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 05 May 2017 06:37:59 Pneumocystis carinii serine protease function in the processing of other regulatory proteins since its activity is essential for cell viability (Davey et al., 1994). The XPR6 gene product from Y . lipolytica, although not essential for cell viability, when disrupted was found to cause aberrant growth and morphology (Enderlin & Ogrydziak, 1994). The function of the products of the P. carinii PRTl gene family is not yet understood but they are likely to play an important role in the life cycle and possibly also the pathogenicity of the organism. life stages of Leishmania. Mol Biochem Parasitol 38, 25-32. Fuller, R. S., Brake, A. & Thorner, J. (1989). Yeast prohormone processing enzyme (KEXZ gene product) is a Ca2+-dependent serine protease. Proc Natl Acad Sci USA 86, 1434-1438. Garbe, T. R. & Stringer, J. R. (1994). Molecular characterization of clustered variants of genes encoding major surface antigens of human Pneumocystis carinii. lnfect lmmun 62,3092-3 101. Gigliotti, F. (1992). Host species-specific antigenic variation of a mannosylated surface glycoprotein of Pneumocystis carinii. ] lnfect Dis 165,329-336. Gros, ACKNOWLEDGEMENTS This research was supported by the Medical Research Council (E. B. L.) and the Royal Society (A. E. 
W.), and formed a part of the European Concerted Action, Biomed I, PL941118. We should like to thank D r C. J. Delves and D r F. Volpe, Glaxo Wellcome Research, for the gift of the P. carinii cDNA library and the MSG clone, and the Oxford Pneumocystis Research Group for helpful discussion. REFERENCES Abdelhak, Louzir, H., Timm, J., Blel, L., Banlasfar, Lagranderie, M., Gheorghiu, M., Dellagi, K. & Gicquel, B. (1995). 5.8 Frommel, T. O., Button, L. L.8 Fujikura, Y. & McMaster, W. R. (1990). The major surface glycoprotein (GP63) is present in both z.8 Recombinant BCG expressing the leishmania surface antigen Gp63 induces protective immunity against Leishmania major infection in BALB/c mice. Microbiology 141, 1585-1592. Banerji, S., Wakefield, A. E., Allen, A. G., Maskell, D. J., Peters, S. E. & Hopkin, J. M. (1993). The cloning and characterization of the mom gene of Pneumocystis carinii. J Gen Microbiol 139, 2901-29 14. Betzel, C., Pal, G. P. & Saenger, W. (1988). Three-dimensional structure of proteinase K at 0.15 nm resolution. Eur J Biochem 178, 155-171. Bott, R., Ultsch, M., Kossiakoff, A,, Graycar, T., Katz, B. & Power, 5. (1988). The three-dimensipnal structure of Bacillus amyl- oliquefaciens subtilisin at 1-8 A and an analysis of the structural consequences of peroxide inactivation. J Biol Chem 263, 7895-7906. Bouvier, J., Bordier, C., Vogel, H., Reichelt, R. & Etges, R. (1989). Characterization of the promastigote surface protease of Leishmania as a membrane-bound zinc endopeptidase. Mol Biochem Parasitol37, 235-246. Cushion, M. T., Kaselis, M., Stringer, 5. L. & Stringer, J. R. (1993). Genetic stability and diversity of Pneumocystis carinii infecting rat colonies. lnfect Zmmun 61, 48014813. Davey, 1. Davis, K., Imai, Y., Yamamoto, M. & Matthews, G. (1994). Isolation and characterization of krp, a dibasic endo- peptidase required for cell viability in the fission yeast Schizosaccharomyces pombe. EMBO ] 13,5910-5921. Dyer, M., Volpe, F., Delves, C. 
J., Somia, N., Burns, S. & Scaife, 1. G. (1992). Cloning and sequence of a B-tubulin cDNA from Pneumocystis carinii : possible implications for drug therapy. Mol Microbiol6, 991-1001. Enderlin, C. S. & Ogrydziak, M. (1994). Cloning, nucleotide sequence and functions of XPR6, which codes for a dibasic processing endoprotease from the yeast Yarrowia lipolytica. Yeast 10, 67-79. P.8 Betzel, C., Dauter, z.8 Wilson, K. 5. & Hol, W. G. J. (1989). Molecul$r dynamics refinement of a thermitase-eglin-c-complex at 1.98 A resolution and comparison of two crystal forms that differ in calcium content. ] Mol Biol210, 347-367. HaidariS, P. J., Wright, T. w . 8 Gigliotti, F. & Haidaris, C. G. (1992). Expression and characterization of a cDNA clone encoding an immunodominant surface glycoprotein of Pneumocystis carinii. J lnfect Dis 166, 1113-1123. Hirono, 5.8 AkagFwa, H., litaka, Y. & Mitsui, Y. (1984). Crystal structure at 2.6 A resolution of the complex of subtilisin BPN with Streptomyces subtilisin inhibitor. J Mol Biol 178, 389414. Klemm, U., MUller-Esterl, W. 81 Engel, W. (1991). Acrosin, the peculiar sperm-specific serine protease. Hum Genet 87,635-641. Kovacs, J. A., Powell, F., Edman, 1. C., Lundgren, B., Martinez, A., Drew, B. & Angus, C. W. (1993). Multiple genes encode the major surface glycoprotein of Pneumocystis carinii. J Biol Chem 268, 6034-6040. Lundgren, B., Lipschik, G.Y. & Kovacs, 1. A. (1991). Purification and characterization of a major human Pneumocystis carinii surface antigen. J Clin Znvest 87, 163-170. McMaster, R. R., Morrison, C. J., MacDonald, M. H. & JOShi, P. B. (1994). Mutational and functional analysis of the Leishmania surface metalloproteinase GP63 : similarities to matrix metal- loproteinases. Parasitology 108, S29-S36. McPhalen, C. A. 81James, M. N. G. (1988). Structural comparison of two serine proteinase-protein inhibitor complexes : eglin-csubtilisin Carlsberg and CI-Zsubtilisin Novo. Biochemistry 27, 65824598. van den Ouweland, A. M. 
W., van Duijnhove, H. L. P.8 Keizer, G. D., Dorssers, L. C. 1. & Van de Ven, W. 1. M. (1990). Structural homology between the human fur gene product and the subtilisinlike protease encoded by yeast KEX2. Nucleic Acids Res 18,664. Peters, 5. E., Wakefield, A. E., Banerji, 5. & Hopkin, J. M. (1992). Quantification of the detection of Pneumocystis carinii by DNA amplification, Mol Cell Probes 6, 115-117. Ramamoorthy, R., Swihart, K. G., McCoy, 1. J., Wilson, M. E. & Donelson, J. E. (1995). Intergenic regions between tandem gp63 genes influence the differential expression of gp63 RNAs in Leishmania chagasi promastigotes. ] Biol Chem 270, 12133-12139. Roberts, 5. C., Wilson, M. E. & Donelson, J. E. (1995). Developmentally regulated expression of a novel 59-kDa product of the major surface protease (Msp or gp63) gene family of Leishmania chagasi. J Biol Chem 270, 8884-8892. Roditi, I., Schwan, H., Peanon, T. W., Beecroft, R. P., Liu, M. K., Richardson, J. T., Buhring, H. J., Pleiss, J., Bulow, R., Williams, R. 0. & Overath, P. (1989). Procyclin gene expression and loss of the variant surface glycoprotein during differentiation of Trypanosoma brucei. J Cell Biol 108,737-746. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Fri, 05 May 2017 06:37:59 2235 E. B. LUGLI, A. G. A L L E N a n d A . E . W A K E F I E L D Roebroek, A. 1. M., Creemers, J. W. M., Pauli, 1. G. L., KurzikDumke, U., Rentrop, M., Gateff, E. A. F., Leunissen, 1. A. M. &Van de Ven, W. 1. M. (1992). Cloning and functional expression of Dfurin2, a subtilisin-like proprotein processing enzyme of Drosophila melanogaster with multiple repeats of a cysteine motif. J Biol Chem 267, 17208-17215. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989). Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory. Sanger, F., Nicklen, 5. & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A 74, 5463-5467. 
Siezen, R. J., de Vos, W. M., Leunissen, J. A. M. & Dijkstra, B. W. (1991). Homology modelling and protein engineering strategy of subtilases, the family of subtilisin-like serine proteinases. Protein Eng 4, 719-737.
Stringer, S. L., Garbe, T., Sunkin, S. M. & Stringer, J. R. (1993). Genes encoding antigenic surface glycoproteins in Pneumocystis from humans. J Eukaryot Microbiol 40, 821-826.
Sunkin, S. M. & Stringer, J. R. (1996). Translocation of surface antigen genes to a unique telomeric expression site in Pneumocystis carinii. Mol Microbiol 19, 283-295.
Sunkin, S. M., Stringer, S. L. & Stringer, J. R. (1994). A tandem repeat of rat-derived Pneumocystis carinii genes encoding the major surface glycoprotein. J Eukaryot Microbiol 41, 292-300.
Tanguy-Rougeau, C., Wesolowski-Louvel, M. & Fukuhara, H. (1988). The Kluyveromyces lactis KEX1 gene encodes a subtilisin-type serine proteinase. FEBS Lett 234, 464-470.
Teplyakov, A. V., Kuranova, I. P., Harutyunyan, E. H. & Vainshtein, B. K. (1990). Crystal structure of thermitase at 1.4 Å resolution. J Mol Biol 214, 261-279.
Underwood, A. P., Louis, E. J., Borts, R. H., Stringer, J. R. & Wakefield, A. E. (1996). Pneumocystis carinii telomere repeats are composed of TTAGGG and the subtelomeric sequence contains a gene encoding the major surface glycoprotein. Mol Microbiol 19, 273-281.
Van de Ven, W. J. M. & Roebroek, A. J. M. (1993). Structure and function of eukaryotic proprotein processing enzymes of the subtilisin family of serine proteases. Crit Rev Oncog 4, 115-136.
Volpe, F., Dyer, M., Scaife, J. G., Derby, G., Stammers, D. K. & Delves, C. J. (1992). The multifunctional folic acid synthesis fas gene of Pneumocystis carinii appears to encode dihydropteroate synthase and hydroxymethyldihydropterin pyrophosphokinase. Gene 112, 213-218.
Volpe, F., Ballantine, S. P. & Delves, C. J. (1993).
The multifunctional folic acid synthesis fas gene of Pneumocystis carinii encodes dihydroneopterin aldolase, hydroxymethyldihydropterin pyrophosphokinase and dihydropteroate synthase. Eur J Biochem 216, 449-458.
Wada, M. & Nakamura, Y. (1994). MSG gene cluster encoding major cell surface glycoproteins of rat Pneumocystis carinii. DNA Res 1, 163-168.
Wada, M., Kitada, K., Saito, M., Egawa, K. & Nakamura, Y. (1993). cDNA sequence diversity and genomic clusters of major surface glycoprotein genes of Pneumocystis carinii. J Infect Dis 168, 979-985.
Wada, M., Sunkin, S. M., Stringer, J. R. & Nakamura, Y. (1995). Antigenic variation by positional control of major surface glycoprotein gene expression in Pneumocystis carinii. J Infect Dis 171, 1563-1568.
Webb, J. R., Button, L. L. & McMaster, W. R. (1991). Heterogeneity of the genes encoding the major surface glycoprotein of Leishmania donovani. Mol Biochem Parasitol 48, 173-184.
Wright, T. W., Simpson-Haidaris, P. J., Gigliotti, F., Harmsen, A. G. & Haidaris, C. G. (1994). Conserved sequence homology of cysteine-rich regions in genes encoding glycoprotein A in Pneumocystis carinii derived from different host species. Infect Immun 62, 1513-1519.
Wright, T. W., Bissoondial, T. Y., Haidaris, C. G., Gigliotti, F. & Simpson-Haidaris, P. J. (1995). Isoform diversity and tandem duplication of the glycoprotein A gene in ferret Pneumocystis carinii. DNA Res 2, 77-88.
Zhang, J. & Stringer, J. R. (1993). Cloning and characterization of an alpha-tubulin-encoding gene from rat-derived Pneumocystis carinii. Gene 123, 137-141.
Received 21 January 1997; revised 7 March 1997; accepted 20 March 1997.