BIOLOGICAL SCIENCES

X-ray structure of the N and C-terminal domain of a coronavirus nucleocapsid protein; structural basis of helical nucleocapsid formation

Hariharan Jayaram, Hui Fan&, Brian R. Bowman, Amy Ooi&, Jyothi Jayaram, Ellen W. Collison, Lescar Julian, B.V. Venkataram Prasad

Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, U.S.A.; Department of Veterinary Pathobiology, Texas A&M University, College Station, Texas 77843, U.S.A.; School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551

Abstract: Coronaviridae cause a variety of respiratory and enteric diseases in animals and man, including SARS, a disease with emerging global impact. The enveloped capsids of the virus enclose the single-stranded genome associated with the nucleocapsid protein (N protein). Using limited proteolysis we identified two stable domains of the nucleocapsid protein from infectious bronchitis virus (IBV). We present here the crystal structures of the N- and C-terminal domains (NTD and CTD) of the IBV N protein. The NTD, with basic residues concentrated on two long tethers, constitutes an RNA-interacting module. The CTD exists as an intimate domain-swapped dimer that tends to organize into helical arrays. Inferring from crystal packing interactions observed at different pHs for the NTD and CTD, we hypothesize that the CTD is the key determinant of helical nucleocapsid formation in the virus. Similarity between the CTD and the capsid-forming domain of a related virus family reveals that this fold constitutes a new class of viral capsid folds employed in viruses with helical nucleocapsids.
The coronavirus nucleocapsid is thus made up of an N-terminal RNA-binding core connected to a C-terminal capsid-forming domain that together organize the helical nucleocapsid in the virus.

Coronaviridae, a member of the order Nidovirales, is a family of enveloped, positive-strand ssRNA viruses that are a significant cause of human upper respiratory infections such as the common cold and of severe illnesses such as SARS (severe acute respiratory syndrome). Their capsids range in diameter from 80 to 160 nm and enclose a single 30 kb segment of positive-sense ssRNA (Siddell 1995). Upon infection and cell entry the genomic RNA gives rise to a 3’ co-terminal set of four or more subgenomic mRNAs with a common leader sequence at their 5’ ends. These subgenomic RNAs encode the various viral structural and non-structural proteins required to replicate the virus and produce progeny virion capsids. The enveloped capsid of the virus is predominantly made up of the membrane glycoprotein (M), a small transmembrane protein (E) and an array of spikes composed of the spike glycoprotein (S), which gives the spherical particles a corona. A significant protein component of the capsid is the nucleocapsid protein (N), which interacts with the genomic ssRNA to form the central core of the virion. Electron microscopic studies of detergent-permeabilized capsids of transmissible gastroenteritis virus (TGEV, a prototype coronavirus) revealed that the internal nucleocapsid is helical and is composed of the ssRNA genome tightly associated with the N protein (Risco, Anton et al. 1996; Risco, Muntion et al. 1998). The N protein is typically a multifunctional basic phosphoprotein with a molecular weight of 50 kDa to 60 kDa, and both its coding RNA and the protein itself are synthesized in large amounts during an infection (Stohlman and Lai 1979; Lai and Cavanagh 1997).
The highly basic N protein was shown to have a general RNA-binding ability, with an increased affinity for the corresponding viral RNA (Cologna and Hogue 1998), and to bind consensus sequences at the 5’ and 3’ termini of the viral RNA. During the virus life cycle multiple copies of the N protein interact extensively with the genomic as well as the subgenomic RNAs that are synthesized (Baric, Nelson et al. 1988; Narayanan, Kim et al. 2003) and possibly participate in genome packaging, which is initiated by recognition of a packaging signal by the M protein. The M and N proteins also interact closely via their C termini, an interaction that is very important for proper genome encapsidation and nucleocapsid formation. In addition, the N protein plays a role in controlling mRNA transcription, translation and replication (Lai and Cavanagh 1997; Tahara, Dietlin et al. 1998; Schelle, Karl et al. 2005). The abundance of N produced during an infection also gives it an important role in host modulation. Accordingly, the N protein has been shown to interact with cyclophilin, an immunomodulator, to activate the AP-1 pathway involved in cell cycle control, to enter the nucleus and to induce apoptosis in certain cell types (He, Leeson et al. 2003; Luo, Luo et al. 2004; Surjit, Liu et al. 2004). The N protein is also a major immunogen and an important diagnostic marker for coronavirus disease (Leung, Tam et al. 2004) and can help improve the efficacy of avian coronavirus vaccines (Cavanagh 2003; Zhao, Cao et al. 2005).

Materials and Methods

Purification of full length nucleocapsid protein and identification of tryptically stable fragments: Full-length nucleocapsid protein was expressed as described previously. The protein was then further purified by heparin affinity chromatography, concentrated to 1-2 mg/ml and checked for monodispersity by dynamic light scattering (DynaPro) and negative-stain electron microscopy.
Limited proteolytic cleavage of full-length N protein (1-2 mg/ml) was carried out with 2% (wt trypsin/wt protein) sequencing-grade trypsin (Roche) to identify tryptically stable domains. The identity of the amino termini of the proteolytic product(s) was ascertained by N-terminal amino acid sequencing of the band following gel electrophoresis and blotting onto PVDF. For construct optimization the carboxy termini were estimated from the predicted secondary structure in the terminal region and from mass spectrometric characterization of the proteolyzed protein.

Cloning, expression, purification and crystallization of the tryptic fragments of nucleocapsid protein: All proteins were cloned and expressed as GST fusion proteins in the pET-41 Ek/LIC vector (Novagen) using the LIC methodology. The expressed protein was purified by affinity chromatography on glutathione Sepharose (Pharmacia) followed by on-bead cleavage with enterokinase (EK-Max, Invitrogen). The cleavage reaction was performed by suspending 1 ml of beads in 40 ml of cutting buffer (250 mM NaCl, 50 mM Tris-HCl pH 8.0) with 10 units of protease. Following proteolysis the dilute supernatant was purified further by gel filtration chromatography on a Superdex 75 16/60 column (Pharmacia). The purified N-terminal and C-terminal domains were concentrated to 5-8 mg/ml and used for crystallization trials using Crystal Screen I (Hampton Research) followed by the Index screens 2 and 3 (Jena Biosciences), which were used to design the optimization strategy.

Data collection and phasing: Data were collected at various beamlines as indicated in Table I. For each crystal 180 or 360 oscillation images with a 1° oscillation angle were collected, using the inverse beam approach with a wedge size of 30° for the MAD data sets and a continuous wedge of 180° for the native data sets. For the NTD the data were phased by molecular replacement in PHASER with the NTD coordinates (Hui Fan et al.).
Following molecular replacement, further model building and refinement were performed in the same manner as for the CTD, described below. For the CTD the native and selenomethionine data were integrated and scaled using the HKL2000 suite. From the two-wavelength multiwavelength anomalous dispersion (MAD) dataset obtained for selenomethionine-substituted protein (pH 4.5 crystal form, Table I), four methionine positions were located using the Shake-and-Bake program. The initial solution was then refined, and phases were calculated and density modified, using SHARP. The model was built using COOT and refined with CNS, which was used for the initial rounds of simulated annealing refinement, followed by refinement using REFMAC5. The structures of the CTD in the other crystal forms were phased by molecular replacement from the above model as implemented in the program PHASER. Model bias was reduced by using the prime-and-switch methodology implemented in SOLVE/RESOLVE. All figures were generated using PyMOL and ESPript (Figure 5) and annotated in Adobe Illustrator.

Results and Discussion: The full-length N protein from infectious bronchitis virus had been purified and characterized previously (Zhou and Collisson 2000). The N protein interacts strongly with the 5’ and 3’ conserved sequences of IBV RNA and also undergoes phosphorylation in infected cells to generate multiple isoforms. Our structural characterization of full-length N protein was impeded by its aggregation and degradation on storage under a variety of conditions. Purified full-length N protein was also extremely polydisperse in solution, as characterized by dynamic light scattering, and was not amenable to detailed structural characterization by protein X-ray crystallography. We therefore adopted a divide-and-conquer approach to study the protein structurally.
Using limited proteolysis we sought to identify regions of the protein that represented stable domains resistant to proteolysis under limiting amounts of the proteases trypsin (which cleaves after the basic residues Arg and Lys) and V8 protease (which cleaves after the acidic residues Glu and Asp). The digestion pattern with V8 protease was not very distinct and yielded several diffuse bands (data not shown). Trypsin proteolysed the full-length protein to a single ~17 kDa band on a 17% denaturing SDS-PAGE gel within 15 minutes of trypsinization. The “single” band thus observed was resistant to further degradation even upon trypsinization for several hours and represented a stable region(s) of the protein. Using N-terminal sequencing of this band we identified four tryptic fragments, arising from two major cleavage sites at residues 19 and 219 and two secondary cleavage sites at residues 27 and 226 (Figure 1a). The optimized domain constructs, termed NTD (N-terminal domain) and CTD (C-terminal domain), were then cloned, expressed and purified to homogeneity. The N-terminal domain thus identified was monomeric at moderate concentrations, while the C-terminal domain was a dimer even at very low concentrations as assayed by gel-filtration chromatography. The NTD and CTD proteins tended to aggregate during purification and were therefore purified at very low concentrations and concentrated only prior to crystallization screening. The NTD and CTD proteins also failed to interact with each other at a variety of salt and protein concentrations, as assayed by gel-filtration co-fractionation and pull-down experiments (data not shown). NTD and CTD therefore represent independent domains of the full-length protein and were suitable for separate structure determination. Crystals of both the NTD and CTD were obtained in a variety of conditions.
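The protease specificities invoked in the limited proteolysis experiment can be illustrated with a short in-silico digest. This is only a minimal sketch under stated assumptions: it applies the simple residue rules given in the text (trypsin after Arg/Lys, V8 after Glu/Asp), ignores refinements such as reduced cleavage before proline, and uses a made-up toy peptide rather than the IBV N sequence. The function names are hypothetical.

```python
# Simplified cleavage rules taken from the text; real proteases have
# additional context dependence that is deliberately ignored here.
TRYPSIN = frozenset("KR")   # cleaves C-terminal to Lys and Arg
V8 = frozenset("DE")        # cleaves C-terminal to Asp and Glu


def cleavage_sites(sequence, residues):
    """Return 1-based positions of residues after which a cut may occur.

    The last residue is skipped because cutting after it yields no fragment.
    """
    seq = sequence.upper()
    return [i + 1 for i, aa in enumerate(seq[:-1]) if aa in residues]


def fragments(sequence, residues):
    """Return the peptides produced by a complete digest under the simple rule."""
    seq = sequence.upper()
    bounds = [0] + cleavage_sites(seq, residues) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy_peptide = "MAKDERGGKA"  # hypothetical sequence, for illustration only
    print(cleavage_sites(toy_peptide, TRYPSIN))
    print(fragments(toy_peptide, TRYPSIN))
    print(fragments(toy_peptide, V8))
```

A complete digest such as this one lists every potential site; in a limited digest only the sites exposed in flexible regions are cut, which is why the protected stretches between them can be read as stable domains.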
For the NTD, initial phasing attempts were carried out with several mutagenized proteins carrying incorporated methionines, since the wild-type IBV-Gray sequence in this region (residues 119-162) did not possess any Cys or Met residues. These mutant protein crystals failed to yield a structure owing to pseudo-body centering and diffraction to only moderate resolution (~3.5 Å), which seriously limited the quality of the anomalous signal and made phasing impossible. The crystal structure of a similar construct of IBV-N (Beaudette strain) was then solved by some of the co-authors and was used as a model to phase, by molecular replacement, the high-resolution 1.3 Å data obtained for the wild-type N protein. The CTD phases were successfully obtained using anomalous data collected at two wavelengths. The four methionine positions identified yielded an excellent map to 2.2 Å with an initial figure of merit (FOM) of 0.65. Although almost 80% of the model could be traced automatically with ARP/wARP, manual building yielded a refined model, which was used to phase the other native data for the CTD by molecular replacement. In all, we have solved the structure of the NTD in two space groups and of the CTD in three space groups in this study. The multiple packing arrangements seen for the NTD and CTD in the structures presented here suggest possible modes of interaction for these domains of the nucleocapsid protein and a possible model for nucleocapsid organization in coronaviruses.

High resolution structure of the N-terminal domain: The secondary structure and fold of the N-terminal domain solved at 1.3 Å are almost identical to those of the IBV N protein (Beaudette strain) reported before, with the exception of five additional residues discernible at the N-terminus. Briefly, the structure is composed of a relatively acidic globular core made up of a twisted anti-parallel β-sheet center surrounded by a number of loop regions.
Prominent among the loop regions are two long loops, one corresponding to the N-terminal 12 amino acids of this domain (residues 22 to 34) and the other spanning residues 74 to 86, that extend outward like long tethers from the globular base and give the monomer a U shape. The previous NTD structure by Hui Fan et al. consisted of two dimers in the asymmetric unit in which the U-shaped monomers were arranged in a side-to-side fashion (Figure 1b). The NTD in this study crystallized as an asymmetric homodimer with two interlocking monomers arranged in a head-to-tail fashion in the crystal and asymmetric unit (ASU). The interlocking of the U-shaped monomers is aided by the predominantly basic N-terminal residues 22 through 29, which extend outward (Figure 4a) and occupy an acidic groove situated on one face of the base of the U. The extended conformation of this N-terminal “tether” arm, which interacts with the neighboring monomer in the ASU, is in contrast to its random loop conformation in the structure by Hui Fan et al. Besides the N-terminal region, the other protruding segment is a highly basic loop spanning residues 74 to 88 that wraps around the other face of the NTD monomer. The region from 79 to 83 is disordered in this monomer, as is the entire loop from residues 75 to 87 in the neighboring monomer. The natural twist in the asymmetric dimeric interaction results in this loop alternately interacting with the NCS-related monomer and being exposed and disordered. This flexible loop, which is rich in conserved basic residues, therefore represents a mobile region that is prone to interacting with other NTD monomers and free to interact with other ligands such as RNA. The buried surface area in this NTD dimer is 2168 Å² and represents a fairly tight dimeric interaction.
The dramatic difference in packing of the NTD dimers observed here possibly results from the presence of magnesium ions and from the pH, both of which differ between the two observed crystal forms. The head-to-tail interaction propagates into a linear array of NTD monomers with basic loops overhanging from alternating monomers (Figure 4a); these loops possibly represent the RNA-interacting regions of the NTD, which was clearly shown to bind RNA (Hui Fan et al.). These basic loops, with their dynamic and pH-dependent protein-protein interactions, may play an important role in nucleocapsid assembly and disassembly.

Structure of the CTD: The CTD had to be purified under extremely dilute conditions and migrated as a dimer during gel filtration. The concentrated protein crystallized readily as needles, rods or flat sheets, indicating a strong tendency of the protein to organize itself in two dimensions. At a slightly lower pH, as in crystal form I (Table I), hexagonal crystals were obtained that were unusually three-dimensional. These bipyramidal crystals were, however, poorly packed, diffracted extremely anisotropically to 3.5 Å resolution and gave an almost helical diffraction pattern when diffracted along the long axis of the crystal. This behavior is also characteristic of the strong tendency of the protein to organize along two dimensions. The CTD was successfully phased from two-wavelength anomalous data obtained for selenomethionine-substituted protein from rod-shaped crystals, which were very rarely obtained in crystal condition I. The phases for the CTD in the two other crystal forms were obtained by molecular replacement.

Structure of the CTD dimer: The CTD exists in all three crystal forms as an intimate domain-swapped dimer (Figure 2).
The domain swapping is brought about by interaction between β-strands of one monomer and the surrounding helices and loops of the other monomer to form a reciprocated, closed domain-swapped dimer akin to that seen in crystal structures of cystatin C and RNase A (Janowski, Kozak et al. 2001; Newcomer 2001). Accordingly, a 12-residue β-strand, β2 (residues 295 to 307), constitutes the interface between the two monomers (Figure 2, bottom). The overall topology of the IBV-N dimer can be described as a concave β-stranded floor of ~400 Å² area with the topology β1B-β2B-β2A-β1A, surrounded by helices and loops. Helices 3 and 4, connected by a loop region, arch over this floor and constitute the roof of the dimer. A 12-residue α-helix, α5, located at the extreme C-terminus of the CTD forms an angled wall that flanks either side of the dimer and is held in place by a tight turn made up of residues 307 to 310 (purple boxed residues, Figure 2 and Figure 5). The integrity of the dimer observed in solution is apparent when one considers the ~5000 Å² of buried surface area involved in the dimerization. The dimeric structure observed at pH 4.5 was almost identical to all four dimers observed in the asymmetric unit at pH 8.5 and to the dimer observed in the ASU of crystal form III, with the rmsd for Cα atoms in the core region (233 to 328) being ~0.3 Å. This observation is in concordance with several biochemical studies that mapped the dimerization domain to this stretch in several homologs (Surjit, Liu et al. 2004; Yu, Gustafson et al. 2005). The presence of one dimer in the ASU in two crystal forms and four dimers in the ASU in the other crystal form allowed the analysis of dimer-dimer interactions not only at different pHs and crystallization conditions but also in the presence and absence of any constraints imposed by crystal packing.
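The Cα rmsd values quoted for the superposed dimers are root-mean-square deviations computed after optimal rigid-body superposition of equivalent atoms. The program used for the superpositions is not stated in the text, so the following is only a minimal sketch of one standard way to obtain such a number, the Kabsch algorithm, written with NumPy. The function name and the synthetic coordinates in the demo are assumptions for illustration and are not data from this study.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    P and Q must list equivalent atoms (e.g. C-alpha atoms) in the same order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove the translational component by centring both sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the residual deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(96, 3))            # synthetic "C-alpha" atoms
    theta = 0.7
    rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta),  np.cos(theta), 0.0],
                    [0.0,            0.0,           1.0]])
    moved = coords @ rot.T + np.array([1.0, -2.0, 3.0])
    print(kabsch_rmsd(coords, moved))            # close to zero: pure rigid motion
```

Because the rotation and translation are removed before the deviation is measured, a low value reflects genuine conformational similarity between the compared dimers rather than a shared orientation in the crystal.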
Crystal packing interactions in CTD: insights into the stability of helical packing interactions: The two dimeric structures presented here result in five kinds of inter-dimer interactions. Crystal packing in crystal I is brought about by dimer-dimer interactions, with the nth dimer interacting with the n-1 and n+1 dimers from neighboring ASUs (Figure 3b), burying a surface area of 1182 Å². In crystal II, with four dimers in the ASU, inter-dimer interactions are responsible for keeping the four dimers in the ASU together as well as for mediating crystal packing (Figure 4a). Accordingly, this gives rise to four classes of dimer-dimer interactions. Two of them (termed class I dimer-dimer interactions), i.e. AB-CD and CD-EF, and the crystal packing interaction wherein the GH dimer from one ASU interacts with the AB dimer from the neighboring ASU (i.e. GH:ABn+1) belong to the same class as seen in the pH 4.5 crystal form and bury an almost identical surface area of 1122 Å². This high-pH crystal form also displays a new class of “dimer-dimer” interaction, which involves the GH dimer and an interface formed by the CD-EF dimers (Figure 4a). This tri-dimeric interaction buries a surface area of 1385 Å². The uniformity of all but the last kind of dimer-dimer interaction observed in the two crystals is apparent from a superposition of all four types of dimer-dimer interactions observed between the two crystals, in which the dimers all superpose with a minimum rmsd of 0.3 Å and a maximum rmsd of 0.8 Å (yellow dimer in Figure 3b). When the three dimers (Dn+1-D-Dn-1) from three neighboring ASUs of crystal I are superposed on the three dimers from within the ASU of crystal II, the rmsd between them is ~1.0 Å. These interactions primarily involve residues between 308 and 328, which constitute a type II turn (TT in Figure 5), α5 and the terminal loop in the CTD (Figure 1, lilac boxes).
Apart from the class I dimer interactions, the crystal packing interactions in crystal II (dimer GH interacting with dimer ABn+1) bury a surface area of only 600 Å² and are brought about by a swiveling away of the GH dimer, prompted possibly by its strong interaction with the CD-EF dimers from within the ASU. This clearly indicates that the dimers tend to swivel only slightly with respect to each other and constitute a subtle module that is well suited to interacting with itself (arrow, Figure 3b, middle panel). Although there is no significant surface complementarity between the two molecules, the predominant interaction between dimers is a salt bridge between Arg-308 from one dimer and Asp-314 from a neighboring dimer (Figure 3b). The salt bridge and the orientation of the dimers remain almost identical between the structures at pH 4.5 and pH 8.5. The inter-dimer interactions other than the salt bridge are strictly van der Waals interactions. The multimerization interaction, in addition to the dimerization interactions seen in the CTD, is very well maintained over this wide range of pHs. The ionic strength of the two crystal conditions is also different, providing further evidence for the stability of the dimer-dimer packing interactions. Interestingly, the structure of crystal form II of the CTD, which is of the IBV-Beaudette strain, has a cysteine residue instead of Arg 308 in this position. Although no inter-dimer disulfide bond is seen in the structure, the interaction of cysteine residues across this dimeric interface may facilitate formation of a disulfide bond that replaces the salt bridge in this strain of IBV-N protein. The additional dimer (GH) is clearly auxiliary (and not part of the primary fibre, see below) and reveals a higher mode of interaction with CTD dimers. The interacting surface comprises residues from all over the dimers (underlined residues in Figure 1; T2 dimer interface in Figure 3a).
Since this interaction involves three different molecules and yet the buried surface area is similar (~1200 Å²) to that of the primary crystal-packing (or fibre-forming) interaction, we hypothesize that it is less likely and therefore secondary to the primary interaction seen for the other dimers. Considering that this dimer mediates crystal packing in this space group through the same region on its other face, the tight salt bridge observed between R308 and D315 is preserved in only one of the cases and disrupted in the two-fold related case. Despite this skewing the overall rmsd is only 0.8 Å, indicating the extreme adaptability of the dimer, with α5 and the preceding loop mediating these interactions. This additional interaction also leads to the possibility that the fibre-hexamer made up of dimers 1-2-3 with a dimer 4 appendage could circularize or form planar triangles under certain conditions, with the GH dimer serving as a bridge to bring the otherwise rigid 1-2-3 fibres together. Such bridging interactions may indeed be necessary for spherical particle formation driven by triangularization of three hexamers, with the fourth dimer serving as the linker. In addition, the greater flexibility of various regions of the protein at alkaline pH, as inferred from their large B factors, coupled with the swiveling seen in the dimer 4-dimer 1 crystal packing interaction in crystal II, could represent a snapshot of the disassembly of dimer-dimer interactions, which may be important for nucleocapsid disassembly and genome release.

Electrostatic surface, conservation of surface residues and interaction with other capsid components: The GRASP surface of the NTD crystal clearly shows predominantly basic exposed patches and the head-to-tail linear array formed by NTD dimers interlocking with each other, aided by the N-terminal basic loop protrusion. The NTD fibre is quite loose and is possibly more flexible to allow for interactions with RNA.
Besides, a significant part of the loop from residues 74 to 88 is disordered in every alternating dimer in the crystal (Figure 4a). The looseness of this dimer might compensate for the relative rigidity of the CTD dimer. Analysis of the GRASP surface of the CTD fibre as constructed from the pH 8.5 octamer further reveals that the surface is primarily acidic, with a swath of basic residues running in an expectedly helical fashion throughout the fibre (Figure 4b). Although the primary interactions with RNA are conferred by the N terminus, secondary interactions may be facilitated by this basic stretch, which is clearly solvent exposed. The clearly demonstrated role of the CTD in mediating dimerization and the demonstrated affinity of the NTD construct for RNA therefore suggest a degree of specialization and cooperativity between the two domains, with the CTD mediating dimerization and secondary interactions with RNA and the NTD also mediating oligomerization important for fibre formation and interactions with RNA. Such a two-domain organization is similar to the nucleocapsid protein of HIV where one of the domains (Gamble, Yoo et al. 1997)

Fibre formation: The clear tendency of the dimer-dimer interaction to promote fibre formation is evident from superposition of three dimers from both space groups (Figure 4c). The relevance of this interaction is greater when one considers that it occurs, as discussed above, at both pHs and also occurs free of crystal-packing-induced forces at the alkaline pH. The dimer-induced fibre formation is even more striking when one puts it in the context of the relatedness of the protein to another capsid-forming domain, that of the N protein from a related virus.

Similarity to other nucleocapsid proteins and evolutionary implications for viral architecture: A DALI search of the PDB revealed a very striking similarity to the 73-amino-acid capsid-forming domain of PRRSV, a coronavirus-like virus that is a member of the order Nidovirales.
This match had a high similarity Z-score with a corresponding rms deviation of 2.8 Å. PRRSV is also a positive-sense single-stranded RNA virus with a similarly large genome. PRRSV also forms a helical nucleocapsid, and the full-length N protein was shown to form fibres in solution (Doan and Dokland 2003). Similar helical nucleocapsids have been observed in the orthomyxovirus, paramyxovirus, filovirus, rhabdovirus, bunyavirus and arenavirus families, all of which contain genomic RNA associated with their respective nucleocapsid proteins (Narayanan, Kim et al. 2003). The capsid-forming domain of PRRSV also packed into helical arrays through crystal contacts in the crystal studied. The arrangements of the CTD, PRRSV and MS2 coat protein all show a similar feature of an anti-parallel β-strand floor with flanking helices and loops. The major difference between the CTD and PRRSV structures lies in the fact that the CTD floor is more concave while the PRRSV floor is perfectly flat. Besides this, the number of surrounding loops and helical regions is greater for the CTD, considering that it is almost 120 residues long compared with the 90-residue length of the PRRSV capsid-forming domain. This fact, taken together with the interaction seen in the PRRSV crystal packing, which is similarly mediated by helix-helix van der Waals stacking and a similar salt bridge between Arg 65 and Asp 43, suggests a common theme in helical fibre formation across the viruses in the order Nidovirales, to which PRRSV and IBV both belong. This strengthens the suggestion that this fold is commonly employed in viruses with helical nucleocapsids. Also, despite the very low sequence homology between SARS-N and IBV-N (25%), the predicted secondary structure of SARS-N for the CTD matches the observed secondary structure of IBV-N very closely (Figure 5, black topology diagram, top).
The NMR structure of the N-terminal domain of SARS-N clearly shows that the N-terminal domain is largely composed of coiled structure and interacts with RNA in solution (Huang, Yu et al. 2004). A similar solution structure by NMR of a part of the dimerization domain of SARS coronavirus reveals a similarity to the PRRSV capsid protein, as reported in that publication, but differs from this study in the arrangement of the C-terminal helix, which is the key mediator of the dimer-dimer interactions that we hypothesize are the determinants of helical nucleocapsid formation (Chang, Sue et al. 2005). The helix corresponding to helix α5 in the SARS structure packs against the helix from the same dimer, forming an asymmetric homodimer (Chang, Sue et al. 2005). The IBV and PRRSV dimers are very symmetric homodimers and have the same helix mediating the dimer-dimer interactions that are the main determinant of strand formation. It is quite likely that the absence of the residues N-terminal to the β-stranded floor allowed the C-terminal helix in SARS to move dramatically and interact with itself within the dimer. This interaction might not be possible in the context of the whole protein and the nucleocapsid, as seen in the structures of the IBV N protein and PRRSV N protein. The overall structural similarity between PRRSV and the CTD here clearly indicates that these viruses within the order Nidovirales are more similar than previously thought and hints at this architecture being a characteristic fold adopted by viruses with helical nucleocapsids.

Genome organization in coronaviruses as suggested from the structure of NTD and CTD: The NTD, with its demonstrated RNA-binding activity (Hui Fan et al.), and the clearly dimeric CTD are two highly adaptable modules on an otherwise largely flexible and possibly disordered protein (Wang, Wu et al. 2004). The two basic tethers in the NTD are possibly held alongside a CTD-mediated fibre, with the tethers grabbing onto and sequestering RNA.
This NTD-CTD-RNA superstructure possibly then packs via secondary interactions, made possible by RNA interacting with the CTD, by the CTD-CTD class II dimer-dimer interactions and possibly also by the NTD-NTD dimeric interactions, to form a highly compacted ribonucleoprotein complex. The C-terminus of the N protein is also known to interact with the CTD of the M protein, which is predominantly basic. This may occur through interactions of M with the acidic patches on the CTD-mediated fibre. These interactions possibly explain the rescue of unstable M protein mutants by compensatory mutations in the CTD region of MHV-N (Kuo and Masters 2002), indicating a strong interaction mediated in part by this region. Together, this suggests a model for genome organization wherein the CTD domains form a helical template with extending NTD RNA-grabbers that organize the genomic RNA, which is brought along by interactions of the consensus packaging signal with the M protein, which in turn nucleates along the CTD fibre by interacting with it. The CTD fibre thus serves as a structural template for the NTD-RNA complex to wind around, with intermittent interactions between M and the CTD. Once assembled, this complex is not prone to disruption by treatment with RNase A, as observed by Narayanan et al. (Narayanan, Kim et al. 2003).

Baric, R. S., G. W. Nelson, et al. (1988). "Interactions between coronavirus nucleocapsid protein and viral RNAs: implications for viral transcription." J Virol 62(11): 4280-7.
Cavanagh, D. (2003). "Severe acute respiratory syndrome vaccine development: experiences of vaccination against avian infectious bronchitis coronavirus." Avian Pathol 32(6): 567-82.
Chang, C. K., S. C. Sue, et al. (2005). "The dimer interface of the SARS coronavirus nucleocapsid protein adapts a porcine respiratory and reproductive syndrome virus-like structure." FEBS Lett 579(25): 5663-8.
Cologna, R. and B. G. Hogue (1998). "Coronavirus nucleocapsid protein. RNA interactions."
Adv Exp Med Biol 440: 355-9. Doan, D. N. and T. Dokland (2003). "Structure of the nucleocapsid protein of porcine reproductive and respiratory syndrome virus." Structure (Camb) 11(11): 1445-51. Gamble, T. R., S. Yoo, et al. (1997). "Structure of the carboxyl-terminal dimerization domain of the HIV-1 capsid protein." Science 278(5339): 849-53. He, R., A. Leeson, et al. (2003). "Activation of AP-1 signal transduction pathway by SARS coronavirus nucleocapsid protein." Biochem Biophys Res Commun 311(4): 870-6. Huang, Q., L. Yu, et al. (2004). "Structure of the N-terminal RNA-binding domain of the SARS CoV nucleocapsid protein." Biochemistry 43(20): 6059-63. Janowski, R., M. Kozak, et al. (2001). "Human cystatin C, an amyloidogenic protein, dimerizes through three-dimensional domain swapping." Nat Struct Biol 8(4): 316-20. Kuo, L. and P. S. Masters (2002). "Genetic evidence for a structural interaction between the carboxy termini of the membrane and nucleocapsid proteins of mouse hepatitis virus." J Virol 76(10): 4987-99. Lai, M. M. and D. Cavanagh (1997). "The molecular biology of coronaviruses." Adv Virus Res 48: 1-100. Leung, D. T., F. C. Tam, et al. (2004). "Antibody response of patients with severe acute respiratory syndrome (SARS) targets the viral nucleocapsid." J Infect Dis 190(2): 379-86. Luo, C., H. Luo, et al. (2004). "Nucleocapsid protein of SARS coronavirus tightly binds to human cyclophilin A." Biochem Biophys Res Commun 321(3): 557-65. Narayanan, K., K. H. Kim, et al. (2003). "Characterization of N protein self-association in coronavirus ribonucleoprotein complexes." Virus Res 98(2): 131-40. Newcomer, M. E. (2001). "Trading places." Nat Struct Biol 8(4): 282-4. Risco, C., I. M. Anton, et al. (1996). "The transmissible gastroenteritis coronavirus contains a spherical core shell consisting of M and N proteins." J Virol 70(7): 4773-7. Risco, C., M. Muntion, et al. (1998). 
"Two types of virus-related particles are found 2 1 22 during transmissible gastroenteritis virus morphogenesis." J Virol 72(5): 402231. Schelle, B., N. Karl, et al. (2005). "Selective replication of coronavirus genomes that express nucleocapsid protein." J Virol 79(11): 6620-30. Siddell, S. G. (1995). The Coronaviridae:an introduction, Plenum Press, New York, N.Y. Stohlman, S. A. and M. M. Lai (1979). "Phosphoproteins of murine hepatitis viruses." J Virol 32(2): 672-5. Surjit, M., B. Liu, et al. (2004). "The SARS coronavirus nucleocapsid protein induces actin reorganization and apoptosis in COS-1 cells in the absence of growth factors." Biochem J 383(Pt 1): 13-8. Surjit, M., B. Liu, et al. (2004). "The nucleocapsid protein of the SARS coronavirus is capable of self-association through a C-terminal 209 amino acid interaction domain." Biochem Biophys Res Commun 317(4): 1030-6. Tahara, S. M., T. A. Dietlin, et al. (1998). "Mouse hepatitis virus nucleocapsid protein as a translational effector of viral mRNAs." Adv Exp Med Biol 440: 313-8. Wang, Y., X. Wu, et al. (2004). "Low stability of nucleocapsid protein in SARS virus." Biochemistry 43(34): 11103-8. Yu, I. M., C. L. Gustafson, et al. (2005). "Recombinant severe acute respiratory syndrome (SARS) coronavirus nucleocapsid protein forms a dimer through its Cterminal domain." J Biol Chem 280(24): 23280-6. Zhao, P., J. Cao, et al. (2005). "Immune responses against SARS-coronavirus nucleocapsid protein induced by DNA vaccine." Virology 331(1): 128-35. Zhou, M. and E. W. Collisson (2000). "The amino and carboxyl domains of the infectious bronchitis virus nucleocapsid protein interact with 3' genomic RNA." Virus Res 67(1): 31-9. 
Dataset (crystal condition), X-ray source/Wavelength, Resolution, Number of reflections, Completeness, Redundancy, Rsym

Dataset: CTD
  Crystal condition: PEG 4000, 28.75% to 29.5%, pH 4.8 citrate, 0.1 M MgCl2
  X-ray source: SBC-CAT 19ID, Advanced Photon Source (Argonne)
  Space group: P2(1)2(1)2(1)
  Unit cell: a=38.389 b=65.939 c=92.306 α=90.000 β=90.000 γ=90.000
  Wavelength 0.97937 Å
    Resolution: 50-2 Å (360°, 1° oscillation)
    Number of reflections: 15139
    Completeness: 96% (68.9%)
    Redundancy: 11 (8.2)
    Rsym: 0.072 (0.28
  Wavelength 0.97951 Å
    Resolution: 50-2 Å (360°, 1° oscillation)
    Number of reflections: 28037
    Completeness: 97.9 (85.5)
    Redundancy: 5.3 (3.2)
    Rsym: 0.076 (0.35

Dataset: CTD
  Crystal condition: 30% PEG 4000, 100 mM Tris-HCl pH 8.6, 800 mM LiCl
  X-ray source: BIOCARS-14ID, Advanced Photon Source (Argonne)
  Space group: P2(1)2(1)2
  Unit cell: a=108.99 b=128.534 c=71.435 α=90 β=90.00 γ=90.00
  Wavelength 0.9000 Å
    Resolution: 50-2.2 Å (180°, 1° oscillation)
    Number of reflections: 97377
    Completeness: 99.7 (98.8)
    Redundancy: 3.4 (3.1)
    Rsym: 0.081 (0.58

Dataset: NTD
  Crystal condition: 25% PEG 4000, 100 mM MES sodium salt pH 6.2, 200 mM magnesium chloride
  X-ray source: BIOCARS-14ID, Advanced Photon Source (Argonne)
  Space group: C 1 2 1
  Unit cell: a=100.055 b=46.210 c=74.176 α=90.00 β=121.06 γ=90.00
  Wavelength 0.9000 Å
    Resolution: 50-1.3 Å (180°, 1° oscillation)
    Number of reflections: 204381
    Completeness: 87.9 (60.5)
    Redundancy: 3.3 (2.6)
    Rsym: 0.

Refinement Statistics

Parameters                   PDB 1      PDB 2      PDB
Resolution range             50-2 Å     48-2.2 Å   50-1
Number of reflections        15110      48564      6075
Rcryst                       0.238      0.236      0.21
Rfree                        0.269      0.291      0.250
Mean bond length deviation   0.005491   0.006096   0.008
Mean bond angle deviation    1.30545    1.31765    1.235

Model contents
  PDB 1: 1778 atoms, 92 water molecules
  PDB 2: 7169 atoms, 215 solvent atoms
  PDB: atom

Ramachandran statistics
  PDB 1: 94.2% most allowed, 5.8% additional allowed
  PDB 2: 91.6% most allowed, 7.6% additional allowed, 0.6% generously allowed, 0.2% disallowed (poor density)
  PDB: 886% addit 0.5 %

Values in parentheses are for the outermost resolution shells.
R_{sym} = \sum_h \sum_i |I_i(h) - \langle I(h) \rangle| / \sum_h \sum_i I_i(h)
R_{value} = \sum ||F_{obs}| - k|F_{cal}|| / \sum |F_{obs}|
Rfree is calculated based on 10% of reflections not used during the refinement.
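
The R-factor definitions given with the statistics above can be illustrated with a minimal Python sketch. The function names, the (hkl, intensity) input layout, the boolean free-set flags, and the toy numbers in the demo are all assumptions made for illustration; this is not the data-processing or refinement software used in this study, and the numbers are not data from it.

```python
from collections import defaultdict


def r_sym(observations):
    """Rsym = sum_h sum_i |I_i(h) - <I(h)>| / sum_h sum_i I_i(h).

    `observations` is an iterable of (hkl, intensity) pairs (assumed layout);
    repeated hkl keys are multiple measurements of the same reflection.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_value(f_obs, f_calc, k=1.0):
    """R = sum ||Fobs| - k|Fcal|| / sum |Fobs| over paired amplitudes."""
    numerator = sum(abs(abs(fo) - k * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags, k=1.0):
    """Return (R over working reflections, R over held-out reflections).

    `free_flags[i]` is True for reflections excluded from refinement
    (about 10% of the data in the table above).
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_value([fo for fo, _ in work], [fc for _, fc in work], k)
    r_free = r_value([fo for fo, _ in free], [fc for _, fc in free], k)
    return r_work, r_free


if __name__ == "__main__":
    # Toy values only, chosen to make the arithmetic easy to follow.
    toy_observations = [
        ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
        ((0, 1, 0), 50.0), ((0, 1, 0), 50.0),
    ]
    print("Rsym:", r_sym(toy_observations))
    print("R:", r_value([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
```

Keeping the free-set reflections out of the working sum is what makes Rfree a cross-validation statistic: it is computed with the same formula as Rcryst, but only over reflections that the refinement never used.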