Small-scale platform for high-throughput identification of proteins suitable for X-ray structure determination

Ronnie O. Frederick, Won Bae Jeon, Frank C. Vojtik, Megan A. Riters, John Kunert, Paul G. Blommel, Fred R. Blattner, Russell L. Wrobel, Craig A. Bingman, Craig S. Newman, John G. Primm, George N. Phillips Jr., John L. Markley, and Brian G. Fox

University of Wisconsin-Madison, 433 Babcock Drive, Madison, Wisconsin, USA 53706-1549, http://www.uwstructuralgenomics.org

Introduction: Small-scale cell based expression screening and preliminary identification of eukaryotic proteins suitable for structural determination

The Center for Eukaryotic Structural Genomics (CESG) has developed and currently uses a rapid, small-scale, high-throughput screen to identify cloned genes that express successfully. The genes are isolated from various eukaryotic sources such as Arabidopsis thaliana, rice, human, yeast, and mouse. So far, of over 2,500 genes that have been screened for expression on the small scale using the E. coli host strains Rosetta-2 and B834-pRARE2, ~64% are successfully expressed at a small scale (0.4 mL Terrific Broth, or TB). Of the ~1,600 cloned genes that were found to be expressed on a small scale, ~71% have been attempted on a large scale (2 L) using the methionine auxotroph E. coli host B834-pRARE (which is required for selenomethionine, 15N, or 13C/15N enrichment). Of these, ~55% have yielded a soluble fusion protein that was suitable for X-ray crystallization screening trials (see Table 1). Our present screening protocol has been designed for speed and enables positively expressing targets to be identified within 3 days (see Figures 1 and 2). We are currently developing and implementing new protocols that increase the total number of successful soluble targets going through to X-ray crystallization screening and reduce the number of targets that fail downstream (see Figure 3).
Table 1 shows that there is significant attrition of target proteins throughout the current CESG pipeline. We are currently implementing new protocols and methodologies that enable earlier screening and removal of unsuitable target proteins at the small-scale expression screening stage. The changes in the CESG pipeline design required to achieve these improvements are: cell culture production (using various E. coli host strains) and growth at a small scale, micro-scale protein production and purification, TEV cleavage (of fusion enhancers), protein detection using novel fluorescent stains, and protein solubility and mass analysis. We have successfully used these new approaches to pre-screen and purify large amounts of several different target proteins using small-scale or micro-scale cell based protein production and purification techniques. These proteins were then used successfully in X-ray crystallization screens using a Fluidigm microcircuit system.

Background: Design of expression plasmid and strategies for producing soluble target proteins

The eukaryotic genes selected by CESG are fused to an N-terminal (His)n-tagged (n = 6 or 8) maltose binding protein (MBP, which enhances solubility and expression levels), and a TEV protease cleavage site is located between the MBP and the target protein (just in front of the cloned gene segment). Transcription of the cloned genes is controlled by a T5 phage based promoter that is inducible by IPTG or lactose. The eukaryotic genes were amplified by PCR and inserted at the C-terminal end of this (His)n-tag-MBP-TEV cassette using a novel site-specific recombination methodology. The architecture of this expression plasmid allows flexibility for future design modifications of the plasmid.
For example, the (His)n-tag segment has been increased from 6 to 8 histidine repeats (this has improved downstream protein purification steps), and the MBP segment can be readily replaced with thioredoxin, glutathione S-transferase (GST), NusA, or other solubility enhancers. In addition, because all of the cloned genes are stored in inert "entry vectors", it will be easy to re-insert the genes into any future iteration of the expression plasmid.

For small-scale expression screening, we use Rosetta-2, an E. coli host strain supplemented with tRNAs for seven rare codons (supplied by the plasmid pRARE2). A second E. coli host, B834-pRARE2, is used for large-scale 2-liter cell growths. Figures 1 and 2 give an overview of the CESG pipeline workflow and the small-scale expression screening methods. The small-scale expression trials are performed in 96-well growth blocks at 25 °C for 24 hours, using either 1 mM IPTG for induction or an autoinduction method developed by Professor Studier (Novagen Overnight Express Autoinduction System™). Presently, all genes that express successfully on a small scale are scored "expression +" and recorded in a laboratory information management system (LIMS) called Sesame. These positively expressed genes are then used to transform B834-pRARE2, which is grown on a 2-liter large scale to produce protein for purification and structural studies.

The cells used for expressing target proteins for small-scale purification were Rosetta-2 cells (grown using the Novagen Overnight Express Autoinduction System™). One 96-well growth block was processed (16 times 0.5 mL of culture per target protein). For X-ray crystallization trials, ~50-200 micrograms of purified target protein is needed (10 mg/mL) in a volume range of 5-50 microliters. For protein purification, we use an automated AKTA HPLC. Fusion proteins are initially purified from ~8 ml of cell culture by a 1st IMAC capture using a 1 ml Ni affinity column (Amersham Biosciences). After desalting, fusion proteins are subjected to TEV proteolysis. After cleavage, the (His)n-MBP fusion tag and the (His)6-TEV protease are separated from the target proteins by a 2nd IMAC capture. Target proteins are desalted against buffers chosen according to the pI values of the targets. Targets are concentrated to approximately 10 mg/ml using Amicon Microcon® YM-3 or YM-10 concentrators. Final protein concentrations are determined by BCA protein assay using BSA as a standard protein.

Figure 1. Summary of the high-throughput protein production CESG pipeline. Workgroups (WG) of 96 individual eukaryotic cloned genes are processed in this system at one time. Three types of E. coli host strains (Rosetta-2, SG13009 and B834-pRARE) are used by the large-scale protein production section (LSPP) to produce proteins. Protein purification processes the cell pastes generated by LSPP using IMAC methodologies and provides the pure proteins for structural analysis by NMR spectroscopy and X-ray crystallography.

Labels in Figure 1: bioinformatics and gene selection; LIMS: Sesame; cloning: Gateway and PCR; received DNA in 96-well format (WG#, Barcode#); 96-well PCR plate; DNA stored at -80 °C in 96-well plates; 96-well expression screen and positive clones; E. coli strains B834-pRARE2 (met-), SG13009 and Rosetta-2; isotopic label (no label, or SeMet and 13C/15N); new E. coli strains being tested (MG1655 and MDS-42, E. coli strains with reduced and minimal genomes); small-scale purification; large-scale 2-L cell growth; protein purification: IMAC; NMR studies; X-ray studies.
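The sample requirement quoted above for crystallization trials (roughly 50-200 micrograms at about 10 mg/mL, in 5-50 microliters) follows from mass = concentration x volume. The sketch below is a hedged illustration of that check; the function names and the default thresholds are assumptions drawn from the figures in the text, not an established CESG acceptance rule.

```python
def protein_mass_ug(conc_mg_per_ml: float, volume_ul: float) -> float:
    """Protein mass in micrograms.

    1 mg/mL is the same as 1 microgram per microliter, so the mass in
    micrograms is simply concentration (mg/mL) times volume (microliters).
    """
    return conc_mg_per_ml * volume_ul


def enough_for_crystallization(
    conc_mg_per_ml: float,
    volume_ul: float,
    min_mass_ug: float = 50.0,    # lower end of the ~50-200 microgram range (assumed cutoff)
    min_volume_ul: float = 5.0,   # lower end of the 5-50 microliter range
    max_volume_ul: float = 50.0,  # upper end of the 5-50 microliter range
) -> bool:
    """True if a sample meets the assumed mass and volume requirements."""
    mass = protein_mass_ug(conc_mg_per_ml, volume_ul)
    return mass >= min_mass_ug and min_volume_ul <= volume_ul <= max_volume_ul


if __name__ == "__main__":
    # At the target concentration of 10 mg/mL, the smallest usable volume
    # (5 microliters) holds 50 micrograms and 20 microliters holds 200 micrograms.
    for volume in (5.0, 20.0):
        print(volume, "uL at 10 mg/mL ->", protein_mass_ug(10.0, volume), "ug")
```

At 10 mg/mL the quoted 50-200 microgram range therefore corresponds to 5-20 microliters of sample; larger volumes within the 5-50 microliter range carry proportionally more protein.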
Table 1. Summary of expression screening results.

Stage in CESG pipeline                          Total    Percent (%)
Genes isolated by PCR                           3,238    100
Genes cloned in expression plasmid              2,632    81
Small-scale expression positive                 1,691    52
2-L large-scale protein production              1,361    42
Soluble proteins                                1,181    37
Protein purification and structural analysis    350      11

Comparison of the current small-scale expression screen and the new small-scale protein expression and purification platform being implemented at CESG

Figure 2. Rapid small-scale high-throughput protein expression screening method. On day 1, Rosetta-2 cells are transformed and grown on 96 culture plates. On day 2, colonies are picked and transferred into a 96-well growth block (0.4 mL of TB media) and induced for 24 hours (with 1 mM IPTG) at 25 °C. To screen for positively expressing clones, the E. coli lysates are analyzed by SDS-PAGE on day 3. The positively expressing clones are recorded in Sesame on day 4. Large-scale growths are started with freshly transformed B834-pRARE2 cells in the following week.

Figure 3. New scheme for small-scale high-throughput protein expression screening and protein purification. On day 1, Rosetta-2 cells are transformed and grown on 96 culture plates. On day 2, single colonies are picked and transferred into a 96-well growth block (0.4 mL of PA0.5G media) and grown for 24 hours at 37 °C. To screen for positively expressing clones, 20 µl of cell culture are transferred into a second growth block containing autoinduction medium (TB+glycerol, Novagen Overnight Express). After growth at 25 °C for 24 hours, the cells are harvested and the E. coli lysates are analyzed by SDS-PAGE on day 4. IMAC chemistry is used to isolate the positively expressed proteins. Soluble target proteins are cleaved from their fusion partner (MBP) and re-purified, and those that are still soluble and well behaved are checked by mass spectrometry and then sent for X-ray crystallization trials. Large-scale growths (using freshly transformed B834-pRARE2 cells) are initiated only for those targets that show favorable results on the small scale.

Workflow labels shown in Figures 2 and 3: DNA in 96-well plates stored at -20 °C; Rosetta-2 grown on 96 ZYP0.5G culture plates; 36 cloned genes in B834-pRARE2 on Studier chemically defined plates, giving 13C, 15N and selenomethionine labeled proteins for purification; 96-well growth block with TB+glycerol media; 96-well block with PA0.5G media; 96-well block with TB+G Overnight Express media; 4 x 24 well SDS-PAGE; SESAME: GENIE module; protein purification: IMAC + TEV cleavage; mass spec and X-ray crystallization screens.

Figure 4. Chromatogram and SDS-polyacrylamide gel for the small-scale purification of mouse protein BC026994. (A) IMAC capture of His-tagged MBP fusion protein, (B) desalting of fusion protein, (C) TEV proteolysis of fusion tag, (D) isolation of target by subtractive 2nd IMAC, and (E) final desalting of target protein.

Figure 5. SDS-PAGE analysis of purified target proteins produced on the small scale.
1. ORF 09383 (9.2 µg by BCA)
2. ORF 19610 (23.6 µg by BCA, 11.0 µg by A280 nm)
3. ORF 19812 (32.5 µg by BCA, 16.7 µg by A280 nm)
4. ORF 34198 (6.3 µg by BCA)
5. BSA standard, 10 µg

Summary of small-scale purification yield

ORF          1st IMAC yield    Purity of     Purity of     Final     Final      Expected    Final yield (c),
             after desalting   fusion        target        volume    conc.      yield (b)   µg (% of
             (a) (mg)          protein (%)   protein (%)   (µl)      (µg/µl)    (mg)        expected) (e)
00605 (d)    2.5               90            95            -         -          0.27        -
09383        3.1               90            95            21        9.2        0.57        193 (33.9)
14940 (f)    4.8               90            >40           -         -          0.79        -
19610        4.3               80            70 (g)        18        23.6       0.39        198 (50.8)
19812        6.4               90            95            21        32.5       1.86        683 (36.7)
34198        5.7               70 (h)        95            10        13.3       0.65        133 (20.5)
34255 (i)    ca. 2.0 (j)       90            95            -         -          -           -

(a) From 2 ml (8 ml cell culture) of soluble fraction.
(b) Estimated yield was based upon the yield achieved in large-scale purification.
(c) Determined by BCA protein assay.
(d) Purity of target was greater than 95%; however, protein disappeared during gel filtration (1st batch) and concentration (2nd batch).
(e) Final yield / expected protein yield x 100.
(f) Target was found in the flow-through fraction after 2nd IMAC and aggregated with uncleaved fusion protein. Purification failed.
(g) Green fluorescent protein. Purified target was not homogeneous. Truncation occurred? Need to analyze MS data carefully.
(h) Major contaminant was a truncation product that migrates at the size of MBP on SDS-polyacrylamide gel.
(i) Purity 95% after gel filtration, but too little material to continue. Purification stopped after gel filtration.
(j) Yield was estimated from the SDS-polyacrylamide gel.

TEV protease cleavage analysis of targets at the small-scale using whole cell lysates

We are also using small-scale protein expression screening to evaluate the likely success of TEV cleavage. This protocol uses cell lysates without initiating IMAC based protein purification, and offers a rapid test of the success rate of TEV cleavage of target proteins. Using previously assessed target proteins, we were able to match ~90% of the successful TEV cleavages achieved at the large pipeline scale.

Figure 6. SDS-PAGE gel (4-20%) of 36 targets (soluble vs. cut with TEV). The cleaved target protein can be seen in lanes labeled 8c, 9c, 10c and 11c.

Figure 7. Alternative E. coli host strains have been used to test for positive target protein expression. For example, we have been using an E. coli deletion mutant strain, MDS42-pRARE2, to screen for expression using Novagen Overnight Express™ Autoinduction System media (supplemented with selenomethionine). Growth was at 25 °C for 40 hours, and the final OD at 600 nm reached ~15. Higher levels of expression have been seen for selected target proteins in the MDS42-pRARE2 E. coli strain (reduced genome in which insertion elements and several genes have been deleted).

Summary

We are improving our protein production system to allow earlier screening of potentially suitable protein targets by expressing and purifying proteins on the small scale. We are also developing new methodologies to increase the total number of protein targets that make their way to the final structural determination stages. These new protocols are being designed and implemented around a semi-automated, robotically based system, to increase the total number of iterations and permutations of parameters that can be varied and screened.

Acknowledgements

Cris Cowan at Promega, Madison, Wisconsin, USA.
Professor William Studier at the Brookhaven National Laboratory for assistance and advice on autoinduction media and auxotrophic E. coli strain B834.
Dr. David A. Frisch and Scarab Genomics, LLC, Madison, Wisconsin, USA for assistance with E. coli MG1655 and MDS-42 strains.
CESG is supported by the National Institute of General Medical Sciences through the Protein Structure Initiative (P50 GM 64598).