Electrophoresis 2006, 27, 2799–2813

Research Article

Large-scale identification of cytosolic mouse brain proteins by chromatographic prefractionation

Joo-Ho Shin1, Kurt Krapfenbauer1,2, Gert Lubec1

1 Department of Pediatrics, Medical University of Vienna, Vienna, Austria
2 Genomics and Proteomics Department, CNS Preclinical Research, F. Hoffmann La Roche, Basel, Switzerland

Received October 27, 2005. Revised March 6, 2006. Accepted March 8, 2006.

Proteomic studies on mouse brain protein expression still hold center stage, because a reference database for the brain proteome is needed for designing expression studies at the protein level. We therefore decided to increase the number of identified brain proteins by the use of prefractionation. To reduce the complexity of the mouse brain proteome, we applied chromatographic prefractionation (ion-exchange and hydrophobic interaction chromatography) prior to 2-DE, followed by mass spectrometric identification (2-DE MALDI-MS). We analyzed about 17 000 protein spots in cytosolic fractions of mouse brain and identified about 10 000 spots. A total of 1841 proteins showing different pI or Mr, probably representing post-translational modifications or splice variants, were products of 789 different genes. Numerous proteins were clearly identified as metabolic, antioxidant, cytoskeleton, signaling, transcription/translation, nucleic acid-binding, and proteolysis-related proteins. We additionally provided evidence for the existence of hypothetical proteins predicted from nucleic acid sequences. Moreover, observed pIs of proteins are listed, enabling localization of proteins in a gel, information that cannot be obtained from theoretical pIs in databases. The results represent the largest database of mouse brain proteins so far and provide valuable information for the design of proteomic studies in the mouse.
Keywords: Hydrophobic interaction chromatography / Ion-exchange chromatography / Mouse brain / Proteome map / Two-dimensional gel electrophoresis

DOI 10.1002/elps.200500804

Correspondence: Professor Dr. Gert Lubec, Department of Pediatrics, Medical University of Vienna, Währinger Gürtel 18–20, A-1090 Vienna, Austria
E-mail: [email protected]
Fax: +43-1-40400-3194

Abbreviations: CDD, conserved domain database; COG, clusters of orthologous groups of proteins database; DIP, database of interacting protein; DTE, dithioerythritol; HIC, hydrophobic interaction chromatography; HPs, hypothetical proteins; IEC, ion-exchange chromatography; PPI, protein–protein interaction; SMART, simple modular architecture research tool

© 2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

1 Introduction

Initial mouse genome sequences were released in 2002 [1] and subsequent comparative genomics and extensive experiments have been updating gene numbers and transcriptomes in specific tissues and at different developmental stages. The mouse genome contains about 30 000 genes, of which two-thirds are thought to be expressed in the brain. To date 32 417 genes were submitted to Mouse Genome Database (MGD, http://www.informatics.jax.org) [2], 1930 genes are listed in the expression atlas (at the RNA level) (http://brainatlas.org) and 1274 full-ORFs of transcripts expressed in the developing mouse nervous system were identified by the National Institutes of Health (NIH) Mouse Brain Molecular Anatomy Project (BMAP) [3]. The gene map will be only the first step in mapping the mouse, and high-throughput mRNA approaches for expression profiling were applied due to technical reasons instead of protein profiling. First initiatives were started to determine mouse proteome using immunochemistry and protein chemistry. The proteome encompasses all proteins that are expressed in a biological system (cell, tissue, organ) at any given time.
The emerging field of proteomics focuses on the large-scale, comprehensive characterization of the proteome. The experimental procedure would include separating and presenting all of the different proteins of an organism, detecting protein functions by characterizing each protein according to a broad spectrum of chemical and biological parameters, and matching each protein with its gene [4]. Ultimately, proteomics was recognized as one of the main directions of science in the postgenome era, and protein expression in all mammalian organs/cells will be mapped. For this purpose the current analytical technologies and methods will have to be improved; indeed, a large series of proteomic publications are identifying the same proteins again and again. The many limitations of proteomic techniques, including the separation and identification of hydrophobic and membrane proteins, very basic and very acidic proteins, proteins of high molecular weight, etc., are hampering construction of large maps [5, 6]. Therefore, prefractionation is mandatory to cope with the complexity of brain proteins. Prefractionation of cellular compartments is a useful principle [7] and chromatographic prefractionation is another option [8–14]. The principle of the chromatographic step for prefractionation is to apply a third dimension in protein separation, in addition to the two used in 2-DE (pI and Mr). Another objective is to allow loading of a larger amount of a fraction of the proteins and to confine highly abundant proteins to one of the fractions [11].

It was the aim of this study to generate a mouse brain protein reference database by the use of DEAE-ion exchange chromatography (IEC) and hydrophobic interaction chromatography (HIC) followed by 2-DE MALDI-MS analysis, independent of antibody availability and specificity.
This database may serve for the design of protein expression studies in mouse brain, identifying 1841 proteins that were the products of 789 different genes covering several protein classes and cascades. The many described proteins can be easily found on the maps, so the neurochemist is given the analytical information and is not left to find experimental conditions for the proteomic analysis.

2 Materials and methods

2.1 Animals

Four FVB/N mice, about 3 months old, kept in a well-controlled (humidity, temperature, and light/dark cycle) environment with access to food and water ad libitum, were used for the experiments. Animals were sacrificed by decapitation and whole brains were dissected at −20 °C and kept at −80 °C until the time of analysis. The freezing chain was never interrupted until use. Each mouse brain was used independently for all the experiments and the procedures were performed four times in all.

2.2 Sample preparation

Four whole brain tissues were individually suspended in sucrose buffer consisting of 20 mM HEPES (pH 7.5), 320 mM sucrose, 1 mM EDTA, 5 mM dithioerythritol (DTE), and 1 mg/mL of a mixture of protease inhibitors (1 mM PMSF and 1 tablet Complete™ (Art. No 1697498, Boehringer Mannheim) per 50 mL of suspension buffer) and phosphatase inhibitors (0.2 mM Na3VO3 and 1 mM NaF). The suspension was homogenized with ten strokes using a glass/Teflon homogenizer after Potter. The suspension was centrifuged at 800 × g for 10 min at 4 °C to sediment nonsuspended material. The supernatant was then centrifuged at 10 000 × g for 15 min to obtain the enriched mitochondrial fraction, and the resulting supernatant was centrifuged at 100 000 × g for 1 h to generate a pellet containing the enriched microsomal fraction and a supernatant representing the cytosolic fraction. The supernatants were centrifuged at 100 000 × g for 1 h to sediment undissolved material. Cytosolic proteins from four mouse brains were desalted and concentrated with HiTrap™ Desalting (Amersham Biosciences). The protein concentration in the supernatant was determined by the Bradford reaction [15].

2.3 IEC

Proteins were applied onto an anion-exchanger column (Art. Nr. 08802, DEAE, 5PW-Glass, 8.0 mm id × 7.5 cm L, Tosoh Biosep), equilibrated with 25 mM Tris buffer, pH 8.8. The column was washed twice with 20 mL of the same buffer, and 8 mg of total proteins from the cytosol (dissolved in sample buffer consisting of 25 mM Tris, pH 8.8) were eluted with a linear gradient of increasing salt concentration from 0 to 2 M NaCl in 25 mM Tris-HCl buffer (pH 8.8). Fractions of 0.5 mL were collected and pooled according to the elution profile. Six pools (DFT, D1–4, and DLF) were formed. Proteins were concentrated, desalted by RP chromatography (according to the manufacturer’s protocol, Art. Nr. 1–1159–05, Poros50 R2, Applied Biosystems) and analyzed by IEF and 2-D PAGE.

2.4 HIC

Proteins were applied onto HIC columns (Tosoh Biosep: Analytical/Semi-Prep column, TSK-Gel Phenyl-5PW, Art. 07573, 7.5 cm × 7.5 mm id, 10 µm), equilibrated with 25 mM Na2HPO4 (pH 7.0; 1 mM EDTA; 0.5 mM DTE; 1× Complete). The column was washed twice with 20 mL of the same buffer (standard flow rate 0.5–1.0 mL/min, max flow rate 1.2 mL/min, pH 7.0), and 8 mg of total proteins from the cytosol (dissolved in a 25 mM Na2HPO4 sample buffer) were eluted with a linear gradient of decreasing salt concentration from 2 to 0 M ammonium sulfate dissolved in 25 mM Na2HPO4 (pH 7.0; 1 mM EDTA; 0.5 mM DTE; 1× Complete). Fractions of 0.5 mL were collected and pooled according to the elution profile.
Five pools (HFT, H1–3, and HLF) were formed. The proteins were concentrated and desalted by RP chromatography according to the manufacturer’s protocol (Art. Nr. 1–1159–05, Poros50 R2, Applied Biosystems) and analyzed by IEF and 2-D PAGE as described in Section 2.5.

2.5 2-DE

Eleven fractions from IEC (six pools) and HIC (five pools) were desalted by using membrane filter tubes (Art. No. UFV4BGC25, 10 000 NMWL, Biomax-10 membrane, Millipore, Bedford, MA, USA) and 1.0 mg aliquots were applied on immobilized pH 3–10 nonlinear gradient strips (Amersham Pharmacia Biotechnology, Uppsala, Sweden) at both the basic and acidic ends of the strips. Proteins were focused at 200 V, after which the voltage was gradually increased to 5000 V at the rate of 2 V/min (approximately 180 000 kVh). Focusing was continued at 5000 V for 24 h. The second-dimensional separation was performed on 12% homogeneous polyacrylamide gels (Serva, Heidelberg, Germany). The gels (180 × 200 × 1.5 mm) were run at 40 mA per gel in an ISODALT apparatus (Hoefer Scientific Instruments, San Francisco, CA, USA) accommodating ten gels. After protein fixation with 50% v/v methanol containing 5% v/v phosphoric acid for 12 h, the gels were stained with colloidal CBB (Novex, San Diego, CA, USA) for 24 h. Molecular masses were determined by running standard protein markers (Gibco, Basel, Switzerland), covering the range of 10–200 kDa. pI values were used as given by the supplier of the IPG strips. Gels were destained with H2O and scanned with ImageScanner (Amersham Biosciences, Uppsala, Sweden). Images were processed using Photoshop (Adobe) and PowerPoint (Microsoft) software.

2.6 MALDI-TOF MS

MALDI-TOF MS analysis was performed as described in [16] with minor modifications. Briefly, spots were excised with a spot picker, placed into 96-well microtiter plates, destained with 100 µL of 30% v/v ACN in 0.1 M ammonium bicarbonate and dried in a Speedvac evaporator.
The dried gel pieces were rehydrated with 5 µL of 3 mM Tris-HCl (pH 8.8), containing 50 ng of trypsin (Promega, Madison, WI, USA). After 16 h digestion at room temperature (RT), 5 µL of water was added and samples were kept at RT for 10 min. Four microliters of 50% ACN containing 0.3% TFA was added, and the content was centrifuged for 1 min and vortexed for 20 min. The standard peptides desArg-bradykinin (Sigma, St. Louis, MO, USA; 904.4681 Da) and adrenocorticotropic hormone fragment 18–39 (Sigma; 2465.1989 Da) were added. The samples were applied with a CyBi™-Well sample processor (CyBio AG): 1.5 µL of the peptide mixture was applied simultaneously with 1 µL of matrix, consisting of a saturated solution of α-cyano-4-hydroxycinnamic acid in 50% ACN containing 0.1% TFA. The samples were analyzed by a TOF Reflex III™ mass spectrometer (Bruker Daltonics, Bremen, Germany) equipped with a reflector and delayed extraction. An accelerating voltage of 20 kV was used. The protein search was performed with in-house software, developed by Hoffmann LaRoche (Basel, Switzerland), which is similar to the Peptident software on the ExPASy server (http://expasy.hcuge.ch/sprot/peptident.html) [17]. Monoisotopic peptide masses were compared to the theoretical peptide masses of all available proteins from all species from the Swiss-Prot (http://www.expasy.ch) and PIR (http://pir.georgetown.edu/) databases. A mass tolerance of 0.0025% was allowed. Unmatched peptides or miscleavage sites were not considered. Spectra were analyzed and protein sequence databases were searched using the in-house programs Fragment 21 and MSROFIT, respectively (Roche, Basel, Switzerland).
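To make the 0.0025% mass tolerance concrete, the following minimal sketch shows how observed monoisotopic masses might be matched against theoretical peptide masses. It is not the in-house Roche software: the function name `match_peptides` and the observed masses are hypothetical, and the two calibrant masses from the text are used only as stand-ins for a theoretical peptide list.

```python
def match_peptides(observed, theoretical, tolerance_percent=0.0025):
    """Pair each observed mass with every theoretical mass lying within
    a relative tolerance (given in percent of the theoretical mass)."""
    tolerance = tolerance_percent / 100.0
    matches = []
    for obs in observed:
        for theo in theoretical:
            # Relative window: 0.0025% of 904.4681 Da is about 0.023 Da.
            if abs(obs - theo) <= tolerance * theo:
                matches.append((obs, theo))
    return matches


# Masses of the two standard peptides named in the text (Da), used here
# only as an illustrative "theoretical" list.
theoretical_masses = [904.4681, 2465.1989]

# Hypothetical observed masses (assumed values, not measured data).
observed_masses = [904.470, 2465.30, 1500.000]

matched = match_peptides(observed_masses, theoretical_masses)
print(matched)
```

In this toy input only the first observed mass falls inside the window; the second deviates by about 0.1 Da, which exceeds the roughly 0.06 Da allowed at 2465 Da.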
The algorithm used by the in-house software to determine the probability of an assignment (pMism, −10 log(P), where P is the probability that the match is a random event) for a given MS spectrum has been published, and this scoring function resulted in ca. 0.02% false-positive hits [17]. The algorithm for detection and evaluation of peptide mass peaks in MS spectra is described below.

2.6.1 Baseline correction

The baseline of the MALDI-MS spectrum was found by splitting the spectrum into sequential mass segments of a 0.05 mass range. For each of these segments we calculated a robust linear fit [18] and derived the slope and offset and their respective errors. The baseline was then approximated by cubic spline interpolation between the midpoints of the segments, and the baseline was subtracted from the spectrum.

2.6.2 Peak detection and isotope distribution fit

After baseline correction, the maximum y-coordinate was taken as the starting point for peak fitting. A standard implementation of the Levenberg–Marquardt algorithm [18] was used to fit the isotope distribution of an average peptide at a given mass (calculated using the algorithm of Rockwood et al. [19]), which is parameterized by the monoisotopic mass position, the instrument resolution, and the peak height. The fit was characterized by the usual fit quality estimates (chi-square statistics).

2.6.3 Subtraction of fit

From the fitted isotope distribution parameters we calculated the fit isotope distribution multiplied by a safety margin of 1.2 and subtracted the fit from the spectrum. The procedure was restarted and looped until the desired number of monoisotopic masses had been found. The algorithm was implemented on a standard personal computer.
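The loop described in Sections 2.6.1 to 2.6.3 (estimate a baseline per segment, subtract it, take the maximum, fit, subtract the fit times 1.2, repeat) can be sketched in a few lines. The sketch below is a deliberately simplified stand-in, not the published implementation: it uses the segment median instead of a robust linear fit, linear instead of cubic spline interpolation, and a single Gaussian of fixed width instead of a Levenberg–Marquardt fit of an isotope distribution. All function names and the synthetic spectrum are hypothetical.

```python
import math
from statistics import median


def _interpolate(x, knots):
    """Linear interpolation between (x, y) knots, clamped at both ends."""
    if x <= knots[0][0]:
        return knots[0][1]
    if x >= knots[-1][0]:
        return knots[-1][1]
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return knots[-1][1]


def estimate_baseline(mz, intensity, segment_width):
    """Simplified baseline: one median level per sequential mass segment,
    interpolated linearly between the segment midpoints (assumption: the
    paper uses robust linear fits and a cubic spline instead)."""
    start = mz[0]
    segments = {}
    for x, y in zip(mz, intensity):
        index = int((x - start) // segment_width)
        segments.setdefault(index, []).append((x, y))
    knots = []
    for index in sorted(segments):
        points = segments[index]
        midpoint = sum(x for x, _ in points) / len(points)
        knots.append((midpoint, median(y for _, y in points)))
    return [_interpolate(x, knots) for x in mz]


def pick_peaks(mz, corrected, n_peaks, sigma, safety_margin=1.2):
    """Repeatedly take the highest point, model it as a Gaussian of fixed
    width sigma (assumption), and subtract the model times the margin."""
    residual = list(corrected)
    peaks = []
    for _ in range(n_peaks):
        top = max(range(len(residual)), key=residual.__getitem__)
        center, height = mz[top], residual[top]
        peaks.append((center, height))
        for j, x in enumerate(mz):
            model = height * math.exp(-0.5 * ((x - center) / sigma) ** 2)
            residual[j] -= safety_margin * model
    return peaks


# Synthetic spectrum (assumed values): flat offset of 5 plus two peaks.
mz_axis = [float(i) for i in range(200)]
spectrum = [
    5.0
    + 100.0 * math.exp(-0.5 * (x - 50.0) ** 2)
    + 60.0 * math.exp(-0.5 * (x - 120.0) ** 2)
    for x in mz_axis
]
baseline = estimate_baseline(mz_axis, spectrum, segment_width=20.0)
corrected = [y - b for y, b in zip(spectrum, baseline)]
found = pick_peaks(mz_axis, corrected, n_peaks=2, sigma=1.0)
print(found)
```

The safety margin greater than 1 over-subtracts each fitted peak, so residual shoulders of an already detected peak are less likely to be picked again as a new maximum on the next pass.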
2.7 Computational analysis of hypothetical proteins

Based on MALDI-MS analysis, we identified hypothetical proteins (HPs) and their sequences were compared against the Swiss-Prot database (http://www.expasy.org/sprot) [20], the conserved domain database (CDD) (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml) [21], the clusters of orthologous groups of proteins database (COG) (http://www.ncbi.nlm.nih.gov/COG) [22] using their respective search tools, simple modular architecture research tool (SMART) (http://smart.embl-heidelberg.de) [23], protein families database (Pfam) (http://pfam.cgb.ki.se) [24], and against the NCBI protein database (http://www.ncbi.nlm.nih.gov) [25]. STRING (http://www.bork.embl-heidelberg.de/STRING) [26] provided clues for functional prediction of HPs based on direct (physical) and indirect (functional) associations derived from genomic context, high-throughput experiments, conserved coexpression and previous knowledge. Protein–protein interaction (PPI) data were obtained from the Database of Interacting Proteins (DIP) (http://dip.doe-mbi.ucla.edu) [27]. Finally, the recently enabled global search of all NCBI databases (http://www.ncbi.nlm.nih.gov/Entrez) was performed to ensure the complete retrieval of all the available information. All findings were reassessed manually.

Figure 1. Scheme of procedure. Mouse brain cytosolic protein was extracted and fractionated by ultracentrifugation followed by chromatographic prefractionation. Each fraction was used for 2-DE as described in Section 2.

3 Results

Identification rate was about 57.4% (9613/16 741 spots) of the analyzed spots.
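The identification rate quoted above can be rechecked from the per-mouse spot counts reported in Table 1. The short sketch below only redoes that arithmetic; the variable and function names are illustrative and not part of the original analysis.

```python
# Spots analyzed and identified per mouse, as listed in Table 1.
spots = {
    "F1": (4777, 2714),
    "M1": (4621, 2662),
    "M2": (4498, 2581),
    "M3": (2845, 1656),
}


def identification_rate(analyzed, identified):
    """Percentage of analyzed spots that yielded an identification."""
    return 100.0 * identified / analyzed


per_mouse = {
    mouse: round(identification_rate(analyzed, identified), 1)
    for mouse, (analyzed, identified) in spots.items()
}
total_analyzed = sum(analyzed for analyzed, _ in spots.values())
total_identified = sum(identified for _, identified in spots.values())
overall = round(identification_rate(total_analyzed, total_identified), 1)

# Average number of distinct spots (different pI or Mr) per gene product.
spots_per_gene = round(1841 / 789, 1)

print(per_mouse, total_analyzed, total_identified, overall, spots_per_gene)
```

The totals and rates reproduce the figures given in Table 1 (16 741 analyzed, 9613 identified, 57.4% overall, 2.3 spots per protein).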
Nonidentification may have been due to protein modifications, miscleavages, missing entries in databases, errors in databases, unknown genes, or overlapping spots that did not allow confident identification.

3.1 Large-scale identification of mouse brain proteins

Total protein from mouse brain was separated into three enriched fractions (microsome, mitochondria, and cytosol) by ultracentrifugation (Fig. 1). Because the amounts of microsomal and mitochondrial proteins were insufficient for further chromatographic prefractionation, these fractions were set aside and only the cytosolic fraction of the brain was used for chromatographic prefractionation. Following chromatography (Fig. 2a and b), fractions were desalted and applied to wide-range pH 3–10 NL IPG strips. 2-DE separation was performed and protein spots were visualized by staining with colloidal CBB (Fig. 2c and d, see Supplementary Figure). The spots in 2-DE gels were selected randomly with the goal of detecting as many new proteins as possible, and PMF was used to identify visualized spots. As summarized in Table 1, about 17 000 spots (IEC: 12 600 spots; HIC: 4400 spots) in 42 gels generated from individual experiments were analyzed and resulted in the identification of about 10 000 proteins. Because numerous identified proteins overlapped between fractions, this led to the detection of 1841 proteins showing different pI or Mr, probably due to post-translational modifications or the presence of splice variants, which were the products of 789 different genes (see Supplementary Table 1). Of the identified proteins in mouse brain, 31.3% (247 of 789) were classified as metabolic enzymes or their subunits. Chromatographic prefractionation enabled detection of low-abundant proteins (LAPs) such as signaling components and, therefore, signaling proteins (136 of 789, 17.2%) were observed as the second largest class.
Other proteins identified were assigned as cytoskeleton proteins (74 of 789, 9.4%), chaperone proteins (41 of 789, 5.2%), nucleic acid-binding proteins (including transcriptional/translational factors) (46 of 789, 5.8%), antioxidant proteins (20 of 789, 2.5%), proteolysis components (47 of 789, 6.0%), neuronal proteins (13 of 789, 1.6%), and miscellaneous proteins (136 of 789, 17.2%) (Fig. 3a). In addition, representative proteins associating with neuronal or glial function were classified (see Supplementary Table 2). In addition, we investigated the localization of genes encoding the 789 different proteins. At least 19 genes from X chromosome (2.4%, 19/789) were expressed in mouse brain and genes on chromosome 11 were highly expressed (9.6%, 76/789) (Fig. 3b). Furthermore, theoretical values of pI and Mr were compared with observed values. Scatter-plot analysis showed the confident correlation between pI/Mr values (coefficient R² = 0.72 and 0.89, respectively) (Figs. 3c and d). The existence of proteins represented by several spots

Table 1. Summary of spot numbers analyzed and identified in this study

| Contents | F1 | M1 | M2 | M3 |
|---|---|---|---|---|
| Number of gels, IEC fraction | 6 | 6 | 6 | 6 |
| Number of gels, HIC fraction | 3 | 5 | 5 | 5 |
| Number of spots analyzed | 4 777 | 4 621 | 4 498 | 2 845 |
| Number of spots identified | 2 714 | 2 662 | 2 581 | 1 656 |
| Identification rate (%) | 56.8 | 57.6 | 57.4 | 58.2 |

| Summary | Value |
|---|---|
| Total number of spots analyzed | 16 741 |
| Total number of spots identified | 9 613 |
| Identified spots per mouse | 2 403 |
| Total number of genes encoding proteins | 789 |
| Total number of spots showing different pI or Mr | 1 841 |
| Number of putative isoform, post-translationally modified, or fragmented spots per protein | 2.3 |

Columns F1 and M1–M3 are the mice used a). Proteins fractionated by chromatography were separated by 2-DE and identified by MALDI-TOF MS, following in-gel digestion with trypsin. The spots representing the identified proteins are indicated in Supplementary Figure 1 and are designated with their accession numbers (http://us.expasy.org/).
IEC, ion exchange chromatography; HIC, hydrophobic interaction chromatography. a) F1, female mouse number 1; M1–3, male mouse numbers 1–3.

Figure 2. Representative elution profiles of proteins eluted from the IEC (a) and HIC (b) columns. Soluble protein fractions were applied onto the IEC and HIC columns and proteins were eluted as described under “experimental protocol.” Parts of the elution profiles which include protein peaks are shown. Numbers below the peaks indicate the pools formed and followed by 2-DE. With UV monitoring at 280 nm, fractions DFT–DLF (DEAE chromatography) (c) and HFT–HLF (HIC) (d) were eluted with an increasing and a decreasing salt concentration, respectively; FT, flow through; LF, last fraction. (e) All visualized spots were analyzed and 789 different gene products were identified (see Supplementary Table 1 and Supplementary Figures). Most proteins were predominantly identified in the IEC fraction and only 48 proteins were exclusively identified in HIC.

Table 2. Computational analysis and putative functions of HPs identified in mouse brain Accession number and protein name (gene) UniGene Clustera) (Protein similarities%/number of amino acid) Putative domainsb) PSI-BLASTc) search (% identity to functionally known protein) Cytoskeleton-related proteins Q8BRR3 A. thaliana ref:NP_567496.1 – LET1 like protein (41.4/476) Unknown EST, full H. sapiens ref:NP_071938.1 – HP FLJ21988 (87.4/476) insert sequence (Narfl) S. cerevisiae sp:P23503 – Nuclear architecture related protein 1 (31.5/417) Pfam02906, 151% identity to Fe_hyd_lg_C, NARF (nuclear iron only hydro- prelamin A recognigenase large tion factor) protein subunit, (H. sapiens) C-terminal domain.
Q922H6 HP BC008103 (BC008103) A. thaliana ref:NP_195436.1 – tubulin-like protein (28.7/548) D. melanogaster ref:NP_523435.1 – misato; lethal (29.8/549) H. sapiens ref:NP_060586.1 – misato (H. sapiens) (78.2/554) S. cerevisiae pir:S55093 – HP YMR211w (24.4/242) Q9WV92 Band 4.1-like protein 3 (4.1b) (Epb41l3) CD00836, C. elegans ref:NP_493600.1 – Band 4.1 protein like (36.34/418) FERM_C D. melanogaster pir:T13800 – T13800 coracle gene protein (59.4/356) M. musculus ref:NP_038841.1 – erythrocyte protein band 4.1-like 3; DAL1P (88.1/510) R. norvegicus ref:NP_446379.1 – erythrocyte protein band 4.1-like 3 (85.7/1087) CD00286, Tubulin/FtsZ, Tubulin/FtsZ / Cytoplasm 28% identity to tubulin-like protein 48% identity to neuronal protein 4.1 (M. musculus) Neuronal protein 4.1 is necessary for nuclear assembly Detoxification-related proteins Q8BZA2 A. thaliana pir:S57611 – S57611 probable quin oxidoreductase (34.6/322) Hypothetical zincC. elegans ref:NP_496334.1 – Zinc-binding dehydrogenases (45.3/346) containing alcohol E. coli pir:D64897 – D64897 probable quinone oxidoreductase (37.8/345) dehydrogenase super- M. musculus ref:NP_080244.1 – RIKEN cDNA 2510002C21 (40.9/322) family containing S. cerevisiae pir:S58197 – S58197 probable membrane protein YML131w protein, full insert (26.0/324) sequence (9630043F13Rik) Pfam00107, ADH_zinc_N, zinc-binding dehydrogenase. 45% identity to zinc-binding dehydrogenase (C. elegans) Q8CHP8 RIKEN cDNA 1700012G19 (1700012G19Rik) A. thaliana ref:NP_199587.1 – 4-nitrophenylphosphatase-like protein (35.9/223) C. elegans ref:NP_504511.1 – C53A3.2.p (34.6/204) E. coli ref:NP_286389.1 – N-acetylglucosamine metabolism (33.2/189) H. sapiens ref:NP_064711.1 – HP dJ37E16.5 (46.2/223) S. cerevisiae pir:S67800 – S67800 aryl phosphatase (28.9/218) Pfam00702, hydrolase, haloacid dehalogenase-like hydrolase 45% identity to pyridoxal phosphate phosphatase (M. 
musculus) Q9CYW4 http://ca.expasy.org/ sprot/userman.html – DE_line Hypothetical Haloacid dehalogenase/epoxide hydrolase family containing protein, full insert sequence (bn189g18.4.1) (2810435D12) A. thaliana ref:NP_199286.1 – Dreg-2 like protein (33/206) C. elegans pir:T23197 T23197 HP K01G5.1 (28/208) D. melanogaster sp:Q94915 – Rhythmically expressed gene 2 protein (28/229) H. sapiens ref:NP_112496.1 – HP MGC12904 (77/249) S. cerevisiae pir:S53060 – HP YMR130w (27/216) Pfam00702, Hydrolase, haloacid dehalogenase-like hydrolase 27% identity to 2-haloalkanoic acid dehalogenase Q9DCF2 Hypothetical HAD-like structure containing protein, full insert sequence (0610039H12) A. thaliana ref:NP_199587.1 – 4-nitrophenylphosphatase-like protein (25.8/216) C. elegans ref:NP_504597.1 – K08B12.3.p (51.6/251) E. coli ref:NP_286389.1 – N-acetylglucosamine metabolism (24.3/220) H. sapiens ref:NP_115500.1 – HP DKFZp564D1378 (89.2/259) Pfam00702, hydrolase, haloacid dehalogenase-like hydrolase 38% identity to phospholysine phosphohistidine inorganic pyrophosphate phosphatase © 2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.electrophoresis-journal.com 2806 J.-H. Shin et al. Electrophoresis 2006, 27, 2799–2813 Table 2. Continued Accession number and protein name (gene) UniGene Clustera) (Protein similarities%/number of amino acid) Putative domainsb) PSI-BLASTc) search (% identity to functionally known protein) Metabolism-related proteins Q6KAP8 MFLJ00216 protein [Fragment] (mFLJ00216) A. thaliana ref:NP_181047.1 – putative UDP-N-acetylglucosamine pyrophosphorylase (40.4/482) C. elegans pir:T19764 – T19764 HP C36A4.4 (36.7/466) H. sapiens sp:Q16222 – UDP-N-acetylhexosamine pyrophosphorylase (55.5/492) S. cerevisiae pir:S50738 – QRI1 protein (38.0/477) Pfam01704, 51% identity to UDPN-acteylglucosamine UDPGP, UTP-glucose-1- pyrophosphorylase 1 phosphate uridylyl- (H. sapiens) transferase Q8BGB7 9330118I12 product:E-1 ENZYME homolog (9330118I12Rik) C. 
elegans ref:NP_505997.1 – E-1 enzyme (38.9/267) H. sapiens ref:NP_067027.1 – E-1 enzyme (100/261) S. cerevisiae pir:S30843 – UTR4 protein (30.7/258) COG4229, 39% identity to COG4229, preenolase-phosphatase dicted enolase(5M415) (C. elegans) phosphatase [Energy production and conversion] Q80TB3 MKIAA1612 protein [Fragment] (4930415J21Rik) C. elegans ref:NP_495088.1 – C17G10.1.p (28.4/261) H. sapiens ref:NP_060703.1 – HP FLJ10826 (73.3/362) S. cerevisiae pir:S50552 – HP YER049w (32.0/150) Smart00702, P4Hc, prolyl 4-hydroxylase alpha subunit homologs Q8BK10 Weakly similar to Putative N-acetylglucosamine-6-phosphate deacetylase (5730457F11Rik) C. elegans ref:NP_498990.1 – N-acetyl-glucosamine-6-phosphate deacetylase (52.4/395) E. coli ref:NP_312042.1 – N-acetylgalactosamine-6-phosphate deacetylase (36.6/381) CD00854, NagA, 37% identity to N-acetylglucosa- CaNAG2 mine-6-phosphate (Candida albicans) deacetylase Q8CAA7 PMMLP homolog (Pgm2l1) C. elegans ref:NP_499741.1 – Phosphoglucomutase and phosphomannomutase phosphoserine (44.0/599) H. sapiens ref:NP_060760.1 – HP FLJ10983 (60.2/592) S. cerevisiae pir:S54585 – HP YMR278w (39.8/563) Pfam02878 or 60% identity to 02879, phosphoglucomutase PGM_PMM, phos- 2 (H. sapiens) phoglucomutase/ phosphomannomutase, alpha/ beta/alpha domain Q8VCR7 CCG1-interacting factor B (Cib) C. elegans ref:NP_500027.1 – Y55F3AM.10.p (34.7/192) H. sapiens ref:NP_116139.1 – HP MGC15429 (88.5/208) COG0596, MhpC, 47% identity to Dorz1 predicted hydro- (M. musculus) lases or acyltransferases Q8VED9 RIKEN cDNA 1110067D22 (1110067D22Rik) C. elegans ref:NP_501571.1 – galactoside-binding lectin like (27.7/134) H. sapiens ref:NP_054900.1 – HSPC159 protein (98.8/172) M. musculus sp:O54891 – Galectin-6 (37.1/132) R. norvegicus pir:A55932 – Galectin-5 (35.5/137) CD00070, GLECT, 35% identity to galectin/galacto- galectin-8 se-binding lectin (H. sapiens) Q9DB32 Unknown EST, full insert sequence (Haghl) A. 
thaliana ref:NP_187696.1 – hydroxyacylglutathione hydrolase cytoplasmic (41.5/256) C. elegans ref:NP_496556.1 – Metallo-beta-lactamase superfamily (42.8/219) E. coli ref:NP_285900.1 – probable hydroxyacylglutathione hydrolase (35.9/230) H. sapiens ref:NP_115680.1 – HP MGC2605 (82.4/278) M. musculus ref:NP_064383.1 – Brain protein 17 (42.3/226) R. norvegicus ref:NP_203500.1 – Hydroxyacyl glutathione hydrolase; glyoxalase II; round spermatid protein RSP29 (49.2/256) S. cerevisiae pir:S70130 – HP YDR272w (29.4/226) Pfam00753, lactamase_B, metallo-betalactamase superfamily COG0491, GloB, Zn-dependent hydrolases, including glyoxylases © 2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 30% identity to Q822Y6, glucose-6-phosphate 1-dehydrogenase 50% identity to hydroxyacyl-glutathione hydrolase (glyoxalase II) (M. musculus) www.electrophoresis-journal.com Electrophoresis 2006, 27, 2799–2813 Proteomics and 2-DE 2807 Table 2. Continued Accession number and protein name (gene) UniGene Clustera) (Protein similarities%/number of amino acid) Proteolysis-related proteins M. musculus ref:NP_077795.1 – HP, MGC: 7513; HP MGC7513 (100/103) Q8BGR9 Weakly similar to Putative ubiquitin carboxyl-terminal hydrolase C6G9.08 (BC002236) Q9CUT0 H. sapiens ref:NP_078859.1 – HP FLJ23142 (77/217) HP, full insert sequence. [Fragment] (4833415E20Rik) Transcrition/translation-related proteins A. thaliana ref:NP_176303.1 – HP (30.3/220) Q8BNW9 C. elegans ref:NP_491322.1 – R12E2.1.p (33.7/160) Protein KIAA0711 D. melanogaster pir:A45773 – A45773 kelch protein, long form (31.2/187) (Kiaa0711) M. musculus sp:P11087 – Collagen alpha 1 (26.4/670) R. norvegicus ref:NP_037061.1 – Procollagen II alpha 1 (26.7/673) S. cerevisiae pir:S48478 – Glucan 1,4-alpha –glucosidase (21.3/462) H. sapiens ref:NP_065728.1 – HSCARG protein (82.7/306) Q8BVF0 Similar to HSCARG (M. 
musculus 10 days neonate cerebellum cDNA, RIKEN full-length enriched library, clone:6530440K01 product) (UN)

Putative domains b) and PSI-BLAST c) search (% identity to functionally known protein):

Putative domains: CD00196, UBQ, ubiquitin homologs; Smart00577, CPDc, catalytic domain of CTD-like phosphatases. PSI-BLAST: 36% identity to ubiquitin-specific protease 14 (tRNA-guanine transglycosylase) [Xenopus tropicalis]
Putative domains: Pfam03577, peptidase_U34, peptidase family U34. PSI-BLAST: 52% identity to secernin 1 (H. sapiens)
Putative domains: Smart00225, BTB, Broad-Complex, Tramtrack and Bric a brac, also known as POZ (poxvirus and zinc finger) domain; Smart00612, Kelch, Kelch domain. PSI-BLAST: 32% identity to DRE1 protein (H. sapiens)
Putative domains: Pfam05368, NmrA, NmrA-like family. PSI-BLAST: 30% identity to NADPH-dependent reductase

Q8CDF9 Hypothetical NOL1/NOP2/SUN family containing protein, full insert sequence (4932443I04Rik)
UniGene cluster a): A. thaliana pir:T06106 – HP T5J1 7 170 (32.9/655); C. elegans pir:T33803 – HP W07E6.1 (23.2/437); E. coli sp:P76273 – HP yebU (27.2/291); H. sapiens sp:P46087 – Proliferating-cell nucleolar antigen P120 (23.6/456); M. musculus pir:A48998 – Nucleolar protein p120 (22.9/499); S. cerevisiae sp:P38205 – Putative methyltransferase NCL1 (35.4/716)
Putative domains: Pfam01189, Nol1_Nop2_Sun, NOL1/NOP2/sun family
PSI-BLAST: 24% identity to Proliferating-cell nucleolar antigen P120 (H. sapiens)

Q9CQT1 Hypothetical Initiation factor 2B containing protein, full insert sequence (2410018C20Rik)
UniGene cluster a): A. thaliana ref:NP_027726.1 – Putative translation initiation factor eIF-2B alpha subunit (58.3/345); C. elegans ref:NP_506714.1 – Initiation factor 2 subunit (41.5/347); D. melanogaster ref:NP_570020.1 – eIF2B-beta gene product (24/273); H. sapiens pir:T08757 – Probable translation initiation factor eIF-2B delta chain (25.4/269)
Putative domains: Pfam01008, IF-2B, initiation factor 2 subunit family
PSI-BLAST: 42% identity to translation initiation factor (40.9 kDa) (5P563) (C. elegans)

Signaling-related proteins

Q8C935 Inferred: TOM1 (M. musculus), full insert sequence (Tom1l2)
UniGene cluster a): A. thaliana ref:NP_564138.1 – Expressed protein (32.5/288); C. elegans ref:NP_508777.1 – C07A12.7a.p (47.1/355); D. melanogaster pdb:1DVP – A Chain A, Crystal Structure Of The VHS And FYVE Tandem Domains Of Hrs, A Protein Involved In Membrane Trafficking And Signal Transduction (35.4/143); H. sapiens ref:NP_005479.1 – Target of myb1 (61.7/507); M. musculus ref:NP_035752.1 – Target of myb1 homolog (60.9/507); R. norvegicus ref:NP_062260.1 – Hrs (36.0/136); S. cerevisiae pir:S48950 – HP YHR108w (24.0/379)
Putative domains: CD00197, VHS, VHS domain has a superhelical structure similar to structure of ARM repeats
PSI-BLAST: 69% identity to TOM1 (H. sapiens)

Q9D934 Similar to PEFLIN (2600002E23Rik)
UniGene cluster a): A. thaliana ref:NP_180317.1 – putative calcium binding protein (31.0/166); C. elegans ref:NP_491447.1 – Calcium binding protein (46.5/142); D. melanogaster ref:NP_477047.1 – CalpA-P1 (27.1/129); H. sapiens ref:NP_036524.1 – Peflin (84.2/275); M. musculus pdb:1HQV – A Chain A, Structure Of Apoptosis-Linked Protein Alg-2 (39.6/183); R. norvegicus pir:S38361 – Calpain (29.5/129); S. cerevisiae sp:P53238 – Hypothetical 38.4 kDa protein in MUP1-SPR3 intergenic region (25.5/157)
Putative domains: CD00051, EFh, EF-hand, calcium binding motif
PSI-BLAST: 35% identity to sorcin (M. musculus)

No putative function

Q6LCE3 WF-3 (UN)
UniGene cluster a): H. sapiens ref:NP_113638.1 – HP p5326 (96.8/251)
Putative domains: No putative conserved domains
PSI-BLAST: ND

Q7TNC7 FKSG27 (AF322649)
UniGene cluster a): C. elegans ref:NP_491232.1 – F53F10.5 (25.5/523); E. coli ref:NP_415890.1 – putative membrane protein (22.07%/434 aa); H. sapiens ref:NP_062558.1 – HP R30953_1 (28.1/398); M. musculus sp:P05143 – proline-rich protein MP-3 (31.4/183)
Putative domains: No putative conserved domains
PSI-BLAST: 37% identity to DNA mismatch repair protein hexB (P14160)

Q8BHQ7 Similar to Hypothetical 34.0 kDa protein (1810007P19Rik)
UniGene cluster a): C. elegans ref:NP_497640.1 – R10F2.5.p (39.4/190)
Putative domains: No putative conserved domains
PSI-BLAST: ND

Q8BZE2 Hypothetical Cysteine-rich region containing protein, full insert sequence (2510003E04Rik)
UniGene cluster a): H. sapiens pir:T08798 – HP DKFZp586B0923.1 (96.1/308)
Putative domains: No putative conserved domains
PSI-BLAST: 28% identity to interferon-inducible GTPase 5 (H. sapiens)

Q8BWR2 AD039 (1110049F12Rik)
UniGene cluster a): A. thaliana ref:NP_565614.1 – expressed protein (37.6/194); C. elegans pir:S44654 – ZK353.1 protein (48.2/190); H. sapiens sp:O43396 – thioredoxin-like protein (31.9/134); R. norvegicus ref:NP_543163.1 – thioredoxin-like (33.3/134)
Putative domains: Pfam06201, DUF1000, Domain of Unknown Function (DUF1000)
PSI-BLAST: 32% identity to thioredoxin-like protein (H. sapiens)

Q9DCS2 Hypothetical S-adenosyl-L-methionine-dependent methyltransferases structure containing protein, full insert sequence (0610011F06Rik)
UniGene cluster a): H. sapiens ref:NP_115742.1 – HP MGC13114 (86.4/66)
Putative domains: Pfam06080, DUF938, Protein of unknown function (DUF938)
PSI-BLAST: 49% identity to SAM-dependent methyltransferases

a) Comparison of sequences in UniGene (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene) with proteins supported by a complete genome. The alignments can suggest the function of a gene. The nucleotide sequences in UniGene are matched with possible translational products through sequence comparison using BLASTX with the structural databases Swiss-Prot, PIR, PDB, or PRF (ProtEST, Protein matches for ESTs). Values in parentheses are protein similarity (%)/number of amino acids.
b) Accession number in the conserved domain database (CDD, http://www.ncbi.nlm.nih.gov/Structure/cdd/), the clusters of orthologous groups of proteins database (COG, http://www.ncbi.nlm.nih.gov/COG), the simple modular architecture research tool (SMART, http://smart.embl-heidelberg.de/), or the Pfam database (http://www.sanger.ac.uk/pfam).
c) PSI-BLAST was used to infer the putative function of proteins that are not yet annotated in any database. ND, not detected.

Figure 3. Distribution of the identified proteins by function (a) and chromosomal localization (b). Metabolic and signaling proteins accounted for the main portion (48.5%) and, interestingly, 76 proteins were encoded by genes localized on chromosome 11 (76/789, 9.6%). (c) Scatter plot of theoretical pI against observed pI for the identified proteins, with at least 1700 aligning values. The black line is the linear regression line (R² = 0.72). Linear regression was performed with SigmaPlot (Windows version 4.00, SPSS Inc). (d) Scatter plot of theoretical Mr against observed Mr for the identified proteins, with at least 1300 aligning values. The black line is the linear regression line (R² = 0.89).

showing different pI accounted for the lower coefficient for pI than Mr. Herein the observed pIs were measured, which makes it possible to localize the identified proteins within the gel, a prerequisite for using this proteomic approach as a valuable analytical tool.

3.2 Chromatographic prefractionation

Protein mixtures were fractionated by IEC because the ion exchanger has a high protein-binding capacity and can discriminate proteins with minor differences in their pI values. In addition, HIC was used to increase the opportunity for separating biological macromolecules based on their surface hydrophobicity. The proteins were adsorbed, in the presence of salts, onto an uncharged matrix (synthetic polymer, TSK Phenyl 5-PW column) carrying hydrophobic groups. IEC and HIC procedures were previously evaluated as prefractionation steps for human fetal brain and Haemophilus influenzae, respectively [8, 28].
IEC allowed identification of 543 proteins in human fetal brain [8], and the HIC approach was shown to enable enrichment of low-copy-number gene products [28–30]. In this study, many proteins were identified repeatedly in both IEC and HIC fractions, and a total of 420 and 47 proteins were identified exclusively in IEC and HIC, respectively (Fig. 2e). Few proteins in HIC were detected in the flow-through and wash fractions, whereas proteins of the IEC flow-through were shown in the elution profile and 2-DE gels. High-abundance proteins such as actin or albumin were successfully confined to certain IEC fractions, thus allowing visualization of LAPs.

3.3 Computational annotation of HPs

We here show experimentally the presence of 29 HPs (29 of 789, 3.7%) that have not previously been reported at the protein level and were predicted only from nucleic acid sequences. Conserved sequence motifs may provide a clue to putative function (sequence motif-to-function approach) [31], and HPs showing high similarity to functionally known proteins are possibly isoforms or homologs with known function. A close look at the 29 HPs using SMART, CDD, COG, and Pfam revealed that 23 contained at least some functional sequence motifs, defined primarily by experimental characterization of their homologs in E. coli and other organisms, while the others were somewhat vague, reflecting only subtle sequence similarities to previously characterized proteins (Table 2).
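As a small illustration of the domain column in Table 2, the sketch below groups domain accessions by the database their prefix suggests. The prefix-to-database map and the function name `source_database` are assumptions made for this example, inferred only from the accession styles printed in Table 2; they are not part of the authors' workflow or of any database API.

```python
import re
from collections import Counter

# Hypothetical prefix map, inferred from the accession styles seen in Table 2.
PREFIX_TO_DB = {"cd": "CDD", "smart": "SMART", "pfam": "Pfam", "cog": "COG"}


def source_database(accession: str) -> str:
    """Return the database suggested by an accession prefix, or 'unknown'."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", accession.strip())
    if not match:
        return "unknown"
    return PREFIX_TO_DB.get(match.group(1).lower(), "unknown")


# Domain accessions as they appear in this part of Table 2.
accessions = [
    "CD00196", "Smart00577", "Pfam03577", "Smart00225", "Smart00612",
    "Pfam05368", "Pfam01189", "Pfam01008", "CD00197", "CD00051",
    "Pfam06201", "Pfam06080",
]

# Tally how many of the listed domains come from each database.
counts = Counter(source_database(a) for a in accessions)

if __name__ == "__main__":
    for database, n in sorted(counts.items()):
        print(f"{database}: {n}")
```

Entries such as "No putative conserved domains" do not match the accession pattern and fall into the `unknown` group, which keeps annotated and unannotated HPs apart in a tally of this kind.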
According to gene ontology (GO) data and domains, eight HPs (8 of 29, 27.6%) could be assigned to enzymatic proteins, and HPs belonging to other functional classes were annotated by computational analysis: signaling proteins (2 of 29, 6.9%), cytoskeleton proteins (3 of 29, 10.3%), transcriptional/translational factors (4 of 29, 13.8%), detoxification proteins (4 of 29, 13.8%), proteolysis components (2 of 29, 6.9%), and unpredictable proteins without a conserved domain (6 of 29, 20.7%) (Fig. 4a). To verify the putative function of the six HPs harboring no putative conserved domains, we searched for putative PPIs of these HPs in the DIP database (http://dip.doe-mbi.ucla.edu/), which is built from experimental evidence. Only the hypothetical cysteine-rich region containing protein, full insert sequence (Q8BZE2), was represented in the DIP database, and it showed clear orthologs, e.g. Claret segregational protein (CG7831-PA) (GenBank, gi:17136354; Swiss-Prot, NCD_DROME (D. melanogaster)) (Table 3). On the basis of sequence similarity, we might predict that Kinesin family member C5A (Q6PG90) (Mus musculus) (37% identity to NCD_DROME, 204/541 amino acids) is a binding partner of this HP in the mouse. Unfortunately, no available database or program could predict the 3-D structure or the binding sites, owing to the absence of a conserved sequence alignment. The learning, observing, and outputting protein patterns program (LOOPP, http://www.tc.cornell.edu/CBIO/loopp) yielded only the predicted secondary structure of the hypothetical cysteine-rich region containing protein (Q8BZE2) by sequence-structure alignment. Seventeen helices and 21 coil structures were confidently predicted by potential optimization (Fig. 4b). For the remaining HPs, neither functional motif searches nor PPI studies gave clear clues to putative function.
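The identity figure quoted for the putative mouse interactor (204/541, reported as 37%) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names and the toy alignment are hypothetical, and the observation that 37% corresponds to truncating rather than rounding 37.7% is a remark about the numbers, not a claim about how any alignment tool formats its output.

```python
import math


def percent_identity(identical: int, aligned_length: int) -> float:
    """Percent identity = identical positions / alignment length * 100."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * identical / aligned_length


def count_identities(seq_a: str, seq_b: str, gap: str = "-"):
    """Count identical, non-gap columns in two equal-length aligned strings."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != gap)
    return identical, len(seq_a)


# Values taken from Table 3: Identities = 204/541.
pid = percent_identity(204, 541)
pid_truncated = math.floor(pid)

# Toy alignment (hypothetical sequences) showing how the counts arise.
toy_identical, toy_length = count_identities("MKT-AL", "MKSQAL")

if __name__ == "__main__":
    print(f"{pid:.1f}% identity, truncated to {pid_truncated}%")
    print(f"toy alignment: {toy_identical}/{toy_length} identical columns")
```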
Searches against STRING, a database integrating interaction data from genomic context, high-throughput experiments, conserved coexpression, and previous knowledge, revealed that AD039 (Q8BWR2), FKSG27, hypothetical S-adenosyl-L-methionine-dependent methyltransferases structure containing protein, full insert sequence (Q9DCS2), and similar to hypothetical 34.0 kDa protein (Q8BHQ7) confidently showed indirect functional, not physical, associations with sodium/potassium-transporting ATPase beta-2 chain (P14415, score = 0.414), monocarboxylate transporter 4 (P57787, score = 0.743), NADH-ubiquinone oxidoreductase MW (O35683, score = 0.433), and protein C21orf7 (P57077, score = 0.579), respectively, by conserved coexpression data (data not shown).

Table 3. Putative interacting partners of HPs harboring no putative conserved domain

Protein name: Hypothetical cysteine-rich region containing protein, full insert sequence (Q8BZE2)
Orthologs of HP (species, E-value, GenBank): CG14043-PA ORF [Drosophila melanogaster] (2e-27, gi:24581791)
Putative interactors (DIP node a), NCBI): Claret segregational protein (CG7831-PA) (DIP:21302N, gi:17136354, NCD_DROME)
Mouse orthologs of putative interactors (identity b), Expect): Kinesin family member C5A [Kifc5a] (M. musculus (mouse)), Q6PG90; Identities = 204/541 (37%), Expect = 1e-82
Experiment c) (reference): Two hybrid test [37]

a) A unique identifier for each protein participating in a DIP interaction (DIP, Database of Interacting Proteins, http://dip.doe-mbi.ucla.edu/).
b) Comparison with similarity to known function.
c) Detailed references are available in the DIP database under the DIP node.

Figure 4. (a) Classification of HPs with putative function annotated by computational analysis.
(b) Predicted secondary structure of the hypothetical cysteine-rich region containing protein (Q8BZE2) obtained from the LOOPP program alignment. The lines indicate the query sequence, the predicted secondary structure (H, helix; E, β-strand; C, coil), and the confidence level (scale from 3 [low confidence] to 9 [high confidence]), respectively.

4 Discussion

A large series of proteins was identified in mouse brain, forming the largest mouse brain protein reference database published so far. Other major findings include a comprehensive compilation of observed pIs, a valuable analytical basis for locating proteins, because theoretical, predicted pIs cannot be used to determine the position of a protein in a 2-DE gel. This does not contradict the correlation of theoretical with observed pI for the individual gene products: the many protein expression forms that give rise to the molecular diversity of the proteome are represented by different pIs and molecular weights. In addition, we demonstrated experimentally the presence of 29 HPs that have never been reported at the protein level and were so far predicted only from nucleic acid sequences. These can now be studied experimentally, as they have been shown to exist. Klose et al. [32] recently resolved 8767 protein spots by large-gel 2-DE, mapped 665 proteins genetically, and identified 466 proteins by MALDI-TOF MS. The current study extends this work by separating proteins from the cytosolic fraction by two different chromatographic prefractionations prior to gel-based separation. Whereas Klose et al. [32] used M. musculus strain C57BL/6 (B6) and Mus spretus (SPR), we used FVB/N mice, a strain widely used to generate transgenic mice, thus offering a system suitable for most transgenic experiments and subsequent genetic analyses.
Herein, all 42 2-DE gels were used for identification, and there is a strong overlap with the data of Klose et al. [32]. To date, the Swiss-Prot and TrEMBL databases, for example, list 9575 and 39 355 proteins for M. musculus, respectively. Although these numbers reflect considerable redundancy due to the inclusion of many fragments, splicing variants, precursors, and predicted gene products, 2-DE-based protein identification is still far from generating whole proteomes. Nevertheless, the aim of generating a reference brain protein database, which provides information for the analysis of proteins and has to be continuously upgraded as proteomic technologies improve, was met.

Recently, a number of studies have utilized the 2-DE-based proteomics approach to examine the mammalian brain proteome. Several investigations have focused on the identification of major proteins in normal brain tissue and on the construction of reference databases that catalog proteins expressed in the whole brain or in brain sections: reference brain proteome databases have been established for the human fetal brain [8], whole rat brain [7], human/mouse/rat hippocampus [33–35], and mouse neurons and astrocytes [36].

Although numerous proteins can be analyzed using 2-DE-based methodology, its use still has a series of limitations. The proteomic technique used herein only allowed identification of highly soluble, hydrophilic proteins, and only a limited number of very acidic or basic proteins were revealed. Identification of 789 individual proteins, however, represents the largest brain protein reference database published so far and forms the basis for the design of brain protein expression studies. The database documents the molecular diversity of the proteome and serves as a protein-chemical analytical tool that permits analysis independent of antibody availability and specificity. Results from this communication may represent the basis for studies on the proteins identified herein and complement immunochemical studies in future protocols, warranting protein-chemical demonstration of proteins. We are continuing work on the proteome of the mouse brain in our laboratory by extending chemical prefractionation protocols and using advanced instrumentation to further increase this reference database.

We appreciate the contribution of the Verein zur Durchführung der Wissenschaftlichen Forschung auf dem Gebiet der Neonatologie und Kinderintensivmedizin “Unser Kind.”

5 References

[1] Mouse Genome Sequencing Consortium, Nature 2002, 420, 520–562.
[2] Bult, C. J., Blake, J. A., Richardson, J. E., Kadin, J. A., Eppig, J. T., Mouse Genome Database Group, Nucleic Acids Res. 2004, 32, D476–D481.
[3] Bonaldo, M. F., Bair, T. B., Scheetz, T. E., Snir, E. et al., Genome Res. 2004, 10B, 2053–2063.
[4] Fields, S., Science 2001, 291, 1221–1224.
[5] Lubec, G., Krapfenbauer, K., Fountoulakis, M., Prog. Neurobiol. 2003, 69, 193–211.
[6] Gorg, A., Weiss, W., Dunn, M. J., Proteomics 2004, 4, 3665–3685.
[7] Krapfenbauer, K., Fountoulakis, M., Lubec, G., Electrophoresis 2003, 24, 1847–1870.
[8] Shin, J. H., Krapfenbauer, K., Lubec, G., Electrophoresis 2005, 26, 2759–2778.
[9] Josic, D., Brown, M. K., Huang, F., Callanan, H. et al., Electrophoresis 2005, 26, 2809–2822.
[10] Righetti, P. G., Castagna, A., Antonioli, P., Boschetti, E., Electrophoresis 2005, 26, 297–319.
[11] Lescuyer, P., Hochstrasser, D. F., Sanchez, J. C., Electrophoresis 2004, 25, 1125–1135.
[12] Birch, R. M., O’Byrne, C., Booth, I. R., Cash, P., Proteomics 2003, 3, 764–776.
[13] van den Bergh, G., Clerens, S., Vandesande, F., Arckens, L., Electrophoresis 2003, 24, 1471–1481.
[14] Badock, V., Steinhusen, U., Bommert, K., Otto, A., Electrophoresis 2001, 22, 2856–2864.
[15] Bradford, M. M., Anal. Biochem. 1976, 72, 248–254.
[16] Fountoulakis, M., Langen, H., Anal. Biochem. 1997, 250, 153–156.
[17] Berndt, P., Hobohm, U., Langen, H., Electrophoresis 1999, 20, 3521–3526.
[18] Press, W. M., Teukolsky, S. A., Vetterling, W. T., Flannery, B. P., Scientist 1986, 1, 23.
[19] Rockwood, S. L., van Orden, R. D., Smith, G., Anal. Chem. 1995, 67, 2699–2704.
[20] Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M. C. et al., Nucleic Acids Res. 2003, 31, 365–370.
[21] Marchler-Bauer, A., Anderson, J. B., DeWeese-Scott, C., Fedorova, N. D. et al., Nucleic Acids Res. 2003, 31, 383–387.
[22] Tatusov, R. L., Galperin, M. Y., Natale, D. A., Koonin, E. V., Nucleic Acids Res. 2000, 28, 33–36.
[23] Schultz, J., Copley, R. R., Doerks, T., Ponting, C. P., Bork, P., Nucleic Acids Res. 2000, 28, 231–234.
[24] Bateman, A., Birney, E., Durbin, R., Eddy, S. R. et al., Nucleic Acids Res. 2000, 28, 263–266.
[25] Wheeler, D. L., Barrett, T., Benson, D. A., Bryant, S. H. et al., Nucleic Acids Res. 2005, 33, D39–D45.
[26] von Mering, C., Jensen, L. J., Snel, B., Hooper, S. D. et al., Nucleic Acids Res. 2005, 33, D433–D437.
[27] Xenarios, I., Salwinski, L., Duan, X. J., Higney, P. et al., Nucleic Acids Res. 2002, 30, 303–305.
[28] Fountoulakis, M., Takacs, M. F., Takacs, B., J. Chromatogr. A 1999, 833, 157–168.
[29] Foyn Bruun, C., J. Chromatogr. B 2003, 790, 355–363.
[30] Langen, H., Takacs, B., Evers, S., Berndt, P. et al., Electrophoresis 2000, 21, 411–429.
[31] Koonin, E. V., Galperin, M. Y., Sequence-Evolution-Function. Computational Approaches in Comparative Genomics, Kluwer Academic, Boston, MA 2002.
[32] Klose, J., Nock, C., Herrmann, M., Stuhler, K. et al., Nat. Genet. 2002, 30, 385–393.
[33] Yang, J. W., Czech, T., Lubec, G., Electrophoresis 2004, 25, 1169–1174.
[34] Shin, J.-H., London, J., Le Pecheur, M., Weitzdoerfer, R. et al., Neurochem. Int. 2005, 46, 641–653.
[35] Fountoulakis, M., Tsangaris, G. T., Maris, A., Lubec, G., J. Chromatogr. B 2005, 819, 115–129.
[36] Yang, J. W., Rodrigo, R., Felipo, V., Lubec, G., J. Proteome Res. 2005, 4, 768–788.
[37] Giot, L., Bader, J. S., Brouwer, C., Chaudhuri, A. et al., Science 2003, 302, 1727–1736.