BME1450, Nov 2004

Protein Microarrays: Quantitative, Temporal Proteomic Data for Systems Biology

Alborz Mahdavi

Abstract: Considerable advances have been made in the area of cell signaling models, but for these models to be biologically relevant, precise, time-dependent, quantitative and high-throughput proteomic data are required. Several methods to detect proteins, including non-isotopic labeling techniques, quantum dots, mass spectrometry, surface plasmon resonance and colourimetric resonant reflection, are considered. The principle behind each technique and its limitations are discussed. The use of surface plasmon resonance in conjunction with self-assembling, antibody-free probes is proposed as a platform for protein microarrays. It is further suggested that this system can be used for temporal analysis of protein phosphorylation events as well as determination of protein binding constants, two pieces of information that are essential for the development of useful cell signaling models for systems biology.

Index Terms: Microarrays, Surface Plasmon Resonance, Self-assembled, Antibody-free, Proteomic, Systems Biology

I. INTRODUCTION

The importance of quantitative protein detection is one of the realizations of the post-genomic era. Predictions of cell signaling models such as Bayesian networks, differential equation based models, Boolean networks and protein networks are only as good as the data used to generate the model. Currently, technological restrictions limit these modelling approaches because quantifying protein changes such as phosphorylation and dissociation rates is cumbersome and imprecise. Hence the development of accurate protein detection technologies is needed.
Protein detection will be useful not only for determining signaling events in cells; it also has applications in food engineering, environmental sampling, quality control, bio-agent detection and screening for diseases such as HIV and hepatitis. Solving this problem will benefit research and will also have an immediate impact in industry. Two criteria have to be met for any protein detection technology to be effective. First, the protein must be physically tagged using a molecular marker, and second, the protein-probe complex must be detected with high precision. These two steps may be combined; however, it must be feasible to make the system high-throughput, so that a large number of samples can be screened for a large number of different proteins. To consider the problems associated with protein detection, it is informative to reflect on how DNA microarrays circumvent similar problems.

II. DIFFICULTIES IN PROTEIN DETECTION

A. DNA Microarrays

DNA microarrays consist of a surface onto which DNA is synthesized using photolithography techniques. Four masks corresponding to the nucleotides A, G, T and C are used in photolithography to make probe strands of about 25 nucleotides; a total of more than 500 000 probe locations can be made on one array [11]. Since the probes are made in parallel, each probe spot contains many identical single strands of DNA. The mRNA sample to be analyzed is first converted to cDNA and amplified by polymerase chain reaction (PCR) using fluorescent nucleotides, which replicates the DNA strands many times so that a high concentration of sample cDNA is available [7]. This sample cDNA is then washed over the surface of the microarray and complementary strands attach. Since fluorescent nucleotides were used in the sample, the matching pairs can be detected using a laser. For a more detailed explanation of the techniques refer to [11]. Several factors contributed to the success of DNA microarrays.
The photolithography instruments were already available from electronics manufacturers, the sample could be easily amplified by PCR, and the amplified sample was already fluorescent and would bind specifically to its complementary strand. The problem with protein microarrays is that each step of this process has to be either modified or solved anew.

B. Problem of Tagging Proteins

Robust tagging of proteins is required to immobilize the protein of interest at a predefined spot on the array for detection. Antibodies are the best example of highly specific protein tags [7]. The specificity of antibodies and their high binding constant (affinity for their protein) make them ideal for detection of proteins, provided that the antibody is available [12]. However, it is not cost-effective to purchase 100 000 antibodies for a typical protein microarray. Furthermore, if such an array is built, it will be very difficult to store because the antibody proteins denature over time [5]. It is also difficult and time-consuming to produce purified antibodies for newly found proteins that are not yet characterized. Thus, what is needed is essentially a replacement for natural antibodies. The tagging agent must be highly specific, easy to produce on the array and able to attach to the surface so as to hold the protein for detection. Furthermore, the tags must be easily stored and modular, such that many different tags are available, each corresponding to a specific protein. DNA microarrays do not suffer from this problem to the same extent, because the probes are already linked to the surface and are highly specific for their complementary strand.

C. Problem of Labeling and Detection

Detecting the presence of a certain protein is not easy, even if it is immobilized at the correct spot on the array.
The problem arises because, if the original concentration of the protein in the sample is very low, detecting it is very difficult. The amount of protein immobilized on the surface at equilibrium, based on first-order binding kinetics, is:

$$C_{eq} = \frac{R_T L_0}{L_0 + K_D} \tag{1}$$

where $R_T$ is the amount of tag on the surface, $L_0$ is the amount of protein in the sample and $K_D$ is the dissociation constant [9]. As $L_0$ is often low, the amount of bound protein is strongly affected by differences in the dissociation constant and in the amount of tag present on the surface: when $L_0$ is much smaller than $K_D$, Equation (1) reduces to approximately $R_T L_0 / K_D$. For a constant surface tag density, differences in binding affinity will therefore dominate the strength of the detected signal and produce non-uniform noise [12]. Furthermore, whereas in DNA microarrays the amplified sample DNA strands are already fluorescent, in protein microarrays the proteins are not fluorescent, and hence another highly specific tagging step is required to detect them.

III. APPROACHES TO LABEL AND DETECT PROTEINS

A. Mass Spectrometry

Protein mass spectrometry (MS) involves charging proteins in the presence of a very strong electric field. Depending on the mass-to-charge ratio of the protein, the time of arrival at the electrode differs between proteins as they travel through the field [7]. Hence exact determination of the proteins in the sample is possible through MS. Furthermore, proteolytic fragmentation of the proteins allows their constituent parts to be determined. The most recent and widely used MS technique for protein detection is Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI MS) [10]. In MALDI the protein is dissolved at a low molar ratio in a UV-absorbing organic acid (the matrix), which is then vaporized using a strong UV laser. The UV laser transfers some of the protons of the matrix to the protein, charging the protein whilst vaporizing the matrix [10]. This approach is a softer alternative to conventional MS ionization, and can accurately determine protein constituents.

Recently MS has been used to perform temporal analysis of phosphorylation events in cells [6]. The cell populations from which peptides are extracted are first grown in medium containing different isotopes incorporated into the amino acid arginine. For example, 13C14N, 12C14N and 13C15N arginine are used, where each number is the mass number of carbon or nitrogen [6]. Thereafter the cells are subjected to different durations of signal activation through the introduction of cytokines or soluble growth factors. Hence each isotope combination contains a different amount and composition of phosphorylated protein, because it is stimulated for a different length of time. The cells are lysed, their contents mixed in equal ratios, the phosphorylated proteins separated, and MS is performed on the sample. The results are as depicted in Figure 1. Using this approach, protein phosphorylation events can be studied.

Fig. 1. Two different cell populations can be distinguished by a 4 Dalton shift which results from the presence of heavier isotopes in the second cell medium. The heavier isotope cell population is stimulated for 10 minutes, and an up-regulation of phosphorylated peptides is observed from the 5-minute time point as an increase in relative intensity.

MS would be the strongest competitor of any protein microarray technology due to its high-throughput nature; however, MS has its limits. Although MS can theoretically be used to find all the proteins in a sample, sophisticated purification through affinity columns is required. Cells must be grown in defined media, so samples from endogenous tissues cannot be used in the above approach. Furthermore, the use of isotopes for temporal analysis limits the number of time points that are possible.
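The isotope-label time course described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the peak intensities, the assignment of each arginine label to a stimulation time, and the function name `fold_changes` are all hypothetical and are not taken from [6].

```python
from typing import Dict, List, Tuple


def fold_changes(
    peaks: Dict[str, float],
    stim_minutes: Dict[str, float],
    reference: str,
) -> List[Tuple[float, float]]:
    """Return (stimulation time, intensity relative to the reference label).

    Each isotope label marks one cell population, so one MS spectrum of the
    mixed lysate yields one time point per label.
    """
    ref = peaks[reference]
    if ref <= 0:
        raise ValueError("reference peak intensity must be positive")
    return sorted((stim_minutes[label], peaks[label] / ref) for label in peaks)


# Hypothetical intensities of one phosphopeptide, one peak per arginine label.
peaks = {"12C14N": 200.0, "13C14N": 500.0, "13C15N": 800.0}
# Assumed stimulation time (minutes) applied to each labelled population.
stim = {"12C14N": 0.0, "13C14N": 5.0, "13C15N": 10.0}

course = fold_changes(peaks, stim, reference="12C14N")
for minutes, ratio in course:
    print(f"{minutes:4.0f} min: {ratio:.2f}x relative to unstimulated")
```

The sketch also makes the limitation concrete: the number of time points in one experiment equals the number of distinguishable isotope labels, three in this example.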
MS is highly statistical and is often done in conjunction with another protein separation technique, because the presence of peaks in the MS spectra must often be verified [10]. For brevity, these techniques will not be discussed here and the reader is urged to consult [2] for more details.

B. Non-isotopic Labeling and Quantum Dots

Non-isotopic labels can be categorized into four groups: enzyme-activated organic reporters, luminescent labels, fluorescent labels and electro-active labels [7]. These labeling techniques are based on the emission of light, which is detected using a photo-sensor. In the case of protein detection, the amount of emitted light is correlated with the amount of labeled protein. Fluorescent labels in particular are excited at one wavelength and emit at a different wavelength. For example, an argon laser is used to excite the fluorescent dye FITC at 488 nm, which emits light at about 530 nm. Depending on the precision of the detecting instrument, up to about 8 different dyes can currently be detected at once.

Quantum dots (QDs) are ultra-sensitive fluorescent labels that have three main advantages over other fluorescent dyes. The dot size can be used to tune the emission wavelength across about 20 different colours, the excitation spectrum is much wider, and the emission spectrum is about one third as wide as that of normal fluorescent dyes [4]. QDs are about 20 times brighter and 100 times more stable [4]. However, even with the advent of QD technology, there are significant problems associated with these labeling systems for use in protein microarrays. The labels still have to be attached to the tagged proteins on the surface of the array and cannot be incorporated into the protein. The number of possible colours is limited for high-throughput detection, and significant background noise is present.
Furthermore, although QDs do not suffer from photobleaching, other fluorescent dyes do, and even QDs are not useful for long-term continuous detection because protein denaturing becomes a problem. As such, other techniques must be used for detection of proteins on the surface of the microarray.

C. Surface Plasmon Resonance

Surface Plasmon Resonance (SPR) sensors are optical sensors that make use of changes in the electromagnetic properties of the sensor surface for detection [3]. This is achieved by producing a Surface Plasma Wave (SPW) at the interface between a metal and a dielectric by shining light whose wave vector component along the interface matches that of the SPW [3]. The SPW propagates along the interface and behaves like a quasi-free electron plasma; its field attenuates exponentially away from the interface and is extremely sensitive to the complex dielectric constant of the medium and the interface [3]. The propagation constant of the SPW depends solely on the dielectric constants of the metal and the medium and on the wavelength of the light used to induce it. SPR sensors use a prism of high refractive index to achieve total internal reflection at the interface with a metal, giving rise to an evanescent field wave, which penetrates the metal surface into the medium containing the analyte [3]. Once the propagation constant of the evanescent field wave matches that of the SPW, changes in the dielectric constant of the surface, for example by addition of protein on the surface, will dramatically change the SPW. The SPW interacts with light, so that changes in the SPW can be correlated with changes in the spectral distribution, polarization, amplitude and phase of the reflected light. These can be detected as a measure of the presence of analyte on the surface. The detection method is indirect and label-free. The setup for an SPR sensor is as follows:

Fig. 2. Schematic representation of the SPR optical sensor shows specific attachment of protein to its corresponding tag and the change in evanescent field wave. A prism with a high refractive index is used to direct light.
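The coupling condition described in this section can be illustrated numerically. The sketch below uses the standard textbook expression for the SPW propagation constant at a planar metal-dielectric interface, which is not written out in this paper, and matches it to the in-plane wave vector of light in the prism. All numerical values (the dielectric constant of gold, the prism and buffer refractive indices, and the index change attributed to bound protein) are illustrative assumptions, and treating bound protein as a small change in bulk refractive index is a simplification.

```python
import cmath
import math


def spw_effective_index(eps_metal: complex, n_dielectric: float) -> complex:
    """Effective index of the SPW: sqrt(eps_m * eps_d / (eps_m + eps_d))."""
    eps_d = n_dielectric ** 2
    return cmath.sqrt(eps_metal * eps_d / (eps_metal + eps_d))


def resonance_angle_deg(
    eps_metal: complex, n_dielectric: float, n_prism: float
) -> float:
    """Incidence angle in the prism at which n_prism * sin(theta) equals the
    real part of the SPW effective index (the matching condition)."""
    n_eff = spw_effective_index(eps_metal, n_dielectric).real
    sin_theta = n_eff / n_prism
    if not 0.0 < sin_theta < 1.0:
        raise ValueError("prism index too low to couple light to the SPW")
    return math.degrees(math.asin(sin_theta))


# Illustrative assumptions, not values from the paper:
EPS_GOLD = complex(-11.7, 1.3)   # approximate gold permittivity, red light
N_PRISM = 1.515                  # typical glass prism
N_BUFFER = 1.333                 # aqueous buffer above the metal film
N_WITH_PROTEIN = 1.335           # assumed index after protein binds

angle_bare = resonance_angle_deg(EPS_GOLD, N_BUFFER, N_PRISM)
angle_bound = resonance_angle_deg(EPS_GOLD, N_WITH_PROTEIN, N_PRISM)
print(f"resonance angle, buffer only : {angle_bare:.2f} deg")
print(f"resonance angle, with protein: {angle_bound:.2f} deg")
print(f"shift on binding             : {angle_bound - angle_bare:.3f} deg")
```

Under these assumptions a small increase in refractive index at the surface moves the resonance to a larger angle, which is the quantity an angle-scanning SPR instrument would track over time.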
There are several advantages to SPR sensors, including real-time and continuous detection of proteins as well as the ability to detect proteins without the need for labeling. Furthermore, since the tags are not affected by detection, the sensor can be used for extended periods of time and problems such as photobleaching are not encountered [3]. In cases where labeling proteins would alter their structure, SPR offers an alternative detection technique. Real-time detection of proteins using SPR only became available recently, and advances in microfluidics and photonics have paved the way for progress of this technology.

D. Colourimetric Resonant Reflection

Colourimetric Resonant Reflection (CRR) makes use of changes in the refractive index of surfaces caused by the presence of analyte as a method of optical detection [1]. Although CRR is based on principles similar to those of SPR, the sensor setup and capabilities are different. In CRR, light passes through the analyte solution and hits a specially designed grating at the bottom of the surface that reflects only a single frequency [1]. The presence of proteins on the surface causes a change in the refractive index of the surface. The change in refractive index alters the path of the incident light enough that the reflected light has a different frequency [1]. Hence a shift in frequency is correlated with the presence of protein on the surface. The distance resolution of CRR is similar to that of SPR, and changes in surface height as small as 1 nm can be detected by both systems [1]. This size range is small enough to precisely detect protein deposition or binding on the surface. The special grating sensor is designed as depicted in the following diagram:

Fig. 3. Schematic representation of the CRR sensor shows the grating, which is produced using photolithography with dimensions of similar magnitude to the wavelength of visible light. Both the incident and reflected light are perpendicular to the surface.
The grating size is a fraction of the wavelength of light, and as such only a single frequency is reflected back.

The advantage of CRR detection is that it can be adapted to high-throughput screening. For example, a fiber optic can activate and scan many spots in a fraction of a second, effectively producing real-time detection of protein binding [1]. Furthermore, since a special prism is not required for detection, sensor fabrication is easier.

Detection and labeling problems are solved simultaneously by approaches such as SPR and CRR, where label-free, real-time and precise detection is possible. Furthermore, commercialized versions of these sensors are becoming available. Biacore, British Windsor Scientific, Nippon Laser and Electronic Laboratory, Texas Instruments, Corning and SRU Biosystems all have variations of these sensors, and are developing platforms for high-throughput screening [3]. However, these approaches rely strongly on the presence of highly specific tags, so that only the protein of interest is bound to the surface. As such, in order to develop high-throughput protein microarrays, a reliable tagging system is required.

IV. PROTEIN TAGS AND SOLVING THE PROBLEM OF DETECTION

Ideal tagging involves minimum cross binding and maximum specific binding. For brevity, only the most recent antibody-free system, which is easily adaptable to detection by surface plasmon resonance, is considered. The Nucleic Acid Programmable Protein Array (NAPPA) involves placing a protein-coding DNA segment on the surface of a microarray, followed by cell-free transcription and translation of the DNA into the correct protein through the use of enzymes such as T7 polymerase [5]. The translated protein includes a terminal tag peptide, for which an antibody is present on the surface of the array [5]. Hence the translated proteins are immobilized on the surface of the array because the terminal peptide binds to the antibody on the surface.
The same antibody is used throughout the array, as it functions solely to hold the proteins on the surface of the array. Using this method, a library of different proteins can be produced on the surface of the microarray. This method has several advantages. First, the same photolithography techniques that are used to produce DNA microarrays can in principle be used to make NAPPA arrays [5]. Second, the NAPPA array can be stored without the problem of protein denaturing, because the untranslated array contains only DNA and becomes active only once enzymes are introduced [5]. Third, the production of NAPPA arrays is financially feasible because most of the technologies that make it possible are already commercially available.

V. MAKING THE PROTEIN MICROARRAY

Genes coding for antibodies in humans have a variable region, VL, of less than 400 base pairs that codes for the attachment site of the antibody to a protein, giving rise to its high specificity [7]. It is possible, although difficult, to produce this DNA segment on a NAPPA array. This can be achieved by using photolithography to produce a linear stretch of DNA coding for the VL region followed by the segment coding for the attachment peptide. The sequences for the VL regions of most typical antibodies can be found in the Kabat or NCBI databases [8]. This linear sequence is then spliced using enzymes and an exon is produced. Using cell-free translation, as is done in NAPPA arrays, the protein of interest is produced on the array. The protein is immobilized on the surface of the array through attachment of its terminal peptide to the antibody that is present on the array surface, as in NAPPA. If any step of this process does not take place successfully, the translated protein will not bind to the surface antibody, because the terminal peptide is the last segment to be translated. Hence it is possible to produce highly specific tags in a high-throughput manner using a modified NAPPA array.
The important difference between the NAPPA array and the proposed array is that SPR is used instead of fluorescent tags to detect the presence of proteins. High-throughput versions of SPR are already being commercialized, in particular by Biacore. SPR detection provides several advantages over other techniques, including CRR, in this case. First, detection can be real-time; this is useful for studies involving dissociation rate constants and non-specific binding. Second, SPR can be used to detect not only antibody attachment, but also attachment of cytokines to receptors or protein-protein interactions. Third, SPR can be used to find rate constants. For example, if the concentration of the protein of interest is very low, it will take a long time for equilibrium to be established. A plot of protein on the surface versus time will plateau as equilibrium is reached; the longer this takes, the lower the concentration of the protein. Furthermore, the initial rate of change of this plot is directly proportional to the association rate constant, and the rate of change when the protein is no longer present in the solution is directly proportional to the dissociation rate constant. Knowing the concentration of the protein from equilibrium, one can determine an effective dissociation constant ($K_D$) for a protein-protein, receptor-cytokine or protein-tag interaction. Furthermore, if the surface tag is designed so that only phosphorylated proteins attach, then the rate of phosphorylation as a function of time can be determined. As such, this modified NAPPA/SPR protein microarray will provide quantitative information regarding protein-protein interactions. This information is crucial for the development of chemical kinetics based models. Furthermore, interactions that are predicted using models such as Bayesian networks can be verified using this protein microarray.

VI. CONCLUSION

The proposed NAPPA/SPR protein microarray is a high-throughput, quantitative means to study protein-protein interactions and is useful for the development and verification of cell signaling models. With the rapid parallel progress that is taking place in high-throughput technologies and cell signaling models, the proposed protein microarray will be a central tool for systems biology in the near future.

REFERENCES

[1] Brian Cunningham, Peter Li, Bo Lin, Jane Pepper, “Colorimetric resonant reflection as direct biochemical assay technique”, Sensors and Actuators B 81, Elsevier, 2002, pp. 316-328.
[2] Steven T. Elliott, David G. Crider, Christopher P. Graham, Kenneth R. Boheler and Jennifer E. Van Eyk, “Two-dimensional gel electrophoresis database of cultured murine R1 embryonic stem cells”, Proteomics 2004, Wiley-VCH, 10.1002
[3] Jiri Homola, “Present and future of surface plasmon resonance biosensors”, Anal. Bioanal. Chem., Vol. 377, 2003, pp. 528-539.
[4] Warren C. W. Chan and Shuming Nie, “Quantum Dot Bioconjugates for Ultrasensitive Nonisotopic Detection”, Science, Vol. 281, 25 September 1998, pp. 2016-2018.
[5] Niroshan Ramachandran, Eugenie Hainsworth, Bhupinder Bhullar, Samuel Eisenstein, Benjamin Rosen, Albert Y. Lau, Johannes C. Walter, Joshua LaBaer, “Self-Assembling Protein Microarrays”, Science, Vol. 305, 2 July 2004, pp. 86-90.
[6] Blagoy Blagoev, Shao-En Ong, Irina Kratchmarova and Matthias Mann, “Temporal analysis of phosphotyrosine-dependent signaling networks by quantitative proteomics”, Nature Biotechnology, Vol. 22, No. 9, 2004, pp. 1139-1145.
[7] Harvey Lodish, Arnold Berk, S. Lawrence Zipursky, Paul Matsudaira, David Baltimore and James Darnell, “Molecular Cell Biology”, Freeman and Company, New York, 4th Edition, 2000.
[8] http://www.ncbi.nlm.nih.gov or http://www.kabatdatabase.com
[9] D. Lauffenburger and J. Linderman, “Receptors, Models for binding, trafficking and signalling”, Oxford University Press, New York, 1993.
[10] Henno Steen, Matthias Mann, “The ABC’s (and XYZ’s) of peptide sequencing”, Nature Reviews Molecular Cell Biology, Vol. 5, 2004, pp. 700-711.
[11] Michael J. Heller, “DNA Microarray Technology, Devices, Systems and Applications”, Annu. Rev. of Biomed. Engineering, Vol. 4, 2002, pp. 129-153.
[12] Gavin MacBeath and S. Schreiber, “Printing Proteins as Microarrays for High-Throughput Function Determination”, Science, Vol. 289, 2000, pp. 1760-1763.
