* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Research Essay
Signal transduction wikipedia , lookup
Ribosomally synthesized and post-translationally modified peptides wikipedia , lookup
Genetic code wikipedia , lookup
Paracrine signalling wikipedia , lookup
Biochemistry wikipedia , lookup
Point mutation wikipedia , lookup
Gene expression wikipedia , lookup
Magnesium transporter wikipedia , lookup
G protein–coupled receptor wikipedia , lookup
Expression vector wikipedia , lookup
Metalloprotein wikipedia , lookup
Ancestral sequence reconstruction wikipedia , lookup
Bimolecular fluorescence complementation wikipedia , lookup
Structural alignment wikipedia , lookup
Interactome wikipedia , lookup
Western blot wikipedia , lookup
Protein purification wikipedia , lookup
Proteolysis wikipedia , lookup
Protein Structure Prediction and Determination Protein Structure and Prediction Mariana Vasquez University of Texas at El Paso UNIV 1301 Professor Garcia 1 Protein Structure Prediction and Determination 2 Abstract Proteins are catalysts, transporters, antibodies, building blocks, stores, and signals. They play roles in regulatory mechanisms as hormones, speed up reactions as catalysts, transport nutrients and oxygen in the blood, regenerate tissues and make growth possible, and protect the body against antigens. Thus proteins are very important. Currently, the most accurate way to determine protein structure is by X-ray crystallography on well-grown crystals. Due to the process of obtaining good crystals being complicated and time consuming, researchers are coming up with new ways to predict protein secondary structure, tertiary structure, and ProteinProtein Interactions. The methods may be based on having enough similarities while aligned with other protein, or physics-based approach about interactions between environment and the protein's make-up (Guo, 2007, p.4). If a very accurate algorithm is discovered, the process of predicting 3-D structure, the structure level that determines function of protein, will be revolutionized as well as chemical medicine. Protein Structure Prediction and Determination 3 Protein Structure Prediction and Determination Proteins catalyze reactions, transport oxygen and nutrients, transport signals (hormones), provide immune responses, storage, and provide regeneration. They are composed of amino acids and have four levels of structure, all stemming from the amino acid sequence, primary structure. We know about protein structures due to experimental data obtained by x-ray crystallography. It's a method that needs proteins to form crystals to diffract x-rays. Knowing the structure of a protein is very helpful to researchers in finding solutions for ailments. A protein may not be binding correctly, knowing about the structure of the working protein would be helpful in determining what the defective protein has or lacks, and how to eliminate or make up for that difference. Today, X-ray crystallography is the only method of obtaining a 100% exact protein structure, researchers are now trying to find more convenient ways to determine protein structure. X-rays are used in crystallography instead of regular light due to their tiny wavelengths which are ideal for the tiny angstrom lengths between molecules and atoms. Crystals are repetitive arrangements of molecules packed together with solvent filling the gaps, which makes them very fragile (Ilari, 2008, p. 64). This repetitive patterns, crystals, are used so as to amplify the diffraction of one of the cells enough to be measurable. Crystals are best formed with purified substances (Ilari,2008, p.63). "The growth of crystals starts from the supersaturated solution of the macromolecule and evolves toward a thermodynamically stable state in which protein's partitioned between a solid phase and the solution"(Ilari, 2008, p. 63). The growing of crystals of a protein suitable for X-ray crystallographic analysis is the main challenge in crystallographic screening (Berdini & Congreve, 2007, p.99). Protein Structure Prediction and Determination 4 According to Ilari, the purity of the protein should be determined beforehand; it must be at least 90% pure, if not, it should be purified further through electrophoresis (2008, p.64). Then it should be dissolved in a proper solvent; each protein or substance has different environmental needs such as pH (Ilari, 2008, p.64). Once the solution is brought to super saturation, aggregates are formed, which are the nuclei for crystal formation (Ilari, 2008, p.64). Now crystals can form, and we can measure the spots on the image by computers, making life easier. X-ray crystallography can also be used to determine the exact binding mode of fragments and thus enable researchers to make fragments into selective lead compounds that could be possible pharmaceuticals (Berdini & Congreve, 2007, p. 99). Data gathered on many proteins, allows us to have some knowledge to predict what a protein is meant to do without having to grow those crystals, as obtaining well-suited crystals is hard and may not be available everywhere. The important experimental data may be stored in Protein Databases (PDB), which allows easy access to protein information such as the 3-D structure obtained via crystallography, the organism from where it comes from, the sequence, which can be downloaded as a Fasta text file, the chemical formula, etc. Commonly used PDBs are RCSB and NCBI, anyone who wished to use these would just have to "Google" them. Protein BLAST (Basic Local Alignment Search Tool) can be used to find similar proteins. It determines bit-scores(S'= (λS-lnK)/ln2), which are determined from alignment of protein amino acid sequences(Madden, 2003). These scores are a measure of how good the alignment is (Madden, 2003). It uses the substitution matrix BLOSUM 62 to determine similarity between proteins and whether or not similarities were due to chance (Madden, 2003). Protein Structure Prediction and Determination 5 BLOSUM 62 stands for BLOCKS Substitution Matrix. It has a 62 that means that all sequences are at no more than 62% similar (Fassler & Cooper, 2011). Some may be 90% similar, but all the sequences of interest were only 62% similar at most. It uses the bit-scores obtained by subtracting the gap scores (a fixed amount to prevent so many gaps being added) from the alignment score (Fassler & Cooper, 2011). Substitution matrices such as BLOSUM 62 are tailored in a specific evolutionary distance (Fassler & Cooper, 2011). The "E value" tells how many matches can be expected to happen merely by chance (Madden, 2003). The closer the Evalue, the better. A similar primary structure (sequence) could mean a similar function or structure. Gaps are added to increase similarities such as catsup and casstvup would become ca—tsup and casstvup. However, matches added by gaps can often increase matches by chance. Just because some sequence also has an ‘e’ doesn't mean it's similar. There’s a huge chance that a protein will also have an ‘e’, proteins are made of the same 20 amino acids! Thus, single amino acid similarities are quite insignificant. For example, a sequence FVVVEGG and AAGGEGG are more similar than FVEFGFG and F_E_G_G because they have the three letter “word” instead of single nucleotides that needed gaps to be matched (Madden, 2003). Predicting 3-D structure is essential to understand the functions and mechanisms of a protein (Guo & Xu, 2007, p.4). It is the current ‘holy grail’ of bioinformatics (Guo & Xu, 2007, p.3). Computational protein folding employs finding the lowest free-energy structure for an amino acid sequence, based on the thermodynamic hypothesis formulated by Anfinsen that proteins will arrange at free energy minimum, through searching the exceedingly large conformational space of the protein (Guo & Xu, 2007, p.4). Protein Structure Prediction and Determination 6 There are three approaches for protein structure prediction: ab initio, protein threading, and homology modeling (Guo & Xu, 2007, p. 5). Ab initio predictions are solely based on physics, homology predictions are made based on sequence alignments ("matches" that mean they're similar, as mentioned earlier). Protein threading uses sequence similarity information, when it exists, and structural fitness information between the query and template protein (Guo & Xu, 2007, p.5). Threading is based on the idea that there's only a limited number of folds a protein can have, so proteins can be compared to other proteins with the same folds. In order to decrease limitations, researchers tend to use a mixture of the three methods (Guo & Xu, 2007, p.5).Threading uses "templates" which are core folds that take into account the interactions of amino acids and their characteristics in a given environment. Protein-Protein Interactions (PPI) are important in determining cell regulatory mechanisms (Zhang et. al., 2012, p.556). PPI software predictions that use structural information, according to Zhang et al., are far more accurate than those based on nonstructural evidence (2012, p.556). The PrePPI algorithm is very successful in identifying unexpected PPI's in possibly important interactions because it combines homology with close and remote geometric relationships between proteins (Zhang et. al., 2012, p.556). If using both structure information and homology as in the PrePPI algorithm, why until now have both been combined? The ratio of proteins whose structures were determined experimentally via X-ray crystallography to those unknown is very low. But according to Zhang et al., using geometric relationships between groups of secondary structure elements is best (2012, p.556). First the proteins were aligned with each other to find possible similarities that may suggest a specific complex, "structural neighbors". Interaction model is created, and is then evaluated using empirical scores (Zhang et al., 2012, p.556) Protein Structure Prediction and Determination 7 The process of protein tertiary and quaternary structure (3D structure) are barely on their "baby steps", even secondary structure predictions are not yet entirely accurate. However they show much promise and will sure revolutionize the process of understanding interactions in complexes that go on during regulatory mechanisms and possibly find new solutions to ailments related with protein malfunction. Soon, there may be new algorithms that are very accurate, and thus crystallography won't be the only means to determine protein structures very accurately, and how it the protein may behave due to the structure. Protein Structure Prediction and Determination 8 References Berdini, V., O'Reilly, M., Congreve, M. S., & Tickle, I. J. (2007). Fragment-based screening by X-ray crystallography. Structure-based drug discovery (pp. 99-127) Springer Netherlands. Fassler, J., & Cooper, P. (2011, July 14). BLAST glossary. Retrieved December 6, 2014, from http://www.ncbi.nlm.nih.gov/books/NBK62051/ Guo, J., Ellrott, K., & Xu, Y. (2007). A historical perspective of template-based protein structure prediction. In M. J. Zaki, & C. Bystroff (Eds.), Protein structure prediction (pp. 3-42) Humana Press. doi:10.1007/978-1-59745-574-9_1 Ilari, Andrea C. S. (2008). Protein structure determination by X-ray crystallography. In P. D. Jonathan M. Keith (Ed.), Bioinformatics methods in molecular biology (Volume 452 ed., pp. 63-87) Humana Press. Madden, T. (2003, August 13). The BLAST sequence analysis tool. Retrieved December 4, 2014, from http://www.ncbi.nlm.nih.gov/books/NBK21097/ Miller, Katherine. "A Crescendo of Protein Structures." Biomedical Computation Review. 1 June 2005. Web. 7 Dec. 2014. http://biomedicalcomputationreview.org/content/crescendoprotein-structures. Miller, Katherine. "A Crescendo of Protein Structures." Biomedical Computation Review. 1 June 2005. Web. 7 Dec. 2014. “Protein.” Wikipedia. Wikimedia Foundation, 12 June 2014. Web. 7 Dec. 2014. <http://en.wikipedia.org/wiki/Protein>. (for images) Protein Structure Prediction and Determination Zhang, Q. C., Petrey, D., Deng, L., Qiang, L., Shi, Y., Thu, C. A., & ... Honig, B. (2012). Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature, 490(7421), 556-560. doi:10.1038/nature11503 9 Protein Structure Prediction and Determination 10 (Miller, 2005) Protein Structure Prediction and Determination 11 (“Protein”, 2014) (“Protein”,2014)