Download Advancing research through protein sequence analysis

IBM Life Sciences division Accelerate results using algorithms to analyze protein molecules Advancing research through protein sequence analysis As a comprehensive resource for protein sequence analysis, PredictProtein incorporates several cutting-edge tools for protein structure and function prediction, including: • Algorithms providing prediction of: • Binding/active sites • Sub-cellular localization • Domain boundaries • Fold recognition • Inter-residue contacts Highlights ■ Predict the most and least useful The problem with traditional • Regions lacking regular structure proteomics research is that lab • Secondary structure experiments are expensive and take • Solvent accessibility too long to produce verified results, • Transmembrane, globular and gene product targets using even from a limited range of inputs. In computational tools the race to find answers, is it possible to accelerate the research process ■ Discover new leads through coiled coil regions • Disulfide-bonds • Algorithms providing annotation of: to cover more ground, predict the • Multiple sequence alignments improved interpretation of most—and least—promising targets • PROSITE sequence motifs existing data more accurately, and thus invest time • Low-complexity regions and money more successfully? ■ Decrease expense and improve Deep dive into the protein pool efficiency over traditional Life beyond lab data The algorithm at the core of research methods PredictProtein helps get the most out PredictProtein annotates protein of available lab data by extracting the secondary structure. The latest more reliable predictions from high- additions to the package include: from high-throughput analysis throughput methods and extrapolating data • from high-accuracy results. The localization of a protein and the system requires only protein sequence likelihood of a protein-DNA binding. By ■ Gauge meaningful annotations LOCtree: Predicts the subcellular to push the experiment into new incorporating a hierarchical ontology territory and to analyze data aspects of localization classes modeled onto to a previously unattainable degree of biological processing pathways as accuracy. PredictProtein analyzes the well as using structural information protein’s properties, observing the way from a secondary prediction, LOCtree that it folds and operates—its unique can deduce the protein’s cellular function in the cell. compartment location. LOCtree was 1 IBM Life Sciences division extremely successful at learning residues through whole-sequence evolutionary similarities among sub- computational mutagenesis studies. cellular localization classes and was • • MD: A neural-network based disorder significantly more accurate than other prediction method that finds proteins traditional networks at predicting sub- that are in a constant state of disorder— cellular localization. that is, those that might be antibodies ISIS: A prediction algorithm for or be involved in an antibody annotating regions of proteins that process--and annotates the part of are likely to bind to other proteins. the protein that is in the state of flux. ISIS identifies interacting residues It uses various orthogonal sources of from sequence and determines the information as input, particularly the likelihood for attaching a molecule outputs of different disorder predictors, at a certain amino acid. Although the and other properties that are correlated method is developed using transient with protein disorder including solvent proteins—protein interfaces from accessibility and residue flexibility. MD complexes of experimentally known is more accurate than its constituents 3D structures—it never explicitly uses and other top prediction methods. 3D information. Instead, it combines • predicted structural features with Backed by IBM and Biosof evolutionary information. The In collaboration with Biosof, which strongest predictions of the method markets the PredictProtein software, reached over 90% accuracy in a cross- IBM® provides secure infrastructure validation experiment. combined with cutting-edge hardware SNAP: A prediction method for to run these highly resource-intensive annotating the functional effects of algorithms. © Copyright IBM Corporation 2008 IBM Global Services Route 100 Somers, NY 10589 USA Produced in the United States of America 07-08 All Rights Reserved. IBM, the IBM logo, ibm.com and System x are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at ibm.com/legal/copytrade. shtml Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Other company, product and service names may be trademarks or service marks of others. References in this publication to IBM products and services do not imply that IBM intends to make them available in all countries in which IBM operates. The IBM home page on the Internet can be found at ibm.com non-synonymous SNPs. SNAP needs only sequence information as input, but IBM offers the broadest portfolio benefits from functional and structural of high-performance computing annotations if available. In a cross- solutions optimized for the needs of validation test on more than 80,000 proteomics with its System x™ clusters mutants, SNAP identified 80% of the and Cell Broadband Engine™-based non-neutral (loss and gain of function) architectures. Additionally, IBM Power substitutions at 77% accuracy and 76% technology clustered in large, dense of the neutral (functionally unaltered) SMPs can handle the most challenging substitutions at 80% accuracy. problems facing researchers and SNAP predictions can be used in scientists. Power technology is also annotating functionally important available in compact blade form factors. 2

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Advancing research through protein sequence analysis