The Role of Artificial Neural Networks in Phage Research
Mike Arnoult
9/30/2010

What is an Artificial Neural Network?
  A mathematical and computational model
  Motivated by biological neurons
  Trained by using features to learn patterns and commonalities
  Uses the values of its neuron connections to classify an example

Why Apply Artificial Neural Networks to Phage Research?
  The neural network can be trained to recognize features of phage proteins and distinguish between them.
  I have trained ANNs to recognize and classify phage major capsid proteins.

What is a Bacteriophage?
  A virus that infects bacteria
  The most common biological entity on Earth
  Has a major impact on any environment with bacteria
  A type of virus with a distinctive structure, which injects its genome into a host through its tail
  A possible alternative to antibiotics in medicine

How the ANN works:

Why Apply Artificial Neural Networks to Bioinformatics?
  The neural network can be trained to recognize features of proteins and distinguish between them.
  In my research, I will train neural networks to recognize phage major capsid or tail proteins.

What I’ve done so far:
  I’ve collected positive and negative data sets from NCBI.
  Positive data sets included phage major capsid proteins and synonyms:
    Major Shell Protein
    Major Head Protein
    Major Coat Protein
    Major Procapsid Protein
    Major Prohead Protein…
  Negative data sets included phage proteins unrelated to major capsid proteins:
    Packaging proteins
    Spike proteins
    DNA and RNA polymerase
    Assembly proteins
    Contractile sheath proteins

What I’ve done so far:
  I have written and used Perl scripts to filter the training data.
  Any sequences with conspicuously incorrect GenPept annotations were removed from the positive data set.
  All sequences with major capsid protein related annotations were removed from the negative data set.
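
The annotation filtering described on the slide above could look roughly like the sketch below. It is written in Python for illustration only and is not the Perl script used in this work. The record layout (a description string paired with a sequence), the regular expression, and the function names are all assumptions. In particular, treating "does not match any listed synonym" as "conspicuously incorrect" for the positive set is a guess at one possible rule.

```python
import re

# Synonyms for "major capsid protein" taken from the positive data set slide.
# The exact matching rule is an assumption made for this sketch.
MCP_PATTERN = re.compile(
    r"major\s+(capsid|shell|head|coat|procapsid|prohead)\s+protein",
    re.IGNORECASE,
)


def is_mcp_annotation(description):
    """Return True if the annotation text mentions a major capsid protein synonym."""
    return MCP_PATTERN.search(description) is not None


def filter_positive(records):
    """Keep (description, sequence) records whose annotation matches a synonym."""
    return [rec for rec in records if is_mcp_annotation(rec[0])]


def filter_negative(records):
    """Drop (description, sequence) records whose annotation is MCP-related."""
    return [rec for rec in records if not is_mcp_annotation(rec[0])]


# Hypothetical toy records; real input would come from GenPept entries.
records = [
    ("major capsid protein [Enterobacteria phage T4]", "MKT"),
    ("DNA polymerase", "MAA"),
    ("Major Head Protein", "MGG"),
    ("portal protein", "MLL"),
]
positive = filter_positive(records)
negative = filter_negative(records)
```

With these toy records, the two MCP-annotated entries end up in `positive` and the polymerase and portal entries end up in `negative`.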
What I’ve done so far:
  I’ve turned the sequences into percent compositions of amino acids and side-chain groups to train neural networks.
  The positive entries are labeled with a 1 and the negative entries are labeled with a -1.
  Using a Matlab script, a random 20% of the positive data set is set aside and used as a test set against the other 80%.

What I’m doing now:
  To find which criteria are best suited to training the neural network to recognize phage major capsid proteins:
    I am training neural networks using different characteristics of amino acid side chains (polar, nonpolar, aromatic, positive, and negative).
    I am adjusting parameters of the way the Matlab script trains neural networks.

Classification of Known Sequences:
  The values are average percentages of correctly classified sequences over 1000 separately trained neural networks.
  Amino acid and side-chain percent compositions used as features
  92.9233%
  Amino acid percent compositions used as features, no side chains

What I’m going to do soon:
  Test the neural networks using other phage major capsid proteins
    Ramy’s curated phage major capsid proteins
  Eventually verify the neural network predictions in the lab.

THE END
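
Backup sketch: the feature step from the slides (percent compositions of amino acids and side-chain groups, plus a random 20% held out as a test set) might be written as follows. This is a Python illustration, not the Matlab script used in this work. The assignment of amino acids to the five side-chain groups is an assumption, since the slides name the groups but not their members, and the function names and the fixed random seed are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed membership of the five side-chain groups named in the slides.
SIDE_CHAIN_GROUPS = {
    "polar": "STCNQ",
    "nonpolar": "AVLIMPG",
    "aromatic": "FWY",
    "positive": "KRH",
    "negative": "DE",
}


def percent_composition(sequence):
    """Return 20 amino acid percentages followed by 5 side-chain group percentages."""
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    aa_pct = [100.0 * residues.count(aa) / total for aa in AMINO_ACIDS]
    group_pct = [
        100.0 * sum(residues.count(aa) for aa in members) / total
        for members in SIDE_CHAIN_GROUPS.values()
    ]
    return aa_pct + group_pct


def hold_out(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and return (training set, test set)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Labels follow the slides: 1 for positive entries, -1 for negative entries.
features = percent_composition("AAKD")
example = (features, 1)
train, test = hold_out(range(10))
```

Each sequence becomes a 25-value feature vector here. Dropping the last five values would give the amino-acid-only variant compared on the classification slide.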