Protein Structure Prediction With Evolutionary Algorithms
Natalio Krasnogor, U of the West of England
William Hart, Sandia National Laboratories
Jim Smith, U of the West of England
David Pelta, Universidad de Granada
Presenter: Elena Zheleva

Introduction
Problem Description
Biology Background
– Protein Folding
– HP Protein Folding Model
Genetic Algorithm (GA) Design Factors
– Encodings for Internal Coordinates
– Potential Energy Formulation
– Constraint Management
Methods and Results
Conclusion

Problem Description
Computational Biology open problem: protein structure prediction
Genetic algorithms have been used in the research literature
Authors analyze 3 algorithm parameters that impact performance and behavior of GAs
Goal: make suggestions for future algorithm design

Protein Folding
Proteins: driving force behind all of the biochemical reactions which make biology work
Protein is an amino acid chain!
Amino acid chain -> Structure of a protein
Structure of a protein -> Function of a protein

Protein Folding
Protein Folding: connection between the genome (sequence) and what the proteins actually do (their function).
Currently, there is no reliable computational solution for the protein folding (3D structure) problem.
Chemistry, Physics, Biology, CS

HP Protein Folding Model
Amino acid chains (proteins) are represented as connected beads on a 2D or 3D lattice
HP: hydrophobic – hydrophilic property
Hydrophobic amino acids can form a hydrophobic core with an energy potential

HP Protein Folding Model
Model adds energy value e to each pair of hydrophobics that are adjacent on lattice AND not consecutive in the sequence
Goal of GA: find low energy configurations!
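
The energy rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `hp_energy`, the string-of-"H"/"P" sequence format, and the default per-contact value of -1 for e are assumptions made for the example. Lattice adjacency is taken to mean Manhattan distance 1, which fits a square or cubic lattice.

```python
from itertools import combinations

def hp_energy(sequence, coords, epsilon=-1.0):
    """Sum epsilon over hydrophobic pairs that touch on the lattice
    but are not neighbours along the chain.

    sequence: string of 'H' and 'P' (assumed format).
    coords:   one integer tuple per residue (2D or 3D).
    epsilon:  energy per contact; -1 is an assumed default.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    energy = 0.0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i == 1:
            continue  # consecutive in the sequence: never counted
        if sequence[i] == "H" and sequence[j] == "H":
            # Manhattan distance 1 means adjacent lattice points.
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                energy += epsilon
    return energy

# A 4-residue chain folded into a unit square: residues 0 and 3 touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square))
```

With a negative e, lower energy means more hydrophobic contacts, which is what the GA searches for.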

Encodings for Internal Coordinates
Proteins are represented using internal coordinates (vs. Cartesian)
Absolute vs. Relative encoding
Absolute Encoding: specifies an absolute direction
– cubic lattice: {U,D,L,R,F,B}^{n-1}
Relative Encoding: specifies direction relative to the previous amino acid
– cubic lattice: {U,D,L,R,F}^{n-1}
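
To make the difference between the two encodings concrete, here is a hedged sketch of how each could be decoded into cubic-lattice coordinates. The axis assignment for the absolute moves, the initial heading and up vectors for the relative moves, and the function names are all assumptions chosen for the illustration; the source only gives the two alphabets.

```python
# Absolute moves on the cubic lattice (axis assignment is an assumption).
ABSOLUTE_MOVES = {
    "U": (0, 0, 1), "D": (0, 0, -1),
    "L": (-1, 0, 0), "R": (1, 0, 0),
    "F": (0, 1, 0), "B": (0, -1, 0),
}

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def neg(a):
    return (-a[0], -a[1], -a[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def decode_absolute(moves):
    """Each symbol is a fixed lattice direction."""
    coords = [(0, 0, 0)]
    for m in moves:
        coords.append(add(coords[-1], ABSOLUTE_MOVES[m]))
    return coords

def decode_relative(moves):
    """Each symbol turns a moving frame, then steps along the new heading.
    The starting frame (heading +y, up +z) is an assumed convention."""
    heading, up = (0, 1, 0), (0, 0, 1)
    coords = [(0, 0, 0)]
    for m in moves:
        left = cross(up, heading)
        if m == "L":
            heading = left
        elif m == "R":
            heading = neg(left)
        elif m == "U":
            heading, up = up, neg(heading)
        elif m == "D":
            heading, up = neg(up), heading
        elif m != "F":
            raise ValueError(f"unknown relative move: {m!r}")
        coords.append(add(coords[-1], heading))
    return coords

# Under these conventions both strings trace the same path.
print(decode_relative("FLL") == decode_absolute("FLB"))
```

The relative alphabet has no B symbol because stepping backwards would land on the previous amino acid, so that particular collision cannot be expressed at all.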


Encodings for Internal Coordinates
Encoding impacts global search behavior of GA
Example: One-point Mutations
Relative Encoding: FLLFRRLRLLR -> FLLFRFLRLLR
Absolute Encoding: RULLURURULU -> RULLUULULDL
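
The two string pairs in this example appear to describe the same pair of conformations on a 2D square lattice: changing one relative symbol rotates the rest of the chain, so every absolute symbol from that position onward can change. The sketch below checks this by converting relative strings to absolute ones. The helper name `relative_to_absolute` and the initial heading R are assumptions, chosen because they make the slide's strings line up.

```python
# Turning left or right from each absolute heading on the square lattice.
LEFT_OF = {"R": "U", "U": "L", "L": "D", "D": "R"}
RIGHT_OF = {new: old for old, new in LEFT_OF.items()}

def relative_to_absolute(rel, start="R"):
    """Translate a 2D relative string (F/L/R) into absolute moves (U/D/L/R).
    `start` is the heading before the first move (assumed to be R)."""
    heading = start
    out = []
    for m in rel:
        if m == "L":
            heading = LEFT_OF[heading]
        elif m == "R":
            heading = RIGHT_OF[heading]
        elif m != "F":
            raise ValueError(f"unknown relative move: {m!r}")
        out.append(heading)
    return "".join(out)

before = relative_to_absolute("FLLFRRLRLLR")
after = relative_to_absolute("FLLFRFLRLLR")
print(before, "->", after)
# Count how many absolute symbols differ after the single relative change.
print(sum(a != b for a, b in zip(before, after)))
```

A one-point mutation in the relative encoding therefore acts like a large move in the absolute encoding, which is one way the choice of encoding changes how the GA explores the search space.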

Potential Energy Formulation
Problem: conformations can have the same energy but different potential
(Picture)
Augment energy function to allow a distance-dependent hydrophobic-hydrophobic potential
(Formula)

Constraint Management
Methods for penalizing infeasible conformations
Method 1: Consider only feasible conformations
– Weakness: shortest path from one feasible conformation to another may be very long
Method 2: Fixed Penalty Approach
– Violations (two ways of counting):
    2 amino acids lying on the same lattice point
    Lattice point at which there are 2 or more amino acids
– Penalty per violation = 2*number of hydrophobics + 2 (any infeasible conformation has positive energy)
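
The fixed penalty and the two ways of counting violations can be sketched as follows. The function names and the "H"/"P" string format are assumptions for illustration. The penalty size is presumably chosen to outweigh the best contact energy a chain with that many hydrophobics could reach, which would explain why infeasible conformations end up with positive energy.

```python
from collections import Counter

def violations_by_pair(coords):
    """Count every pair of amino acids that share a lattice point."""
    counts = Counter(coords)
    return sum(c * (c - 1) // 2 for c in counts.values())

def violations_by_site(coords):
    """Count lattice points occupied by 2 or more amino acids."""
    counts = Counter(coords)
    return sum(1 for c in counts.values() if c >= 2)

def fixed_penalty(sequence, coords, count=violations_by_site):
    """Penalty per violation = 2 * (number of hydrophobics) + 2."""
    n_hydrophobic = sequence.count("H")
    return count(coords) * (2 * n_hydrophobic + 2)

# Three residues stacked on one point: 1 violating site, 3 violating pairs.
overlapping = [(0, 0), (1, 0), (0, 0), (0, 0)]
print(fixed_penalty("HPHH", overlapping))
print(fixed_penalty("HPHH", overlapping, count=violations_by_pair))
```

The penalty would be added to the contact energy of the conformation, so a self-avoiding fold always scores better than a colliding one.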

Methods and Results
1-point and 2-point Mutation operators
1-point, 2-point and Uniform Crossover operators
5 polymer sequences (< 50 amino acids)
Each run of GA: 200 generations
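
For readers unfamiliar with these operators, here is a minimal sketch of how they could act on an encoded conformation string. It is not the authors' implementation. In particular, "2-point mutation" is read here as changing two distinct positions, which is an assumption, and all function names are made up for the example.

```python
import random

def one_point_mutation(ind, alphabet, rng):
    """Replace the symbol at one random position with a different one."""
    i = rng.randrange(len(ind))
    choices = [a for a in alphabet if a != ind[i]]
    return ind[:i] + rng.choice(choices) + ind[i + 1:]

def two_point_mutation(ind, alphabet, rng):
    """Replace the symbols at two distinct random positions (assumed meaning)."""
    chars = list(ind)
    for k in rng.sample(range(len(ind)), 2):
        chars[k] = rng.choice([a for a in alphabet if a != chars[k]])
    return "".join(chars)

def one_point_crossover(a, b, rng):
    """Head of parent a, tail of parent b, split at one random cut."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def two_point_crossover(a, b, rng):
    """Parent a with the segment between two random cuts taken from b."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:]

def uniform_crossover(a, b, rng):
    """Pick each position independently from either parent."""
    return "".join(rng.choice(pair) for pair in zip(a, b))

rng = random.Random(0)          # seeded only to make the demo repeatable
parent_a = "FLLFRRLRLLR"        # relative encoding on a 2D lattice
parent_b = "FFFFFFFFFFF"
print(one_point_mutation(parent_a, "FLR", rng))
print(uniform_crossover(parent_a, parent_b, rng))
```

The same operators work on absolute strings by passing the absolute alphabet instead of "FLR".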

Methods and Results
Relative vs. Absolute Encoding
(Diagram: distribution of relative ranks on the 3 lattices)

Methods and Results
Standard vs. Distant Energy
Does the modified energy potential improve the search capabilities of the GA?
No significant difference on test sequences
A guess: there might be a difference on longer sequences

Conclusion
GAs applied to the Protein Structure Prediction problem have 3 important factors to consider
Relative encoding is at least as good as absolute encoding, in some cases much better
Modified energy potential does not improve search capabilities of GA
The proposed constraint/penalty method ensures feasibility of the optimal solution