Predicting The Beta-Helix Fold From
Protein Sequence Data
Phil Bradley, Lenore Cowen, Matthew Menke,
Jonathan King, Bonnie Berger
MIT
Structural Motif Recognition
Problem: Given a structural motif (secondary, super-secondary, tertiary), predict its presence from sequence data alone.
Example: Coiled-coil prediction (Berger et al. 1995)
GCN4 leucine zipper
Long Distance Correlations
Cyclophilin A
In beta structures, amino acids close in the folded 3D structure may be far away in the linear sequence.
The Right-handed Parallel Beta-Helix
A processive fold composed of repeated supersecondary units.
Each rung consists of three beta-strands separated by turn regions.
Pectate Lyase C (Yoder et al. 1993)
No sequence repeat.
Biological Importance of Beta Helices
Surface proteins in human infectious disease:
• virulence factors
• adhesins
• toxins
• allergens
Proposed as a model for amyloid fibrils
(e.g. Alzheimer’s and CJD)
Virulence factors in plant pathogens
What is Known
Solved beta-helix structures:
12 structures in PDB in 7 different SCOP families
Pectate Lyase:
  Pectate Lyase C
  Pectate Lyase E
  Pectate Lyase
Pectin Lyase:
  Pectin Lyase A
  Pectin Lyase B
Galacturonase:
  Polygalacturonase
  Polygalacturonase II
  Rhamnogalacturonase A
Chondroitinase B
Pectin Methylesterase
P.69 Pertactin
P22 Tailspike
Approaches to Structural Motif Recognition
General Methods:
Sequence similarity searches
Multiple alignments & profile HMMs
Threading
Profile methods (3D & 1D)
-Heffron et al. (1998)
*Statistical Methods
BetaWrap Program
Performance:
• On PDB: no false positives & no false negatives. Recognizes beta helices in PDB across SCOP families in cross-validation.
• Recognizes many new potential beta helices when run on larger sequence databases.
• Runs in linear time (~5 min. on SWISS-PROT).
BetaWrap Program
Histogram of protein scores for:
• beta helices not in database (12 proteins)
• non-beta helices in PDB (1346 proteins)
Single Rung of a Beta Helix
3D Pairwise Correlations
Figure labels: B3, T2, B2, B1
Aligned residues in adjacent beta-strands exhibit strong correlations.
Residues in the T2 turn have special correlations (asparagine ladder, aliphatic stacking).
3D Pairwise Correlations
Figure labels: B3, T2, B2, B1
Stacking residues in adjacent beta-strands exhibit strong correlations.
Residues in the T2 turn have special correlations (asparagine ladder, aliphatic stacking).
Question: how can we find these correlations, which are a variable distance apart in sequence?
Phage P22 Tailspike
Finding Candidate Wraps
• Assume we have the correct locations of a single T2 turn (fixed B2 & B3).
Figure labels: B3, T2, candidate rung, B2
• Generate the 5 best-scoring candidates for the next rung.
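The candidate step can be illustrated with a minimal sketch. This is not the BetaWrap code: the residue classes, the toy pair score, the strand length, and the allowed range of rung lengths (min_gap, max_gap) are all assumptions made for the example. It only shows the idea of scanning possible positions for the next rung and keeping the 5 best.

```python
import heapq

# Assumed toy residue class; the slides do not specify one.
HYDROPHOBIC = set("AVILMFWC")

def toy_pair_score(a, b):
    """Hypothetical stand-in for a pairwise stacking preference."""
    if a == b:
        return 2.0
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 1.0
    return -0.5

def next_rung_candidates(seq, start, strand_len=5, min_gap=18, max_gap=30, keep=5):
    """Return the `keep` best (score, next_start) pairs for the rung after `start`.

    `start` is the position of the fixed strand in the current rung; each
    candidate places the same strand `gap` residues further along the sequence.
    """
    current = seq[start:start + strand_len]
    scored = []
    for gap in range(min_gap, max_gap + 1):
        nxt = start + gap
        if nxt + strand_len > len(seq):
            break  # candidate strand would run past the end of the sequence
        window = seq[nxt:nxt + strand_len]
        score = sum(toy_pair_score(a, b) for a, b in zip(current, window))
        scored.append((score, nxt))
    return heapq.nlargest(keep, scored)

# Usage: a made-up sequence in which the strand recurs 22 residues later.
demo = "VIVIV" + "G" * 17 + "VIVIV" + "G" * 30
candidates = next_rung_candidates(demo, 0)
```

Because the gap is scanned over a range rather than fixed, the same code handles rungs of different lengths, which is the point of the question posed earlier about correlations a variable distance apart.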
Scoring Candidate Wraps (rung-to-rung)
Rung-to-rung alignment score incorporates:
• Beta sheet pairwise alignment preferences taken from amphipathic beta structures in PDB (w/o beta helices).
• Additional stacking bonuses on internal pairs.
• Distribution on turn lengths.
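One way to picture how the three terms combine is as a sum of log-scale contributions. The sketch below is illustrative only: the table values, the default penalties, the choice of which strand positions count as internal, and the function names are all hypothetical, and the slides do not say how the real terms are weighted.

```python
import math

# Hypothetical values; the real preferences are estimated from PDB structures.
PAIR_PREF = {("V", "V"): 1.2, ("I", "V"): 0.9, ("N", "N"): 1.5}
STACK_BONUS = {("N", "N"): 1.0, ("I", "V"): 0.5}
TURN_LEN_PROB = {20: 0.1, 22: 0.4, 24: 0.2}

def lookup(table, a, b, default):
    """Symmetric lookup: (a, b) and (b, a) share one entry."""
    return table.get((a, b), table.get((b, a), default))

def rung_to_rung_score(strand_a, strand_b, turn_len, internal=(0, 2, 4)):
    """Score two aligned strands from adjacent rungs (sketch, not BetaWrap itself)."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must be aligned and of equal length")
    score = 0.0
    for i, (a, b) in enumerate(zip(strand_a, strand_b)):
        score += lookup(PAIR_PREF, a, b, -0.3)        # pairwise alignment preference
        if i in internal:
            score += lookup(STACK_BONUS, a, b, 0.0)   # extra bonus on internal pairs
    score += math.log(TURN_LEN_PROB.get(turn_len, 0.01))  # turn-length term
    return score

likely = rung_to_rung_score("VNV", "VNV", 22, internal=(1,))
unlikely = rung_to_rung_score("VNV", "VNV", 30, internal=(1,))
```

With these made-up tables, the same strand pair scores higher at a common turn length (22) than at a rare one (30), which is the role the turn-length distribution plays in the score.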
Scoring Candidate Wraps (5 rungs)
• Iterate out to 5 rungs generating candidate wraps:
• Score each wrap:
- sum the rung-to-rung scores
- B1 correlations filter
- screen for alpha-helical content
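The iteration can be sketched as repeated extension of partial wraps, summing the rung-to-rung scores along the way. This is a simplified illustration under stated assumptions: the pair score is a toy stand-in, the gap range and the pruning width (beam) are invented for the example, and the B1 correlations filter and the alpha-helical screen are left out entirely.

```python
import heapq

def pair_score(a, b):
    """Hypothetical stand-in for the rung-to-rung residue preference."""
    return 1.0 if a == b else -0.5

def strand_score(seq, i, j, strand_len):
    """Score the strand starting at i against the strand starting at j."""
    return sum(pair_score(a, b)
               for a, b in zip(seq[i:i + strand_len], seq[j:j + strand_len]))

def best_wraps(seq, seed, rungs=5, strand_len=5,
               min_gap=18, max_gap=30, keep=5, beam=50):
    """Extend a seed rung to `rungs` rungs; return (total_score, starts) pairs, best first."""
    wraps = [(0.0, [seed])]
    for _ in range(rungs - 1):
        extended = []
        for total, starts in wraps:
            last = starts[-1]
            options = [
                (strand_score(seq, last, last + gap, strand_len), last + gap)
                for gap in range(min_gap, max_gap + 1)
                if last + gap + strand_len <= len(seq)
            ]
            # Keep only the best few candidates for the next rung.
            for score, nxt in heapq.nlargest(keep, options):
                extended.append((total + score, starts + [nxt]))
        # Assumed pruning step so the number of partial wraps stays bounded.
        wraps = heapq.nlargest(beam, extended)
    return wraps

# Usage: a made-up sequence with a strand repeating every 22 residues.
demo = ("VIVIV" + "G" * 17) * 5 + "G" * 10
wraps = best_wraps(demo, 0)
```

On this artificial sequence the top wrap places a rung every 22 residues. Real beta helices have no such sequence repeat, which is why the actual score relies on structural pair statistics rather than residue identity.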
Key Features of Our Approach
• Structural model
• Statistical score
• Dynamic search
Predicted Beta Helices
Features of the 200 top-scoring proteins in the NCBI’s protein sequence database:
• Many proteins of similar function to the known beta helices; some with similar sequences.
• A significant fraction are characterized as microbial outer membrane or cell-surface proteins.
• Mouse, human, worm and fly sequences significantly underrepresented – only two proteins!
Some Predicted Beta Helices in Human Pathogens
Organism                      Disease
Vibrio cholerae               Cholera
Helicobacter pylori           Ulcers
Plasmodium falciparum         Malaria
Chlamydia trachomatis         Venereal infection
Chlamydophila pneumoniae      Respiratory infection
Listeria monocytogenes        Listeriosis
Trypanosoma brucei            Sleeping sickness
Borrelia burgdorferi          Lyme disease
Leishmania donovani           Leishmaniasis
Bordetella bronchiseptica     Respiratory infection
Trypanosoma cruzi             Chagas disease
Bordetella parapertussis      Whooping cough
Bacillus anthracis            Anthrax
Rickettsia rickettsii         Rocky Mtn. spotted fever
Rickettsia japonica           Oriental spotted fever
Neisseria meningitidis        Meningitis
Legionella pneumophila        Legionnaire’s disease
Predicted Beta Helices
False positives? Also present in the top 200 proteins are members of the LRR and hexapeptide repeat families.
Figure labels: LRR, hexapeptide repeat
Structural Features of Beta-Helices
• B2-T2-B3 region is well-conserved.
• T1 and T3 turns highly variable (from 2 to 63 residues in length).
• Active site is an extended surface, formed by T3, B1, T1.
• Distinctive internal stacking interactions.
Figure: A single rung of Pectate Lyase C