Introduction to Bioinformatics Tutorial no. 7
Predicting protein structure
PSI-BLAST
PHDsec and PSIpred

PHDsec
Rost & Sander, 1993
Based on sequence family alignments

PSIpred
Jones, 1999
Based on PSI-BLAST profiles

Both consider long-range interactions
PSIpred Input
Input sequence
Type of Analysis
PSIpred Input (2)
Filtering Options
Email address
GO!
PSIpred Output
Conf: Confidence level (0=low, 9=high)
Pred: Predicted secondary structure (H=helix, E=strand, C=coil)
AA: Target sequence

Conf: 988766667637889999877999871289878877049963202468899999997887
Pred: CCCCCCCCCCHHHHHHHHHHHHHHHHHCCCCCCHHHCCCCCHHHCHHHHHHHHHHHHHHH
  AA: MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVDSADNLSEKLEREWDRE
              10        20        30        40        50        60

Conf: 742888731467888768899999999999999987557888998875227887303678
Pred: HHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHH
  AA: LASKKNPKLINALRRCFFWRFMFYGIFLYLGEVTKAVQPLLLGRIIASYDPDNKEERSIA
              70        80        90       100       110       120
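The three output rows can also be read programmatically. The following is a minimal sketch, not part of PSIpred itself: it assumes the output has been saved as plain text with `Conf:`, `Pred:` and `AA:` row labels as shown above, and the function names and the confidence cutoff of 5 are illustrative choices.

```python
from collections import Counter

# The two 60-residue blocks from the example output above.
PSIPRED_TEXT = """\
Conf: 988766667637889999877999871289878877049963202468899999997887
Pred: CCCCCCCCCCHHHHHHHHHHHHHHHHHCCCCCCHHHCCCCCHHHCHHHHHHHHHHHHHHH
  AA: MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVDSADNLSEKLEREWDRE

Conf: 742888731467888768899999999999999987557888998875227887303678
Pred: HHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHH
  AA: LASKKNPKLINALRRCFFWRFMFYGIFLYLGEVTKAVQPLLLGRIIASYDPDNKEERSIA
"""


def parse_psipred(text):
    """Join the Conf/Pred/AA rows of every block into three full-length strings."""
    rows = {"Conf": [], "Pred": [], "AA": []}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in rows:
            rows[label].append(value.strip())
    conf, pred, seq = ("".join(rows[key]) for key in ("Conf", "Pred", "AA"))
    if not (len(conf) == len(pred) == len(seq)):
        raise ValueError("Conf, Pred and AA rows differ in length")
    return conf, pred, seq


def state_fractions(pred):
    """Fraction of residues predicted as helix (H), strand (E) and coil (C)."""
    counts = Counter(pred)
    return {state: counts.get(state, 0) / len(pred) for state in "HEC"}


def mask_low_confidence(conf, pred, min_conf=5):
    """Replace predictions whose confidence digit is below min_conf with '-'."""
    return "".join(p if int(c) >= min_conf else "-" for c, p in zip(conf, pred))


conf, pred, seq = parse_psipred(PSIPRED_TEXT)
fractions = state_fractions(pred)
masked = mask_low_confidence(conf, pred)
print(len(seq), fractions)
print(masked[:60])
```

Masking by confidence gives a rough PSIpred counterpart to the "structure with high confidence" row that PHDsec reports.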
PHDsec Input (1)
Email address
Type of prediction
Additional output
Output format
Reduce processing
PHDsec Input (2)
Type (number) of input sequences
Upload file
Enter sequence
Wait for results?
PHDsec Output (1)
Protein classification
Structure proportions
Amino acid proportions
PHDsec Output (2)
Estimated structure
Confidence level
Structure with high confidence
PSI-BLAST
Position-Specific Iterative BLAST
Extension to BLASTP
Finds more distantly related sequences
Distant sequences have insignificant E values in an ordinary BLASTP search
Even in distantly related sequences, important domains can be highly conserved
PSI-BLAST gives more weight to those conserved positions
PSI-BLAST Profile
When close sequences are aligned, areas of conservation become apparent.
Scoring matrix becomes position specific
Each column has a unique set of a.a. frequencies.
Score is column specific, based on a.a. frequency.
More frequent a.a. -> higher score.
A new sequence is scored based on the new scoring matrix.
123456
AMTYQR
CTTYQS
SMTYQA
Position-Specific Scoring Matrix
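The idea of column-specific scores can be illustrated with the three aligned sequences above. This is a simplified sketch, not the actual PSI-BLAST computation: it assumes a uniform background frequency of 1/20 for every amino acid, a flat pseudocount of 1, no sequence weighting and no gaps. The function names are illustrative.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALIGNMENT = ["AMTYQR", "CTTYQS", "SMTYQA"]  # the three aligned sequences above


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    Assumptions (not from the slides): uniform background of 1/20 and a
    flat pseudocount so that unseen residues still get a finite score.
    """
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def score_sequence(pssm, sequence):
    """Sum the column-specific scores of an ungapped, equal-length sequence."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence length must equal the number of columns")
    return sum(column[aa] for column, aa in zip(pssm, sequence))


pssm = build_pssm(ALIGNMENT)
for candidate in ("AMTYQR", "CMTYQA", "WWWWWW"):
    print(candidate, round(score_sequence(pssm, candidate), 2))
```

Column 3 (T in all three sequences) gives T its highest score, while column 1 (A, C, S) spreads a lower score over three residues: more frequent a.a., higher score.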
A PSI-BLAST Iteration
Collect all database sequence segments that have been aligned with the query sequence with an E-value below the set threshold (default 0.01)
Construct a position-specific scoring matrix for the collected sequences. Rough idea:
Align all sequences to the query sequence as the template.
Assign weights to the sequences.
Construct the position-specific scoring matrix.
Find sequences that match the profile
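The collect, rebuild, search-again cycle can be sketched on a toy database. This is an illustration only, with several stated simplifications: a raw score cutoff (`min_score`, a hypothetical parameter) stands in for the E-value threshold, all segments are ungapped and the same length as the query, sequences are not weighted, and the profile uses a uniform background with a flat pseudocount.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(sequences, pseudocount=1.0):
    """Column-wise log-odds scores (uniform background, flat pseudocount)."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(sequences) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for column in zip(*sequences):
        counts = Counter(column)
        profile.append({
            aa: math.log2((counts.get(aa, 0) + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, sequence):
    """Score an ungapped segment against the profile."""
    return sum(column[aa] for column, aa in zip(profile, sequence))


def iterate_search(query, database, min_score, max_iterations=5):
    """Repeat: build profile from query + included hits, then search again.

    Stops when an iteration returns the same set of hits as the previous one.
    Returns the included hits and the number of iterations performed.
    """
    included = set()
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        profile = build_profile([query] + sorted(included))
        hits = {seq for seq in database if score(profile, seq) >= min_score}
        if hits == included:
            break
        included = hits
    return included, iterations


QUERY = "AMTYQR"
DATABASE = ["SMTYQA", "CTTYQS", "WWWWWW"]  # toy segments, not real data

first_round, _ = iterate_search(QUERY, DATABASE, min_score=3.0, max_iterations=1)
final_hits, n_iterations = iterate_search(QUERY, DATABASE, min_score=3.0)
print(sorted(first_round))
print(sorted(final_hits), n_iterations)
```

In this toy run the more distant segment CTTYQS falls below the cutoff when scored against the query alone and is picked up only once SMTYQA has been added to the profile, which is the behaviour the iteration is meant to produce.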
Using PSI-BLAST (1)
Available from main BLAST page
Or switch on in BLASTP
E value threshold for initial inclusion in multiple alignment for profile
Using PSI-BLAST (2)
Align selected sequences, generate profile, search again
Number of results to show next iteration
New result
Select whether to include in next iteration