Protein Structure Prediction (10 points total)

Problem I
Definitions
Provide a BRIEF description for the terms listed below:
Smith-Waterman
BLOSUM62
UPGMA
Parsimony
ddA (di-deoxyA)
TBLASTX
Shannon entropy
Pseudoknot
Pseudocount
rotamer library
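As an illustration of two of the terms above, the following is a minimal sketch of the Shannon entropy of a single alignment column, with an optional pseudocount added to every symbol. The function name, the default DNA alphabet, and the choice to ignore symbols outside the alphabet are assumptions made for this example only.

```python
import math
from collections import Counter


def column_entropy(column, alphabet="ACGT", pseudocount=0.0):
    """Shannon entropy (in bits) of one alignment column.

    `pseudocount` is added to the observed count of every symbol in
    `alphabet`; symbols outside `alphabet` (e.g. gaps) are ignored.
    Both conventions are assumptions of this sketch.
    """
    counts = Counter(ch for ch in column if ch in alphabet)
    total = sum(counts.values()) + pseudocount * len(alphabet)
    if total == 0:
        raise ValueError("column has no countable symbols")
    entropy = 0.0
    for symbol in alphabet:
        p = (counts.get(symbol, 0) + pseudocount) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    print(column_entropy("AAAA"))                   # fully conserved column
    print(column_entropy("ACGT"))                   # maximally variable column
    print(column_entropy("AAAA", pseudocount=1.0))  # conserved, with pseudocounts
```

A fully conserved column has zero entropy, and a column with all four bases equally represented has the maximum of 2 bits. Adding a pseudocount moves a conserved column away from zero, because unseen symbols are no longer assigned zero probability.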
Problem II
Secondary structure analysis
In secondary structure analysis, both the Chou-Fasman
algorithm and the Garnier-Osguthorpe-Robson (GOR) methods
are inherently statistical in nature. However, the
Chou-Fasman method is sometimes described as having some
“physical principles” contained within it, while the GOR is
sometimes described as an “information theory”-style
approach.
a. Describe the basis for the Chou-Fasman method and
explain why some describe it as having some “physical
principles” within it.
b. Describe the basis for the GOR secondary structure
predictions and explain why some refer to it as an
information-theory style approach.
c. JPRED is a consensus-based approach to secondary
structure predictions. Explain what this “consensus-based”
term means and explain why this approach gives the highest
overall accuracy in predicting secondary structure of
proteins.
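One simple reading of "consensus" is a per-residue majority vote over the outputs of several predictors. The sketch below shows only that idea. It is not JPRED's actual algorithm, and the function name, the H/E/C state alphabet, and the rule that ties fall back to coil ("C") are all assumptions made for illustration.

```python
from collections import Counter


def consensus_prediction(predictions, tie_state="C"):
    """Per-residue majority vote over equal-length secondary-structure strings.

    Each string uses one letter per residue (assumed here: H = helix,
    E = strand, C = coil). Ties are resolved to `tie_state`, which is an
    arbitrary convention chosen for this sketch.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for states in zip(*predictions):
        counts = Counter(states)
        best = max(counts.values())
        winners = [s for s, n in counts.items() if n == best]
        consensus.append(winners[0] if len(winners) == 1 else tie_state)
    return "".join(consensus)


if __name__ == "__main__":
    # Three hypothetical predictor outputs for a five-residue peptide.
    print(consensus_prediction(["HHHEC", "HHEEC", "HCEEC"]))
```

Here the three made-up predictions disagree at positions 2 and 3, and the vote keeps whichever state two of the three methods agree on.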
Problem III
Protein Structure Prediction
Here is information about target T0140 from CASP5 (the same
as that given to assessment participants).
CASP5 Target T0140
1. Protein Name: 1b11
2. Organism Name: Synthetic protein
3. Number of amino acids (approx): 103
4. Accession number:
5. Sequence Database:
6. Amino acid sequence:
MRGSHHHHHHGSRLQSGKMTGIVKWFNADKGFGFITPDDGSKDVFVHFSAGSSGAAVRGN
PQQGDRVEGKIKSITDFGIFIGLDGGIDGLVHLSDISWAQAEA
7. Additional Information
1b11 is a synthetic protein constructed by non-homologous
recombination. The N-terminal part derives from cold
shock protein A (CspA), while the C-terminal segment
comes from the E. coli 30S ribosomal subunit protein S1.
(Riechmann L, Winter G. Novel folded protein domains
generated by combinatorial shuffling of polypeptide
segments. Proc Natl Acad Sci U S A. 2000 Aug
29;97(18):10068-73.)
8. Crystallization conditions: include MES pH5.6
The protein is a tetramer under native conditions, but after
denaturation, elutes at approximately the molecular
weight of dimer on gel filtration.
9. X-ray structure: yes
10. Current state of the experimental work: Completed
11. Interpretable map?: yes
12. Estimated date of chain tracing completion: June
13. Estimated date of public release of structure:
September
14. Name: unavailable until after public release of
structure
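Before any prediction work, the sequence in item 6 can be checked against the approximate length stated in item 3. The short sketch below joins the two sequence lines, reports the length, and finds the longest run of histidines, which locates the N-terminal His tag. The variable and function names are chosen for this example only.

```python
# Sequence of CASP5 target T0140, copied from item 6 above (two lines).
SEQUENCE_LINES = [
    "MRGSHHHHHHGSRLQSGKMTGIVKWFNADKGFGFITPDDGSKDVFVHFSAGSSGAAVRGN",
    "PQQGDRVEGKIKSITDFGIFIGLDGGIDGLVHLSDISWAQAEA",
]
SEQUENCE = "".join(SEQUENCE_LINES)


def longest_run(seq, residue):
    """Return (start_index, length) of the longest run of `residue` in `seq`.

    The start index is 0-based; (-1, 0) is returned if the residue is absent.
    """
    best_start, best_len = -1, 0
    run_start, run_len = 0, 0
    for i, ch in enumerate(seq):
        if ch == residue:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


if __name__ == "__main__":
    print("length:", len(SEQUENCE))
    start, length = longest_run(SEQUENCE, "H")
    print("longest His run starts at residue", start + 1, "with length", length)
```

Residues contributed by an affinity tag come from neither parent protein, so it is worth identifying them before searching for templates.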
Here is the abstract from the Riechmann & Winter article
describing how target T0140 was constructed:
It has been proposed that the architecture of protein
domains has evolved by the combinatorial assembly and/or
exchange of smaller polypeptide segments. To investigate
this proposal, we fused DNA encoding the N-terminal half
of a beta-barrel domain (from cold shock protein CspA)
with fragmented genomic Escherichia coli DNA and cloned
the repertoire of chimeric polypeptides for display on
filamentous bacteriophage. Phage displaying folded
polypeptides were selected by proteolysis; in most cases
the protease-resistant chimeric polypeptides comprised
genomic segments in their natural reading frames.
Although the genomic segments appeared to have no
sequence homologies with CspA, one of the originating
proteins had the same fold as CspA, but another had a
different fold. Four of the chimeric proteins were
expressed as soluble polypeptides; they formed monomers
and exhibited cooperative unfolding. Indeed, one of the
chimeric proteins contained a set of very slowly
exchanging amides and proved more stable than CspA
itself. These results indicate that native-like proteins
can be generated directly by combinatorial segment
assembly from nonhomologous proteins, with implications
for theories of the evolution of new protein folds, as
well as providing a means of creating novel domains and
architectures in vitro.
a. Describe the three general strategies used for structure
prediction in CASP and when each is appropriate.
b. Outline how you would go about predicting the structure of
target T0140, justifying your choice of methods. Describe
at least 5 sequential steps you would take.
c. Describe two specific challenges where improvements are
needed in protein structure prediction today.
Problem IV
Protein Structure Modeling Approaches
For many approaches to protein structure analysis there are
two key parts to the problem: (1) how to search the
relevant sequence or structure space, and (2) how to
evaluate, or score, which sequence or structure is best.
a. Give examples of three distinct problems that we studied
in the protein structure part of the class. For each,
describe a search algorithm and a scoring function that can
be used in combination to address it.
b. Search methods sometimes constrain the energy functions
that can be used, and vice versa. Give an example of a
scoring function and a search method that can NOT be used
together, and describe why they are incompatible.
Problem V
Modeling of simple reactions
Consider the following reaction
A ⇌ A*
in which the forward rate constant is k1 and the reverse
rate constant is k-1.
a. On the rate balance plot below, draw lines indicating
the forward reaction and the reverse reaction (and label
them accordingly). Indicate the steady state points on the
graph. Is this system bistable or monostable?
[Rate-balance plot: y-axis, Rate; x-axis, [A*]/([A]+[A*])]
b. Consider simple linear feedback, that is, A* feeds back
and catalyzes the conversion of A to A*. On the
rate-balance plot below, draw a curve representing this feedback.
[Rate-balance plot: y-axis, Rate]
c. Now draw a rate balance plot that includes results from
both simple linear feedback and the reverse reaction. For
simplicity, assume the forward reaction without feedback is
negligible. Indicate the two equilibrium states. Are they
both stable – that is, can simple linear positive feedback
as illustrated here generate a bistable system? Explain why
in a sentence or two.
[Rate-balance plot: y-axis, Rate; x-axis, [A*]/([A]+[A*])]
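The curves on a rate-balance plot can also be written down numerically. The sketch below assumes mass-action kinetics with the total amount of A normalized to 1, so that x = [A*]/([A]+[A*]). Under that assumption the reverse rate is k_rev * x and the linear-feedback forward rate is k_fb * x * (1 - x). The names k_fb and k_rev, the example values, and the sign-based stability test are assumptions of this example, not part of the problem statement.

```python
def reverse_rate(x, k_rev):
    """Rate of A* -> A as a function of x = [A*]/([A]+[A*])."""
    return k_rev * x


def feedback_rate(x, k_fb):
    """Rate of A -> A* when A* catalyzes the conversion (linear feedback).

    Assumed mass-action form: proportional to both [A*] (x) and [A] (1 - x).
    """
    return k_fb * x * (1.0 - x)


def steady_states(k_fb, k_rev):
    """Values of x in [0, 1] where the feedback and reverse rates are equal."""
    states = [0.0]
    if k_fb > k_rev:
        states.append(1.0 - k_rev / k_fb)
    return states


def is_stable(x, k_fb, k_rev, eps=1e-6):
    """A steady state is stable if the net rate pushes x back from both sides."""
    def net(y):
        return feedback_rate(y, k_fb) - reverse_rate(y, k_rev)

    pushed_down_from_above = x + eps > 1.0 or net(x + eps) < 0
    pushed_up_from_below = x - eps < 0.0 or net(x - eps) > 0
    return pushed_down_from_above and pushed_up_from_below


if __name__ == "__main__":
    k_fb, k_rev = 2.0, 1.0  # illustrative values only
    for x in steady_states(k_fb, k_rev):
        print(f"x = {x:.3f}, stable = {is_stable(x, k_fb, k_rev)}")
```

Evaluating the sign of the net rate on either side of each crossing is the numerical counterpart of reading the plot: where the forward curve lies above the reverse line, x increases, and where it lies below, x decreases.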