“The instructions for assembling every organism
on the planet--slugs and sequoias, peacocks and
parasites, whales and wasps--are all specified in DNA
sequences that can be translated into digital
information and stored in a computer for analysis. As
a consequence of this revolution, biology in the 21st
century is rapidly becoming an information science...
...hypotheses will arise as often in silico as in vitro.”
Eric Lander, Science 287 (5459), 1777-1782
The Problem
…analysis of native state assembly of proteins.
• Protein function and folding are highly cooperative
processes,
• Amino acids that interact in these processes can be
close, or relatively distant in the 1° structure,
– identifying interacting residues in active sites, or
identifying interacting residues that yield discrete 3°
structure is difficult,
– these interactions are not obvious by scanning primary
sequence.
A Partial Solution
• Mutational analysis,
– clone the gene,
– alter the DNA sequence that
codes for specific residues,
– express the gene,
– check for function or
conformational fidelity.
Labor intensive.
Doesn’t indicate residue interactions.
A Better Solution
Thermodynamic Mutant Cycling Analysis
More later…but briefly…
Double mutation analysis,
– used to determine if two different residues (or
peptide fragments) interact.
Labor intensive.
Presently impossible to accomplish on a large scale.
A Bioinformatic Alternative…
...let Evolution do the dirty work.
Multiple Sequence Alignment (globins)
“Entropy” in a MSA
…the key to this paper.
• Think of amino acids as parts of a system that
follows the rules of thermodynamics,
– if there were no constraints, amino acid
frequency and distribution would tend to
randomness,
– however, natural selection constrains primary
sequence in living systems.
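A minimal sketch of the "entropy" idea, assuming plain Shannon entropy per alignment column (the slides do not give a formula here, so this is an illustration rather than the paper's measure). The function name and the toy columns are hypothetical. An unconstrained column drifts toward high entropy; a column held fixed by selection has entropy near zero.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column given as a string."""
    counts = Counter(column)
    total = len(column)
    # Sum -p * log2(p) over the residues actually observed in the column.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

# Toy columns (hypothetical data, not taken from the PDZ alignment).
conserved = "HHHHHHHH"   # one residue only: fully constrained
mixed = "HHHHYYYY"       # two residues in equal parts
varied = "ACDEFGHI"      # eight different residues: least constrained

for name, col in [("conserved", conserved), ("mixed", mixed), ("varied", varied)]:
    print(name, column_entropy(col))
```

Lower entropy means stronger constraint on that position of the alignment.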
MSA and Entropy
Genomic Sequences
DNA Sequence: Reagent for the 21st Century
PDZ Domains (n = 274)
“Model” Protein Domain Family
• Evolutionarily conserved, especially in tertiary
structure,
– Cα atoms: root mean square deviation = 1.4 angstroms*,
• More diverged in sequence homology,
– averaging 24% AA sequence similarity.
• *Four high resolution crystal structures of
distantly related family members.
Post synaptic density protein (PSD95), Drosophila disc large tumor suppressor (DlgA), and Zonula occludens-1 protein (ZO-1)
Structural Classification of Proteins
domains
PDZ domains are found in diverse
signaling proteins in bacteria, yeasts,
plants, insects and vertebrates. They bind
either the carboxyl-terminal sequences of
proteins or internal peptide sequences
PDZ domains consist of 80 to 90 amino
acids comprising six beta-strands (betaA to
betaF) and two alpha-helices, A and B,
compactly arranged in a globular structure.
Peptide binding of the ligand takes place in
an elongated surface groove as an
antiparallel beta-strand interacts with the
betaB strand and the B helix. The structure
of PDZ domains allows binding to a free
carboxylate group at the end of a peptide
through a carboxylate-binding loop between
the betaA and betaB strands.
Google:
SCOP
Pfam
Don’t Sweat the Formulas!
…English Translation: a measure of conservation can be
made by comparing the frequency of amino acids in the
column of a MSA, to a randomly filled column…
…expressed as a change in free energy.
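The comparison can be sketched in a few lines, assuming relative entropy (Kullback-Leibler divergence) as a simplified stand-in for the paper's ΔGstat formula, which the slides do not reproduce. The function name, the uniform background, and the toy columns are all hypothetical; a real background would come from a large protein database. Gap characters are assumed to be absent from the column.

```python
import math
from collections import Counter

def conservation_score(column, background):
    """Relative entropy (in nats) of a column's residue frequencies
    against background frequencies. Zero means the column looks random."""
    counts = Counter(column)
    total = len(column)
    score = 0.0
    for residue, n in counts.items():
        f = n / total
        score += f * math.log(f / background[residue])
    return score

# Hypothetical uniform background over the 20 amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
uniform_background = {aa: 1 / 20 for aa in AMINO_ACIDS}

print(conservation_score("H" * 10, uniform_background))     # fully conserved column
print(conservation_score(AMINO_ACIDS, uniform_background))  # column that matches background
```

A column that matches the background scores zero, and the score grows as the column departs from it, which is the behaviour the slide describes for a conservation measure.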
Figure 1A
Black: amino acid frequency in a database of 36,498 proteins.
Gray: amino acid frequency in a database of 274 PDZ domains.
PDZ domain
AA 76
• AA 76 is known to be important in
determining ligand specificity,
-S/T-X-V/I-COO⁻ or -F/Y-X-V/A-COO⁻
Antepenultimate AA in the ligand.
Figure 1B,C
PDZ MSA ΔGstat
Highly conserved.
ΔGstat = 3.83 kT*
Poorly conserved.
ΔGstat = 0.1 kT*
Figure 1D
76
99
Figure 1E, F
Coupled Sites?
…English Translation: you perturb the MSA by pulling out the subset of
peptides that have similar (or identical) amino acids in a specific
column…
…if the amino acid in the original column interacts with another part of the
peptide, you might expect to see a change in ΔGstat (ΔΔGstat) in another
column of the new MSA.
Perturbing the MSA
…extract subsets of low-entropy alignments.
Re-calculate ΔGstat in
the new MSA, look
for columns that
had a change in
ΔGstat.
AA 76
– pulled out all of the peptides that had a histidine at AA
76 in the MSA to form a new, smaller MSA,
• Calculated the change in ΔGstat (ΔΔGstat) at all positions.
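The perturbation step can be sketched on a toy alignment. This is an illustration under stated assumptions, not the paper's calculation: conservation is scored with relative entropy against a hypothetical uniform background, and the shift is a plain difference of scores standing in for ΔΔGstat. The function names and the toy sequences are made up for the example, and sites are 0-based.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = {aa: 1 / 20 for aa in AMINO_ACIDS}  # hypothetical uniform background

def conservation(column):
    """Relative entropy (nats) of a column against BACKGROUND."""
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * math.log((n / total) / BACKGROUND[aa])
               for aa, n in counts.items())

def perturbation_shifts(msa, site, residue):
    """Per-column change in conservation after keeping only the
    sequences that carry `residue` at `site` (0-based index)."""
    subset = [seq for seq in msa if seq[site] == residue]
    if not subset:
        raise ValueError("no sequence carries that residue at the chosen site")
    shifts = []
    for col in range(len(msa[0])):
        full = conservation("".join(seq[col] for seq in msa))
        sub = conservation("".join(seq[col] for seq in subset))
        shifts.append(sub - full)
    return shifts

# Toy alignment (hypothetical): column 2 covaries with column 0, column 1 does not.
toy_msa = ["HAK", "HCK", "HAK", "HCK", "YAE", "YCE", "YAE", "YCE"]
shifts = perturbation_shifts(toy_msa, site=0, residue="H")
print(shifts)
```

In the toy data the shift at column 1 is zero because its residues are distributed the same way in the subset, while column 2 shifts because it tracks the perturbed site. Columns that shift are the candidates for coupled sites.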
Figure 2 B
AA 34
AA 63
AA 76
Figure 2C-F
29, 26: other side of ligand binding
33, 34, 80, 84: local
66, 57, 51: unexpected
In Silico, So Far, So What?
Show me the money...
H76Y
Statistical ΔΔGstat
Experimental ΔΔGstat
Fig. 3
FRET
Fluorescence Resonance Energy Transfer
Mutant Cycling Analysis (General)
…with FRET (Fluorescence Resonance Energy Transfer)
476/527 ratio m1
If not equal, then sites are coupled.
476/527 ratio
m1:m2
Please Note: this was a general
presentation, see slide 33 for the
application used in this paper.
Fig. 3
?
Fig. D: What is it, why is it included?
Figure 4
Attempt to map connectivity through the peptide.
Also performed analysis on POZ domain.
Conclusion
With growing sequence data from
evolutionary distant genomes, the mapping
of energetic connectivity for many fold
families should be a realistic goal.
Figure 4
Coupling Coefficient
(Mutant Cycling Analysis)
coupling coefficient = [k (wt:wt) x k (mut:mut)] / [k (wt:mut) x k (mut:wt)]
...if there is no coupling, then the coupling
coefficient would approach unity.
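A small worked example of the ratio, assuming the four k values are measured constants of the same kind for each of the four pairings (the slides do not say which constant k denotes). The function name and the numbers are hypothetical.

```python
import math

def coupling_coefficient(k_wt_wt, k_mut_mut, k_wt_mut, k_mut_wt):
    """Ratio from the slide: (wt:wt x mut:mut) / (wt:mut x mut:wt)."""
    return (k_wt_wt * k_mut_mut) / (k_wt_mut * k_mut_wt)

# Hypothetical values. Each single mutation changes k tenfold.
# If the two sites act independently, the double mutant changes k a hundredfold.
independent = coupling_coefficient(1.0, 100.0, 10.0, 10.0)

# If the double mutant changes k only twentyfold, the effects are not additive.
coupled = coupling_coefficient(1.0, 20.0, 10.0, 10.0)

print(independent, math.log(independent))  # ratio of 1, logarithm of 0: no coupling
print(coupled, math.log(coupled))          # ratio away from 1: the sites are coupled
```

Taking the logarithm turns the ratio into an additive quantity that is zero when the sites are independent, which is why the coupling is usually reported on an energy scale.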