Amino acid substitution and protein structure

Protein Structure II
(end I3+I4-I6)
Amino Acid Substitution

Let us define “amino acid substitution”:
  A viable mutation that changes a protein so that the amino
  acid that was at some location becomes another amino acid

Are some amino acid substitutions more
likely than others?
  Name a table that describes some details

Do likelihoods depend on the protein, the
location in it, neither, or both?
  Why?
3 Hard-to-Substitute Amino Acids

Cysteine (“cyst-e-een”)
  Stabilizes structures via side chain bonds
  Two cysteines make a cystine

Glycine
  Has tiny side “chain” of a hydrogen atom
  This makes glycine very flexible
  Useful in very tight turns

Proline
  Flips more easily than other amino acids
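
One rough way to check the "hard to substitute" idea is to look at how a substitution matrix scores each amino acid against itself. The sketch below assumes the diagonal of the BLOSUM62 matrix as a proxy for how strongly a residue resists replacement. That choice of matrix and the helper name `rank_by_self_score` are assumptions of this example, not part of the slides.

```python
# Diagonal (self-substitution) scores of the BLOSUM62 matrix.
# Assumption of this sketch: a higher self-score suggests the residue
# tends to be conserved, i.e. it is harder to substitute.
BLOSUM62_SELF = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9,
    "Q": 5, "E": 5, "G": 6, "H": 8, "I": 4,
    "L": 4, "K": 5, "M": 5, "F": 6, "P": 7,
    "S": 4, "T": 5, "W": 11, "Y": 7, "V": 4,
}


def rank_by_self_score(scores):
    """Return one-letter codes sorted by descending self-score, ties alphabetical."""
    return sorted(scores, key=lambda aa: (-scores[aa], aa))


ranking = rank_by_self_score(BLOSUM62_SELF)
for aa in ranking:
    print(aa, BLOSUM62_SELF[aa])
```

Cysteine, proline and glycine all sit in the upper part of this ranking, although tryptophan and histidine score high as well, so the proxy is only suggestive.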
The Proline Example

“Proline [is] common in transmembrane helices”
“They...produce deviations from canonical helical structure”
  S Yohannan et al., “Proline substitutions are not easily accommodated
  in a membrane protein,” J. Mol. Biol. 2004, 341(1):1-6

Proline flips more easily than other amino acids
between cis and trans forms
  This changes the shape and can affect folding
Patterns of conserved AAs can
imply facts about the fold

Consider this quote:


“…the alternation between blocks of
conserved residues and blocks where the
sequences are more variable…can be
interpreted in terms of probable secondary
structure elements alternating with surface
loops.” – Instant Notes Bioinformatics, Westhead et al., 2002
Would this refer to fibrous, globular, or
integral membrane proteins?
A Different Conservation Example

See handout (Westhead et al. Figure 2)
A4, I8, V1, L5 are hydrophobic
D2, E6, K3, S7 are hydrophilic
How many amino acids per turn?
  Average turn in alpha helix = 3.6 amino acids
  This is about 2 turns of an alpha-helix
End view hides the turns
  but #1-8 tell the story
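
The end view can be reproduced numerically. With 3.6 residues per turn, each residue is rotated 100 degrees from the previous one. The sketch below places residues 1-8 from the slide on such a "helical wheel". The sequence string is read off the residue labels above, and the function name `wheel_angles` is made up for this example.

```python
DEGREES_PER_RESIDUE = 100.0  # 360 degrees per turn / 3.6 residues per turn

HYDROPHOBIC = set("AILV")  # the hydrophobic residues named on the slide


def wheel_angles(sequence):
    """Return (label, angle) pairs, with residue 1 placed at 0 degrees."""
    return [
        (f"{aa}{i + 1}", (i * DEGREES_PER_RESIDUE) % 360.0)
        for i, aa in enumerate(sequence)
    ]


helix = "VDKALESI"  # V1 D2 K3 A4 L5 E6 S7 I8
angles = wheel_angles(helix)
for label, angle in angles:
    kind = "hydrophobic" if label[0] in HYDROPHOBIC else "hydrophilic"
    print(f"{label}: {angle:5.1f} degrees ({kind})")
```

The hydrophobic residues land within about 50 degrees of the 0-degree direction, and the hydrophilic ones fall on the opposite side of the wheel.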
Example (continued)

A4, I8, V1, L5 are hydrophobic
D2, E6, K3, S7 are hydrophilic

Many alpha-helices have one side facing the protein interior and one
facing the outer, polar water
  Called an “amphipathic” alpha helix
  Amphipathic: having both hydrophobic and hydrophilic parts

If a sequence shows alternating hydrophobic and
hydrophilic subsequences, that suggests
  …an amphipathic alpha helix
What can you say about the length of the subsequences?
What kinds of AA substitutions will tend/not tend to occur?
Conserving Overall Structure

Sander and Schneider found that
  for typical naturally occurring proteins…
  surprisingly (to me) few identical amino acids
  were needed to conserve structure

Let t(L) be the % of identically aligned amino
acids required to conserve structure

$t(L) = 290.15 \, L^{-0.562}$

  L is the length of the sequence

Does the % go up or down with greater length?
Let’s try a couple of examples
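
A couple of examples can be worked directly from the formula. The sketch below evaluates t(L) for a few alignment lengths. The function name `identity_threshold` and the chosen lengths are illustrative only.

```python
def identity_threshold(length):
    """Percent identity t(L) = 290.15 * L**-0.562 for an alignment of `length` residues."""
    if length <= 0:
        raise ValueError("length must be positive")
    return 290.15 * length ** -0.562


for length in (10, 20, 40, 80):
    print(f"L = {length:3d}: t(L) = {identity_threshold(length):.1f}%")
```

Because the exponent is negative, the required percentage falls as the aligned length grows.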
View Formats for Proteins I:
Wire Frame (or Line)
(I4)


Shows bonds as line segments
Does not show atoms

But you can figure out where atoms are



Where?
Consider alpha-conotoxin (see next slide)
Conotoxins are produced by poisonous snails
called cone shells
Alpha-conotoxin
Cone shell catches, eats fish

Speaking of conotoxins…


Check out a movie of a snail catching and
eating a fish!
(It uses conotoxins as part of the process)


Downloaded from:
http://grimwade.biochem.unimelb.edu.au/cone/fish2.mov
Originally from “Neurex Corp - Science and Publications - Articles” http://grimwade.biochem.unimelb.edu.au/cone/envenom.html
Alpha-conotoxin

This is a type of conotoxin
  There are also
  omega-, mu-, delta-, kappa-conotoxins

Conotoxins: made by cone shells
  A category of carnivorous, often beautiful snails
  “Typically 12-30 amino acid residues in length”
  “…highly constrained peptides due to their high
  density of disulphide bonds.”

http://grimwade.biochem.unimelb.edu.au/cone/vencomp.html
View Formats for Proteins II:
Space filling

Each atom is shown as a sphere


Size of each sphere varies with the size of its atom
Alpha-conotoxin again next…

What is the orientation compared to the wire frame
picture?
Alpha-conotoxin
View Formats for Proteins III:
Ball and Stick

Each atom is shown with a small sphere

Bonds shown with sticks


Alpha-conotoxin again…
What is the orientation compared to the
previous models?
Alpha-conotoxin
View Formats for Proteins IV:
Cartoon

Shows secondary structures

Alpha-conotoxin again…
Alpha-conotoxin
View Formats for Proteins: Cartoon
Compare the orientations!

That can also be called a ribbon diagram

Above is another take on cartoon vs. ribbon…

from http://www.sander.embl-ebi.ac.uk/tops/ExplainDetailed.html
Finding Functional Sites

Proteins have functional sites and “the
other stuff”

Functional sites are where the molecule

“binds”

(interacts with other molecules)
Finding Functional Sites II

Functional sites are where the molecule binds

Does the rest of the protein matter at all?

(why?)

We’d like to find these “active” sites

Some active sites are harder to know about than others

The largest cavity on a protein surface is often the active site

SURFNET is a program that can locate active sites
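
The "largest cavity" idea can be illustrated with a toy calculation. The sketch below is loosely modelled on the gap-sphere approach usually described for SURFNET: fit a sphere into the gap between two atoms, then shrink it until no other atom intrudes. It is not SURFNET's actual code. The function name, the radius cut-offs, and the toy coordinates are all assumptions of this example, and grouping spheres into cavities is left out.

```python
import math
from itertools import combinations


def gap_spheres(atoms, min_radius=1.0, max_radius=4.0):
    """Return (center, radius) gap spheres for atoms given as ((x, y, z), vdw_radius).

    Hypothetical cut-offs: spheres smaller than min_radius are dropped and
    atom pairs whose gap would need a sphere larger than max_radius are skipped.
    """
    spheres = []
    for i, j in combinations(range(len(atoms)), 2):
        (ca, ra), (cb, rb) = atoms[i], atoms[j]
        distance = math.dist(ca, cb)
        gap = distance - ra - rb
        if gap <= 0 or gap / 2 > max_radius:
            continue
        # Centre the sphere midway between the two atom surfaces.
        frac = (ra + gap / 2) / distance
        center = tuple(a + frac * (b - a) for a, b in zip(ca, cb))
        radius = gap / 2
        # Shrink the sphere so that no third atom penetrates it.
        for k, (cc, rc) in enumerate(atoms):
            if k not in (i, j):
                radius = min(radius, math.dist(center, cc) - rc)
        if radius >= min_radius:
            spheres.append((center, radius))
    return spheres


# Toy input: two atoms with an empty gap, then the same pair with a third atom in the gap.
open_gap = [((0.0, 0.0, 0.0), 1.5), ((6.0, 0.0, 0.0), 1.5)]
filled_gap = open_gap + [((3.0, 2.0, 0.0), 1.5)]
print(gap_spheres(open_gap))
print(gap_spheres(filled_gap))
```

In a fuller treatment, overlapping gap spheres would be grouped into clusters and the largest cluster taken as the candidate site.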
Identifying Active Site Function

Suppose SURFNET finds a likely site

Finding an active site is only step 1

Figuring out its function is then step 2
Identifying Active Site Function II

Step 2: Figuring out active site function
  Similar sites tend to have similar functions

Find another protein with
  known function and similar active site

Hypothesize that the active site of interest…
  …has a similar function

Databases and sequence matching are key
Do you think phylograms could help?
Cladograms? Dendrograms?
Structural Alignment
(I5)

Sequence alignment of DNA often
works great

It tells us about relatedness of
  organisms, proteins, genomes…

When organisms are very divergent:
  homologous sequences align little or no better
  than random sequences

What is a way to circumvent that?
Structural Alignment II

Superpose (superimpose) two molecules…
  “…so that peptide backbones of structurally equivalent
  residues lie close together in space.”
  p. 144, Westhead et al.

This might be ambiguous. Does it refer to
  1. backbones made of residues
  …or…
  2. the peptide backbone parts of residues

Then,
  sequence align the structurally aligned segments

This often works because structure tends to be
conserved more than sequence over time
Source:
glinka.bio.neu.edu/SEDB/Examples/Structure_search_FRIEND.JPG
Structural Alignment: Example 2
Structural Alignment: Example 3
a, Structure-based alignment of the hOGG1
sequence with those of E. coli AlkA (refs 27, 30),
E. coli endonuclease III (refs 26, 28) and
E. coli MutY (ref. 29). Secondary structure
assignments are listed above the primary
sequence with α-helices highlighted by
cylinders and β-sheets highlighted as
arrows. The highly conserved HhH–GPD
motif is shown in orange. Residues in
hOGG1 are highlighted as follows: the
catalytic Lys 249 and Asp 268 are boxed;
residues that interact with the oxoG and
estranged cytosine are red and blue,
respectively; residues making DNA
backbone contacts are green. b, The
conserved HhH–GPD motif (orange) in
structurally characterized members of the
HhH–GPD superfamily.
Source:
Structural basis for recognition and repair of the
Steven D. Bruner, Derek P. G. Norman
and Gregory L. Verdine
Nature 403, 859-866 (24 February 2000)
Another Kind of Structural Alignment
Source:
www.herner.hu/daniel/shaolin.html
A Last Example
Measuring Structural Alignment

One approach is RMSD
  RMSD = Root Mean Square Deviation

Superpose the two proteins in 3-D
Identify aligned residues
Measure the distance between each pair of aligned residues
  (distance between their alpha-carbon atoms)
Average the distances
  (Actually, average the squared distances)
Then take the square root of the result

RMSD
= square Root of the Mean of the Squares of the
Deviations

Take a minute to think about this now…
Measuring Structural Alignment

RMSD=square Root of the Mean of the
Squares of the Deviations

“…a small RMSD computed over a large number of
residues (N ) is more significant than a small
RMSD computed over a small number of residues.”
– Westhead et al., p. 145

Why?
$\mathrm{RMSD} = \sqrt{\dfrac{1}{N} \sum_{i=1}^{N} d_i^{2}}$
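
The RMSD recipe can be sketched in a few lines. The example assumes the two structures are already superposed and that the aligned alpha-carbon coordinates are supplied in matching order. The superposition step itself is not shown, and the coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) alpha-carbon positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # d_i squared: squared distance between the i-th aligned pair
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical alpha-carbon positions (in angstroms) for three aligned residues.
ca_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
ca_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 3.0), (7.6, 0.0, 0.0)]
print(rmsd(ca_a, ca_b))
```

Here only the middle pair deviates (by 3 angstroms), so the mean squared deviation is 9/3 = 3 and the RMSD is the square root of 3.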
Structural Alignment Redux


Why are both
  structural alignment, and
  sequence alignment
useful for comparing proteins?
Which is not useful for DNA comparisons?
  Why?
Protein Structure
Classification
(I6)

Protein structure tends to be conserved

Therefore, classification can show relatedness between
  very different organisms
  very different homologs

For example, if “classification” produced trees
  Then branching near the root implies ancient splits
  …and branching near the leaves implies recency
  Just like other kinds of evolutionary dendrograms

Of all the things “classification” could mean, this is
what it means here!
Protein Structure Classification:
CATH and SCOP

CATH and SCOP are systems of
classification for proteins



They use structure
They produce trees (dendrograms)
They work differently, but produce similar
results (why?)
CATH


Main classification levels
Class 1: Mainly-Alpha

Class 2: Mainly-Beta

Class 3: Mixed Alpha-Beta

Class 4: Few Secondary Structures
(slightly modified from
http://www.cathdb.info/cgi-bin/cath/GotoCath.pl?link=
CATH II

CATH has Class and 3 other levels of
classification (totalling 4):
  Class
  Architecture
  Topology
  Homologous superfamily

Each level is a level of a tree
  Root is at top (the 0th level)
  Class is the 1st level, etc. (let’s draw)

Can you guess why the system is called
CATH?
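
The four levels can be sketched as a path from the root of the tree. The example assumes the dotted numeric form in which CATH identifiers are commonly written, one number per level. The identifier `3.40.50.720` is used only to show the format, and the helper names are invented for this sketch.

```python
from typing import NamedTuple

# Class names as listed on the CATH slide above.
CATH_CLASSES = {
    1: "Mainly-Alpha",
    2: "Mainly-Beta",
    3: "Mixed Alpha-Beta",
    4: "Few Secondary Structures",
}
LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


class CathNode(NamedTuple):
    level: str  # name of the tree level
    path: str   # dotted identifier of the node at that level


def cath_path(code):
    """Split a four-part dotted code into the nodes visited from root to leaf."""
    parts = code.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError("expected four dot-separated numbers")
    return [CathNode(LEVELS[i], ".".join(parts[: i + 1])) for i in range(4)]


def class_name(code):
    """Look up the level-1 (Class) name for a dotted code."""
    return CATH_CLASSES[int(code.split(".")[0])]


example = "3.40.50.720"  # illustrative identifier, assumed format
print(class_name(example))
for node in cath_path(example):
    print(f"{node.level}: {node.path}")
```

Each node's identifier is a prefix of its children's identifiers, which is what makes the classification a tree.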
Small Part of the CATH Tree
Source: the CATH Website,
www.cathdb.info
CATH III

Classes are based on secondary structure
  Mainly-alpha
  Mainly-beta
  Alpha-beta (mixed)
  Low secondary structure content
What do alpha & beta refer to?
What is homology?

Homologous proteins
  Are evolutionarily related
  Typically share sequence similarities
  Typically share structural similarities
  Might share no significant similarities
CATH IV

Recall the four top levels of the CATH tree


Class, Architecture, Topology, Homologous superfamily
A node at which level represents a category of
proteins thought to be evolutionarily related?
SCOP

Has a tree, like CATH



Tree is significantly different, however…yet…
Proteins classified as homologs by one tend to
be classified as homologs by the other (why?)
http://scop.mrc-lmb.cam.ac.uk/scop/