Amino Acids and Their Properties
Recap: ss-rRNA and mutations


Ribosomal RNA (rRNA) evolves very slowly
– Much slower than proteins
– ss-rRNA is typically used
So by aligning ss-rRNA of one organism with that
of another
– We can estimate relatedness
Amino Acid Substitutions

Recall we can align DNA & RNA sequences…


What does that mean?
We can also align two amino acid sequences

Can 2 nucleotides partially match?

Can 2 amino acids partially match?
Amino Acid Substitutions

Aligning sequences


Can 2 nucleotides partially match?
• Are some nucleotide mutations more
significant than others?
Can 2 amino acids partially match?
• Are some amino acid mismatches more
significant than others?
Amino Acid Substitutions


Can 2 nucleotides partially match?
• Significance of a nucleobase mutation
– Does name matter?
– Does location matter?
Can 2 amino acids partially match?
• Significance of an amino acid mutation
– Name? Location?
Sequence matching
and evolution rate

Proteins tend to evolve more slowly than DNA

Many DNA changes have no effect on a protein
– A changed codon may map to the same amino acid
– Non-coding DNA changes may have no effect
What does this mean for gauging the relatedness of
– humans and chimpanzees?
– humans and fish?
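To make the "changed codon, same amino acid" point concrete, here is a minimal sketch. The table holds only a handful of codons from the standard genetic code, and the name `is_silent` is a hypothetical helper chosen for this illustration.

```python
# Partial standard genetic code (DNA codons); only a few entries are listed.
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "GAT": "D", "GAC": "D",                          # aspartic acid
    "GAA": "E", "GAG": "E",                          # glutamic acid
    "ATG": "M",                                      # methionine
    "TGG": "W",                                      # tryptophan
}

def is_silent(codon_before, codon_after):
    """Return True if two different codons encode the same amino acid."""
    if codon_before == codon_after:
        return False  # nothing changed, so there is no mutation to classify
    return CODON_TABLE[codon_before] == CODON_TABLE[codon_after]

print(is_silent("GCT", "GCC"))  # True: both codons encode alanine
print(is_silent("GAT", "GAA"))  # False: aspartic acid becomes glutamic acid
```

A DNA alignment would count both changes as mismatches, while a protein alignment would see only the second one.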
Sequence matching
and evolution rate


Ribosomal RNA (rRNA) evolves very slowly
– Much slower than proteins
What might rRNA matching be good for
measuring the relatedness of?



humans and chimpanzees?
humans and fish?
humans and what?
Sequence matching
and evolution rate



Ribosomal RNA (rRNA) evolves very slowly
– Much slower than proteins
– ss-rRNA is typically used
• (what's that?)
However, different regions of ss-rRNA mutate at
different rates
(Ribosome images next)
The Ribosome
Source: www.buzzle.com/articles/ribosomesfunction.html
Ribosomes: diagrams and images


...check images.google.com for:
– Ribosome diagram
– Ribosome structure
Videos include http://www.youtube.com/watch?v=ID7tDAr39Ow
Relatedness and Mutations



Much DNA mutates relatively quickly
Much ss-rRNA mutates relatively slowly
Much protein mutates at intermediate rates
– Let's focus on protein mutation next
Amino acid substitutions

Some amino acid substitutions are
more likely than others

Why?
• Some are closer to others in terms of
nucleobase codons
• Some are closer in terms of resulting
protein function
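One way to picture "closer in terms of nucleobase codons" is to count the fewest single-nucleotide changes needed to get from any codon of one amino acid to any codon of another. The sketch below assumes a small excerpt of the standard genetic code. The function names are hypothetical, and this is only one possible measure of closeness.

```python
from itertools import product

# Partial standard genetic code (DNA codons), grouped by amino acid.
CODONS = {
    "D": ["GAT", "GAC"],         # aspartic acid
    "E": ["GAA", "GAG"],         # glutamic acid
    "M": ["ATG"],                # methionine
    "I": ["ATT", "ATC", "ATA"],  # isoleucine
    "W": ["TGG"],                # tryptophan
}

def codon_distance(codon_a, codon_b):
    """Count the positions at which two codons differ (Hamming distance)."""
    return sum(x != y for x, y in zip(codon_a, codon_b))

def min_nucleotide_changes(aa_from, aa_to):
    """Fewest single-nucleotide changes that turn aa_from into aa_to."""
    return min(codon_distance(a, b)
               for a, b in product(CODONS[aa_from], CODONS[aa_to]))

print(min_nucleotide_changes("D", "E"))  # 1
print(min_nucleotide_changes("M", "I"))  # 1
print(min_nucleotide_changes("W", "D"))  # 3
```

Pairs that are one nucleotide apart can be exchanged by a single point mutation, so such substitutions have more opportunities to occur.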
Amino acid substitutions II

Substituting similar ones is likely to
– Retain the protein structure and function
Substituting dissimilar ones is likely to
– Change the protein structure and function
Similarity of amino acids means what?
Amino acid substitutions III

Similarity of amino acids means
– similar physicochemical properties
Physicochemical:
– Concerning the physical and chemical
– Concerning physical chemistry
Physical chemistry:
– Connecting macroscopic properties of substances with their molecular properties
Amino acid physicochemical properties

Nonpolar (hydrophobic): ACFGILMPVW
Polar (hydrophilic): NQSTY
Aromatic: FHWY
– (having to do with 6-carbon rings)
Basic: HKR
Acidic: DE
(See http://www.bio.davidson.edu/courses/genomics/jmol/aatable.html)
By way of contrast, can anyone think of a non-physicochemical property of some amino acids?
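The property groups listed above can be written down as sets, which makes it easy to ask what two amino acids have in common. This is a minimal sketch: the group memberships are copied from the slide, and the function names are hypothetical.

```python
# Property groups exactly as listed on the slide (one-letter codes).
PROPERTIES = {
    "nonpolar": set("ACFGILMPVW"),
    "polar": set("NQSTY"),
    "aromatic": set("FHWY"),
    "basic": set("HKR"),
    "acidic": set("DE"),
}

def properties_of(aa):
    """Return the set of property names whose group contains amino acid aa."""
    return {name for name, members in PROPERTIES.items() if aa in members}

def shared_properties(aa1, aa2):
    """Return the property names that both amino acids belong to."""
    return properties_of(aa1) & properties_of(aa2)

print(sorted(shared_properties("F", "W")))  # ['aromatic', 'nonpolar']
print(sorted(shared_properties("D", "K")))  # []
```

Under this rough view, swapping F for W keeps two listed properties, while swapping D for K keeps none of them.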

Aromatic



Special type of ring-shaped molecule
Characterized by an unusual stabilizing
property
Aliphatic

Non-aromatic
Amino acid abbrevs.






G=glycine, P=proline, T=threonine,
A=alanine, …, but why the following??
F=phenylalanine
Y=tyrosine
N=asparagine
Q=glutamine
W=tryptophan
Scoring protein sequence alignments

Simple way:

Two matching (identical) amino acids score 1
Two mismatching (non-identical) ones score 0
Goal: maximize % of matching amino acids

Works well for very similar sequences



Example:



CADQH
CADPM
Alignment score=___
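The simple match/mismatch scheme can be sketched in a few lines. The function names are hypothetical, and the sketch assumes the two sequences are already aligned with no gaps.

```python
def identity_score(seq_a, seq_b):
    """Score 1 per identical aligned pair, 0 per mismatch (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)

def percent_identity(seq_a, seq_b):
    """Express the identity score as a percentage of the aligned length."""
    return 100.0 * identity_score(seq_a, seq_b) / len(seq_a)

print(identity_score("CADQH", "CADPM"))    # 3
print(percent_identity("CADQH", "CADPM"))  # 60.0
```

Note that Q/P and H/M both score 0 here, regardless of how similar or dissimilar those residues are.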
Scoring protein sequence
alignments II

Simple way ignores degree of similarity



better to account for degree of similarity!
Solution: substitution matrices
PAM (Accepted Point Mutation, but “PAM” is easier to say than “APM”) matrix
– Developed in the 1970s by Margaret Dayhoff
PAM1 matrix:
– answers the question, “if 1% of the amino acids in a sequence change, at what rates would each amino acid be substituted for each other one?”
– PAM2 matrix:
• Not 2%!
• Rather, 1%, twice
• What is the difference?
– PAM250 matrix:
• Not 250%, obviously
• Why “obviously”?
• It is 1%, repeated 250 times!
– The BLOSUM matrix is also a popular type
Scoring protein sequences: PAM250

Here is PAM250
• source: http://bioinfo.cnio.es/docus/courses/SEK2003Filogenias/seq_analysis/PAM250matrix.gif

CADQH
CADPM
Alignment score=?
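Scoring with a substitution matrix means looking up a score for every aligned pair and adding the scores together. The sketch below shows only the mechanics. The numbers in `TOY_SCORES` are hypothetical placeholders, not the actual PAM250 entries, and should be replaced with values read from the real table.

```python
# Hypothetical placeholder scores, NOT the real PAM250 values.
# Keys are unordered residue pairs; fill in the numbers from the actual table.
TOY_SCORES = {
    frozenset("C"): 9,
    frozenset("A"): 2,
    frozenset("D"): 4,
    frozenset("QP"): 0,
    frozenset("HM"): -2,
}

def matrix_score(seq_a, seq_b, scores):
    """Sum the substitution score of every aligned residue pair (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(scores[frozenset((a, b))] for a, b in zip(seq_a, seq_b))

print(matrix_score("CADQH", "CADPM", TOY_SCORES))  # 13 with these toy values
```

Using unordered pairs as keys reflects the fact that the scoring matrix is symmetric: Q aligned with P scores the same as P aligned with Q.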
Scoring protein sequences: BLOSUM62 (default in BLAST 2.0)

Source: http://bioinfo.cnio.es/docus/courses/SEK2003Filogenias/seq_analysis/pairwise.html
Why do self “substitutions” have the highest numbers?
Why use PAM, BLOSUM, etc.?


Sequence similarity is related to evolutionary distance
Simple base matching (match/not) may work OK for closely related organisms
– humans and chimps, for example
Amino acid matching works better as evolutionary distance increases (why?)
We’d like to be able to assess relatedness of organisms that diverged long ago
– humans and worms, for example
Relatedness Long Ago



See images.google.com for
– domains of life
We still are not sure, but the 3-domain system seems likely
But cladistics demands binary splits, so
– 3 domains require 2 splits, and
– 2 of the domains are more closely related to each other than to the 3rd
Why use PAM, BLOSUM…(II)

Organisms that diverged long ago have
– divergent analogous amino acid sequences
Since different amino acid substitutions occur at different frequencies…
– …we can measure relatedness back farther
– …e.g. when the fraction of identical amino acids is surprisingly low
– …and the fraction of identical base pairs is even lower
Comparing Sequences with PAMs (+ recap)
What does “PAM” mean?

PAM is considered an acronym for
– Point Accepted Mutation
– Accepted Point Mutation (original)
– Percent Accepted Mutations
A point mutation is a substitution of 1 amino acid for another
An accepted mutation is one that is passed down through the generations
Will a mutation be accepted if it is helpful? Harmful? Neutral? Helpful in some circumstances, harmful in others?
What Does PAM Mean, cont.

PAM has two meanings
– PAM is a unit of evolutionary time
– PAM is a kind of substitution matrix
– (The meanings are related)

PAM as a Unit of Time

A PAM is the amount of evolutionary change resulting in:
– 1 amino acid mutation per 100 amino acids
It is an average over >>100 amino acids
– …because mutations have randomness
After 1 PAM, will an organism have exactly 1% of its amino acids different from what they started out as?
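A small simulation can illustrate why 1 PAM is an average rather than an exact count. This sketch makes a simplifying assumption that is not from the slides: every residue mutates independently with the same probability of 0.01. The function name and the sequence sizes are hypothetical.

```python
import random

def count_mutated(n_residues, rate=0.01, rng=random):
    """Count residues hit when each mutates independently with probability rate."""
    return sum(1 for _ in range(n_residues) if rng.random() < rate)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Ten hypothetical 100-residue proteins: the counts scatter around 1.
print([count_mutated(100, rng=rng) for _ in range(10)])
# One hypothetical 100,000-residue pool: the fraction is close to 0.01.
print(count_mutated(100_000, rng=rng) / 100_000)
```

Short sequences can easily show 0 or 2 changes instead of 1, while the fraction settles near 1% only when many residues are pooled.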
PAM, Evolution, and Gaps

PAM ignores
– Insertions
– Deletions
– Silent nucleotide substitutions (which are?)
PAM counts a change from A to B and back to A as 2 accepted point mutations
2 sequences 200 PAMs apart will have about 25% of amino acids the same!
PAM Matrices

They describe substitutability of amino acids, based on empirical evidence
– Empirical = experiential
The matrices are derived from repositories of actual homologous sequences
A PAM 1 matrix is geared to best compare 2 sequences that are 1 PAM apart
A PAM 250 matrix is good for comparing quite diverged sequences
– PAM 250 matrix is standard
Creating a PAM Matrix

Let $f_i$ be the frequency of amino acid $i$

We express $f_i$ as a fraction of the total:

$$f_i = \frac{\text{instances of } i}{\text{instances of any amino acid}}$$

Frequencies range from…
0.091 (L) down to 0.014 (W)
The most common amino acid occurs about ____ times more commonly than the least
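The frequency definition translates directly into a count divided by a total. This sketch uses two tiny hypothetical sequences just to show the arithmetic. The function name is made up for illustration, and real frequencies would come from a large protein database.

```python
from collections import Counter

def amino_acid_frequencies(sequences):
    """Return f_i = (instances of i) / (instances of any amino acid)."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

# Hypothetical toy data; real PAM work uses large sets of homologous proteins.
freqs = amino_acid_frequencies(["CADQH", "CADPM"])
print(freqs["C"])           # 0.2
print(sum(freqs.values()))  # close to 1.0
```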
Creating PAM matrix, cont.


Determine mutabilities of the amino acids
Some amino acids tend to change easily
– Others not
If alanine’s mutability is set to 100
– Serine’s mutability is 117 (highest, 1991 data)
– Tryptophan’s mutability is 25 (lowest, 1991)
Let’s look more closely at $m_i$ . . .
Creating PAM matrix, cont.


Mutability is a number
Given an evolutionary interval of 1 PAM, let

$$m_i = \frac{\#\ \text{mutations of amino acid } i}{\#\ \text{instances of amino acid } i}$$

Alternatively,

$$m_i = p(\text{an instance of } i \text{ mutates})$$

Are the formulas on the previous slide
identical?
Creating PAM matrix, cont.




Next, we break $m_i$ into constituent $m_{i,j}$’s
That is, $i$ mutates, but into $j$ at what rate?
Use actual data from observed mutations
Populate a matrix of probabilities
The Diagonal


Values on the matrix diagonal do not really describe $i$ mutating into itself!
– (In reality, can that happen?)
They basically show
– $p(i \text{ does not mutate})$
Thus, the columns add up to 1
About the matrix on the last slide:
– Is it symmetric?
– Are there about 1% changed?
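The steps so far (mutability, splitting it by target, and the diagonal) can be sketched with a toy example. Everything here is hypothetical: a 3-letter alphabet, invented counts, and made-up names. The sketch also skips the rescaling step that a real PAM1 construction would use to land on exactly 1% change overall.

```python
# Hypothetical toy data for a 3-letter alphabet; real PAM matrices use 20
# amino acids and counts taken from aligned homologous sequences.
LETTERS = ["A", "S", "W"]
INSTANCES = {"A": 1000, "S": 800, "W": 200}
# OBSERVED[i][j] = number of times i was seen replaced by j (i != j).
OBSERVED = {
    "A": {"S": 8, "W": 2},
    "S": {"A": 9, "W": 1},
    "W": {"A": 1, "S": 0},
}

def mutation_matrix(letters, instances, observed):
    """Return M where M[j][i] = p(i is replaced by j); column i sums to 1."""
    matrix = {j: {} for j in letters}
    for i in letters:
        m_i = sum(observed[i].values()) / instances[i]  # mutability of i
        for j in letters:
            if i == j:
                matrix[j][i] = 1.0 - m_i  # p(i does not mutate)
            else:
                matrix[j][i] = observed[i][j] / instances[i]  # m_{i,j}
    return matrix

M = mutation_matrix(LETTERS, INSTANCES, OBSERVED)
for i in LETTERS:
    print(i, sum(M[j][i] for j in LETTERS))  # each column sums to about 1
```

In this toy data 21 of 2000 residues change, so the matrix describes roughly 1% change, and it is not symmetric because the letters differ in frequency and mutability.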
PAM0

What do you think a PAM 0 matrix
might look like?
PAMn

Use matrix multiplication

PAM2 = PAM1 x PAM1
PAM3 = PAM2 x PAM1

PAM250? Do it 250 times!

PAM∞

What do you imagine a PAM∞ matrix
might look sort of like?
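Repeated matrix multiplication is easy to try on a toy matrix. The 2-letter "PAM1" below is hypothetical, with columns that sum to 1 and changes on the order of 1%. The helper names are made up, and plain Python lists are used so that nothing beyond the standard library is needed.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, power):
    """Return m multiplied by itself `power` times (power 0 gives identity)."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result

# Hypothetical 2-letter "PAM1": each column sums to 1.
PAM1 = [[0.99, 0.02],
        [0.01, 0.98]]

print(mat_pow(PAM1, 0))        # identity: nothing has changed yet
print(mat_pow(PAM1, 2)[1][0])  # a little under 0.02, not exactly 2 x 0.01
print(mat_pow(PAM1, 2000))     # both columns approach the same values
```

The second line shows why "1%, twice" differs from "2%": some residues change and then change back. At very high powers the starting letter stops mattering, and every column drifts toward the same background frequencies.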
Logarithmicize

Actually, we take logarithms to get the usual matrix from the probability matrices…

First, build another, reference matrix of “expected” probabilities
– Assume all amino acids are equally mutable
– Also assume they mutate into each other in proportion to their frequencies
– (I.e., overall amino acid frequencies are maintained, but otherwise they don’t care what they mutate into)
Logarithmicize


Now we have two matrices
Make a 3rd. Each entry is:

$$\frac{\text{Observed probability}}{\text{Expected probability}}$$

…we’re comparing reality to “if mutations were truly random”
Take the log (base 10) of each entry to make a 4th
– An entry of 1 means 10x more mutations of that type than expected
– An entry of -1 means what?
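The log step is a one-line calculation per entry. This sketch assumes base-10 logarithms, which is what "an entry of 1 means 10x" implies. The probabilities are hypothetical values chosen to make the arithmetic easy to follow, and the function names are made up.

```python
import math

def log_odds(observed, expected):
    """Return log10(observed / expected) for one matrix entry."""
    return math.log10(observed / expected)

def log_odds_matrix(observed, expected):
    """Apply log_odds entry by entry to two equally sized matrices."""
    return [[log_odds(o, e) for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(observed, expected)]

# Hypothetical probabilities chosen only to make the arithmetic easy to read.
print(round(log_odds(0.10, 0.01), 6))   # 1.0: ten times more than expected
print(round(log_odds(0.001, 0.01), 6))  # -1.0
print(round(log_odds(0.01, 0.01), 6))   # 0.0: exactly as expected
```

Taking logs also turns the product of per-position odds along an alignment into a simple sum of scores.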
Carrying On

We now use the matrix to measure
relative evolutionary distance