PolyPhen and SIFT: Tools for predicting functional effects of SNPs
Epi 244
Spring 2009
Sam S. Oh
Human genome variation
• 3.2 billion base pairs (bp)
• 99.9% similarity across individuals
– 3.2 million bp dissimilar
• ~11 million SNPs
– Coding vs. non-coding (intron and intergenic regions)
– Most are synonymous
Frazer et al. Nat Rev Genet 2009;10:241-251
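The arithmetic behind the figures above can be checked in a few lines. This is only a sketch; the constant and function names are illustrative and not part of the slides.

```python
# Figures taken from the slide; names are illustrative.
GENOME_SIZE_BP = 3_200_000_000   # ~3.2 billion base pairs
SIMILARITY = 0.999               # 99.9% similarity across individuals


def dissimilar_bp(genome_size: int, similarity: float) -> int:
    """Expected number of base pairs that differ between two individuals."""
    return round(genome_size * (1.0 - similarity))


print(f"{dissimilar_bp(GENOME_SIZE_BP, SIMILARITY):,} bp differ")
```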
DNA → RNA → Protein
Example: sickle-cell anemia
• An A-to-T SNP in the beta-globin gene results in a
glutamate (hydrophilic) to valine
(hydrophobic) substitution
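The sickle-cell example can be traced from codon to amino acid in code. The sketch below assumes the commonly described codon change GAG to GTG and uses Kyte-Doolittle hydropathy values to express "hydrophilic" versus "hydrophobic"; neither the codons nor the hydropathy scale appear on the slides, and the codon table holds only the two entries needed here.

```python
# Minimal codon table: only the two codons needed for this example.
CODON_TO_AA = {"GAG": "E", "GTG": "V"}  # E = glutamate, V = valine

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
HYDROPATHY = {"E": -3.5, "V": 4.2}


def substitute(codon: str, pos: int, new_base: str) -> str:
    """Return the codon with the base at 0-based position `pos` replaced."""
    return codon[:pos] + new_base + codon[pos + 1:]


ref_codon = "GAG"                          # assumed normal beta-globin codon
alt_codon = substitute(ref_codon, 1, "T")  # A -> T at the middle base

ref_aa, alt_aa = CODON_TO_AA[ref_codon], CODON_TO_AA[alt_codon]
print(f"{ref_codon} ({ref_aa}) -> {alt_codon} ({alt_aa})")
print(f"hydropathy change: {HYDROPATHY[alt_aa] - HYDROPATHY[ref_aa]:+.1f}")
```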
Example: MTHFR
• Folate metabolism
Finding MTHFR SNPs
Highlight all
refSNP numbers
(use scroll bar)
and copy
Note Build number
(currently Build 130)
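If the copied text contains more than bare refSNP numbers, the IDs can be pulled out before pasting them elsewhere. This is a minimal sketch, assuming IDs have the form "rs" followed by digits; the function name and the sample text are hypothetical, and only rs2274974 comes from the slides.

```python
import re


def extract_rs_ids(text: str) -> list[str]:
    """Return unique refSNP IDs (rs + digits) in order of first appearance."""
    seen = {}
    for match in re.finditer(r"\brs\d+\b", text):
        seen.setdefault(match.group(0), None)
    return list(seen)


# Hypothetical pasted text; rs0000001 is a made-up placeholder ID.
pasted = "rs2274974 [Homo sapiens]\nrs2274974\nrs0000001 (made-up ID)"
print("\n".join(extract_rs_ids(pasted)))
```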
SIFT
• Sorting Intolerant From Tolerant
• Predicts whether an amino acid (AA) substitution
(i.e., a non-synonymous SNP) is tolerated,
based on
– Sequence homology
– Physical properties of amino acids
• Can be applied to naturally occurring
nonsynonymous polymorphisms and
laboratory-induced missense mutations
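The sequence-homology idea can be illustrated with a toy calculation on one alignment column: residues that homologs carry at a position score high, and residues never seen there score zero. This is a deliberately simplified sketch and not SIFT's actual algorithm, which does considerably more (for example, it selects the homologous sequences and estimates probabilities with pseudocounts). The function name and the example column are hypothetical.

```python
from collections import Counter


def column_scores(column: str) -> dict[str, float]:
    """Frequency of each residue in one alignment column, scaled so the
    most common residue scores 1.0. Gaps ('-') are ignored."""
    counts = Counter(res for res in column if res != "-")
    top = max(counts.values())
    return {res: n / top for res, n in counts.items()}


# Hypothetical column: the residue seen at one position in 10 homologs.
column = "GGGGGGGGAG"
scores = column_scores(column)
print(scores.get("G", 0.0))  # conserved residue
print(scores.get("E", 0.0))  # residue never observed at this position
```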
Compare Build numbers
Copy all SNP IDs and
paste into SIFT. Choose
“Submit Query”
Getting more info for rs2274974
Enter “rs2274974”
Flanking sequence,
IUPAC code,
Allele info
flanking seq
Build number
mRNA name
Protein name
Contig name
Position of SNP in
mRNA, protein, contig
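The IUPAC code reported for a SNP is a single letter that stands for its set of alleles. The lookup below is a small sketch of that mapping using the standard IUPAC nucleotide ambiguity codes; the function name and the "A/G" input format are assumptions for illustration, and no claim is made here about which alleles rs2274974 carries.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_code(alleles: str) -> str:
    """Map an allele string such as 'A/G' to its IUPAC ambiguity code."""
    bases = frozenset(alleles.upper().replace("/", ""))
    return IUPAC[bases]


print(iupac_code("A/G"))
```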
Scroll down
Select protein
Note AA1, AA2,
and position
Copy FASTA-formatted
protein sequence
Paste FASTA-formatted
protein sequence
Enter AA substitution
[Letter1-position-Letter2]
Substitution occurs
at AA 566
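Before submitting, it can help to confirm that the substitution string is well formed and that the pasted sequence really has Letter1 at the stated position. The sketch below assumes the compact form without separators (for example "G566E") and a single-record FASTA string; all function names are hypothetical, and the toy sequence is not the MTHFR protein.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str       # original amino acid (Letter1)
    position: int  # 1-based position in the protein
    alt: str       # substituted amino acid (Letter2)


def parse_substitution(text: str) -> Substitution:
    """Parse a string such as 'G566E' into its three parts."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", text.strip().upper())
    if m is None:
        raise ValueError(f"not a Letter1-position-Letter2 string: {text!r}")
    return Substitution(m.group(1), int(m.group(2)), m.group(3))


def read_fasta_sequence(fasta: str) -> str:
    """Return the sequence of a single-record FASTA string, header dropped."""
    lines = [ln.strip() for ln in fasta.strip().splitlines()]
    return "".join(ln for ln in lines if ln and not ln.startswith(">"))


def matches_reference(sub: Substitution, sequence: str) -> bool:
    """True if the sequence has `sub.ref` at 1-based `sub.position`."""
    in_range = 0 < sub.position <= len(sequence)
    return in_range and sequence[sub.position - 1] == sub.ref


sub = parse_substitution("G566E")
print(sub)

# Toy FASTA record (hypothetical, NOT the MTHFR sequence) for the check.
toy = ">toy_protein\nMKTG\nAEL\n"
print(matches_reference(parse_substitution("G4E"), read_fasta_sequence(toy)))
```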
Scroll down
Check tolerance of
AA substitutions
Scroll down
“Substitution at pos 566 from G to E is predicted to
AFFECT PROTEIN FUNCTION with a score of 0.01.”
Tolerance of specified
substitution
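The score in the message above can be turned into a call with a simple cutoff. This sketch assumes the commonly cited SIFT default of 0.05, with scores below it predicted to affect protein function; the cutoff is not stated on the slides, and the function name and the "TOLERATED" label are illustrative.

```python
SIFT_CUTOFF = 0.05  # assumed default cutoff; not stated on the slides


def sift_call(score: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Label a SIFT score (0 to 1); low scores mean the change is not tolerated."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores range from 0 to 1")
    return "AFFECT PROTEIN FUNCTION" if score < cutoff else "TOLERATED"


print(sift_call(0.01))  # the G566E score reported above
```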
Polymorphism Phenotyping (PolyPhen)
• Tool for prediction of possible impact of amino acid
substitution (i.e., non-synonymous SNPs) on protein
structure and function based on:
– Amino acid sequence
• In what part of the protein did the SNP occur? (E.g., active site,
binding site, transmembrane region)
– Multiple alignments with homologous proteins and mammalian
orthologues
• How compatible is the substitution based on proteins of comparable
sequence?
– 3D structural properties with the substituted amino acid
• What is the substitution’s effect on the protein’s physicochemical properties?
(E.g., hydrophobicity, electrostatic interactions, ligand binding)
PolyPhen data flow
Four potential predictions
• Probably damaging
– Predicted with high confidence to affect protein
function or structure
• Possibly damaging
– Predicted to affect protein function or structure
• Benign
– Most likely lacking any phenotypic effect
• Unknown
– Lack of data does not allow PolyPhen to make a
prediction
Copy FASTA-formatted
protein sequence
Enter AA position, ancestral
AA, and substituted AA
In dbSNP Build 129, corresponds to
protein NP_005948.3
Enter SNP rs#
Query vs. SNP Collection
                  Prediction          PSIC     dbSNP Build #
Query             Probably damaging   2.093    N/A
SNP Collection    Probably damaging   2.172    126
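The four PolyPhen predictions and the two results above can be held as small records, which makes the Query versus SNP Collection comparison explicit. This sketch reads the values above as two rows (Query: Probably damaging, 2.093, no build; SNP Collection: Probably damaging, 2.172, Build 126); the class and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Prediction(Enum):
    PROBABLY_DAMAGING = "Probably damaging"
    POSSIBLY_DAMAGING = "Possibly damaging"
    BENIGN = "Benign"
    UNKNOWN = "Unknown"


@dataclass(frozen=True)
class PolyPhenResult:
    source: str                 # "Query" or "SNP Collection"
    prediction: Prediction
    psic: float                 # PSIC value as shown on the slide
    dbsnp_build: Optional[int]  # None where the slide shows N/A


results = [
    PolyPhenResult("Query", Prediction.PROBABLY_DAMAGING, 2.093, None),
    PolyPhenResult("SNP Collection", Prediction.PROBABLY_DAMAGING, 2.172, 126),
]

# Do both routes give the same call, and how far apart are the PSIC values?
agree = len({r.prediction for r in results}) == 1
psic_gap = abs(results[0].psic - results[1].psic)
print(f"predictions agree: {agree}; PSIC difference: {psic_gap:.3f}")
```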
References
• NCBI dbSNP
– http://www.ncbi.nlm.nih.gov/sites/entrez
• SIFT
– http://sift.jcvi.org/
• PolyPhen
– http://genetics.bwh.harvard.edu/pph/index.html