Structural Location of Disease-associated Single-nucleotide Polymorphisms
By Stitziel, Tseng, Pervouchine, Goddeau, Kasif, Liang
JMB, 2003, 327, 1021-1030
Presented by Nancy Baker
What is a SNP?
Single nucleotide polymorphism – a single base change
Most common form of human genetic variation
500,000 SNPs in human coding regions
nsSNPs (nonsynonymous SNPs) cause amino acid changes
Can cause diseases in many different ways
Goal: is the location of a SNP important?
Do disease-causing SNPs occur in one site of a protein more than others?
Possible geometric sites:
  Pocket or void
  Convex or shallow region
  Interior (zero solvent accessibility)
Another goal: Get evolutionary perspective
Are SNP sites conserved?
Use HMM techniques.
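To make "conserved" concrete, here is a minimal sketch that scores one position of a multiple sequence alignment by Shannon entropy (low entropy means high conservation). This is a simpler stand-in, not the HMM-based method the paper uses; the function names and the toy alignment are made up for illustration.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of one alignment column; gap characters are ignored."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Sum of p * log2(1/p) over the residue types seen in the column.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def site_entropy(alignment: list, position: int) -> float:
    """Entropy of the column at a 0-based position in a list of aligned sequences."""
    column = "".join(seq[position] for seq in alignment)
    return column_entropy(column)


# Toy alignment (hypothetical sequences, equal length).
alignment = ["MKTAY", "MKSAY", "MKTGY", "MKQAY"]

for pos in range(len(alignment[0])):
    print(pos, round(site_entropy(alignment, pos), 3))
```

A fully conserved column scores 0 bits; a column with more residue types scores higher. An HMM profile captures the same idea with position-specific emission probabilities.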
Step 1: Find SNPs associated with disease
OMIM (Online Mendelian Inheritance in Man)
  http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
Picked OMIM SNPs with link to SwissProt
Extracted SwissProt sequences
Ended up with 2128 variants of 310 genes
Step 2: Control Dataset: SNPs not necessarily associated with disease
dbSNP database is source
The authors admit this is not a perfect control
Extracted sequences from GenBank
Used sequences to find structure entry in PDB
Ended up with 973 variants in 504 genes
Step 3: Where is the SNP in the protein?
Map to PDB structures
  For OMIM SNPs – 924 variants in 82 alleles mapped to 129 PDB structures
  For dbSNP – 558 variants in 339 alleles mapped to 263 PDB structures
Classify locations:
  P: surface pocket or interior void
  S: convex or shallow depressed regions
  I: interior
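The three-way classification can be sketched as a small decision function. This is an illustration under stated assumptions, not the paper's procedure: it assumes pocket/void membership and solvent accessibility have already been computed for each residue by some structure-analysis tool, and it assumes pocket/void membership takes precedence over the zero-accessibility test (since category P includes interior voids). The function and parameter names are hypothetical.

```python
def classify_site(in_pocket_or_void: bool, solvent_accessibility: float) -> str:
    """Return 'P', 'S', or 'I' for one residue.

    in_pocket_or_void: precomputed flag (assumed input) saying whether the
        residue lines a surface pocket or an interior void.
    solvent_accessibility: precomputed solvent-accessible area (assumed input);
        zero means the residue is fully buried.
    """
    if in_pocket_or_void:
        return "P"  # surface pocket or interior void
    if solvent_accessibility == 0.0:
        return "I"  # interior: no solvent accessibility
    return "S"      # remaining surface: convex or shallow depressed region


# Hypothetical residues: (pocket/void flag, solvent accessibility)
examples = [(True, 12.5), (True, 0.0), (False, 0.0), (False, 40.2)]
print([classify_site(flag, sa) for flag, sa in examples])
```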
Results
Location                      OMIM    dbSNP
P: surface pocket / void      88%     68%
S: convex / shallow           9%      27%
I: interior                   3%      5%
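One way to put a number on the pocket/void difference is an odds ratio computed from the reported percentages (88% for OMIM, 68% for dbSNP). This is a back-of-the-envelope sketch, not an analysis from the paper; the function names are made up, and no significance test is attempted because only the percentages are used.

```python
def odds(p: float) -> float:
    """Convert a proportion into odds, p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_case: float, p_control: float) -> float:
    """Odds ratio of the case proportion relative to the control proportion."""
    return odds(p_case) / odds(p_control)


# Fractions of nsSNPs located in pockets or voids, taken from the results table.
pocket_omim = 0.88   # disease-associated set (OMIM)
pocket_dbsnp = 0.68  # control set (dbSNP)

ratio = odds_ratio(pocket_omim, pocket_dbsnp)
print(f"Odds ratio for pocket/void location, OMIM vs dbSNP: {ratio:.2f}")
```

With these inputs the odds ratio comes out at about 3.45, meaning the odds of lying in a pocket or void are roughly three and a half times higher for the OMIM variants than for the dbSNP controls.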
Results
Many disease-associated nsSNPs are located in pockets or voids (often binding pockets), more often than non-disease-associated nsSNPs
nsSNPs in shallow depressed or convex regions also cause disease, probably because these can also be binding pockets
nsSNPs unlikely to be buried in protein – why?
  Buried sites not accessible for molecular recognition and binding
  Core mutations either do not affect stability or affect it so much that the mutation is fatal and does not persist in the population
Results
For interior nsSNPs – no tendency for disease-associated mutations to be conserved
For SNPs in interior – disease-associated SNPs more likely to be conserved
Value of paper
Makes use of available data – no lab work involved
Provides data, but …
  http://gila.bioengr.uic.edu/snp
  Little vague on some methods
  Control set

http://www3.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=147670