Genome variation informatics: SNP discovery, demographic

Computational Tools for Finding and
Interpreting Genetic Variations
Gabor T. Marth
Department of Biology, Boston College
http://clavius.bc.edu/~marthlab/MarthLab
Sequence variations (polymorphisms)
A reference sequence of the human
genome is available…
… but every individual is
unique, and is different
from others at millions of
nucleotide locations
genetic polymorphisms
Our research interests
1. How to find genetic polymorphisms?
2. How to use variation data to track our
pre-historic past?
3. How to utilize polymorphism data for
medical research?
Tools for polymorphism discovery
SNP discovery in clonal sequences
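$$
P(\mathrm{SNP}) =
\frac{\displaystyle\sum_{\text{all variable } (S_1,\ldots,S_N)}
      \frac{P(S_1 \mid R_1)}{P_{\mathrm{Prior}}(S_1)} \cdots
      \frac{P(S_N \mid R_N)}{P_{\mathrm{Prior}}(S_N)} \,
      P_{\mathrm{Prior}}(S_1,\ldots,S_N)}
     {\displaystyle\sum_{S_{i_1} \in [A,C,G,T]} \cdots \sum_{S_{i_N} \in [A,C,G,T]}
      \frac{P(S_{i_1} \mid R_1)}{P_{\mathrm{Prior}}(S_{i_1})} \cdots
      \frac{P(S_{i_N} \mid R_N)}{P_{\mathrm{Prior}}(S_{i_N})} \,
      P_{\mathrm{Prior}}(S_{i_1},\ldots,S_{i_N})}
$$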
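The Bayesian SNP probability for one alignment column can be sketched in a few lines of Python. This is a minimal illustration, not the lab's software: the helper names (`base_call_posterior`, `toy_joint_prior`, `snp_probability`), the uniform per-base prior, and the toy joint prior with a polymorphism rate of 0.001 are all assumptions made here for the example. The numerator sums over every assignment of bases to reads in which the bases are not all identical ("all variable"); the denominator sums over all assignments.

```python
from itertools import product

BASES = "ACGT"


def base_call_posterior(called, error):
    """P(S_i = base | R_i) for one read: the called base gets 1 - error,
    the remaining error mass is split evenly (a simplifying assumption)."""
    return {b: (1.0 - error if b == called else error / 3.0) for b in BASES}


def toy_joint_prior(n_reads, p_poly=0.001):
    """Hypothetical joint prior P_Prior(S_1, ..., S_N): monomorphic columns
    share 1 - p_poly, variable columns share p_poly, both evenly."""
    n_variable = 4 ** n_reads - 4

    def prior(combo):
        if len(set(combo)) == 1:
            return (1.0 - p_poly) / 4
        return p_poly / n_variable

    return prior


def snp_probability(read_posteriors, base_prior, joint_prior):
    """Sum the weighted joint prior over variable assignments and divide
    by the same sum taken over all assignments."""
    total = 0.0
    variable = 0.0
    for combo in product(BASES, repeat=len(read_posteriors)):
        weight = joint_prior(combo)
        for posterior, base in zip(read_posteriors, combo):
            weight *= posterior[base] / base_prior[base]
        total += weight
        if len(set(combo)) > 1:
            variable += weight
    return variable / total


uniform = {b: 0.25 for b in BASES}
column = [base_call_posterior(b, 0.001) for b in "CCTT"]
print(snp_probability(column, uniform, toy_joint_prior(len(column))))
```

The enumeration grows as 4^N, so this brute-force form only suits a handful of reads; it is meant to mirror the formula term by term rather than to be efficient.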
Redevelopment and expansion
Automated detection of heterozygous positions in diploid
individual samples (visit Aaron Quinlan’s poster)
[Figure labels: Homozygous C, Heterozygous C/T, Homozygous T]
Redevelopment and expansion
Discovery of short deletions/insertions (both bi-allelic
and micro-satellite repeats)
Redevelopment and expansion
• Improve the detection of very rare alleles by taking into account
recent results in population genetics (i.e. a priori, rare alleles are
more frequent than common alleles)
• Develop a rigorous statistical framework for both heterozygous
polymorphisms and INDELs
• Calculate the probability that a SNP found in one set of
samples will also be present in another
• Complete software rewrite
• Graphical User Interface (GUI)
• Ease of use for small laboratories without UNIX expertise
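The first bullet's point that rare alleles are a priori more frequent than common ones can be illustrated with the standard neutral expectation, under which the number of sites whose derived allele appears i times in a sample of n chromosomes is proportional to 1/i. The slides do not say which prior the software uses, so the function below (`neutral_allele_count_prior`, a hypothetical name) is only a sketch under that constant-population-size assumption.

```python
def neutral_allele_count_prior(n):
    """Prior probability that a polymorphic site has derived-allele count
    i = 1, ..., n - 1 in a sample of n chromosomes, proportional to 1 / i
    (standard neutral model, constant population size; an assumption)."""
    weights = [1.0 / i for i in range(1, n)]
    norm = sum(weights)
    return [w / norm for w in weights]


prior = neutral_allele_count_prior(10)
for count, p in enumerate(prior, start=1):
    print(count, round(p, 3))
```

A demographic history other than constant size (for example a bottleneck or an expansion) would change these weights, which is one reason a detection prior might be tied to a population genetic model.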
Genetic and epigenetic changes in cancer
We want to develop tools for detecting
inherited polymorphisms and somatic
mutations in a variety of new data types,
representing both genetic and epigenetic
changes
• nucleotide changes, short insertions/deletions
• copy number changes, chromosomal rearrangements
• changes in DNA methylation, histone modification
Human pre-history
Demographic history
European data: bottleneck
African data: modest but uninterrupted expansion
Tools for Medical Genetics
The polymorphism
structure of
individuals follows
strong patterns
http://pga.gs.washington.edu/
The international HapMap project
However, the variation
structure observed in the
reference DNA samples…
… often does not match the
structure in another set of
samples such as those used in a
clinical case-control association
study aimed at finding disease
genes and disease-causing
genetic variants
Tools to test sample-to-sample variability
Instead of genotyping additional
sets of (clinical) samples with
costly experimentation, and
comparing the variation structure
of these consecutive sets directly…
… we generate additional samples
with computational means, based
on our Population Genetic models
of demographic history. We then
use these samples to test the
efficacy of gene-mapping
approaches for clinical research.
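As a rough illustration of generating samples by computational means, the sketch below simulates haplotypes under the simplest population genetic model: a constant-size coalescent with infinite-sites mutation. It does not reproduce the lab's demographic models (bottleneck or expansion), and the function name `simulate_haplotypes` and the parameters `n` and `theta` are hypothetical choices made for this example.

```python
import random


def simulate_haplotypes(n, theta, rng):
    """Return n haplotypes (lists of 0/1, 1 = derived allele) from a
    constant-size coalescent with mutation rate theta > 0 (assumed model)."""
    # Each active lineage is the set of sample indices that descend from it.
    lineages = [frozenset([i]) for i in range(n)]
    sites = []  # one entry per segregating site: the samples carrying it
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time to the next coalescence, in coalescent time units.
        t = rng.expovariate(k * (k - 1) / 2.0)
        # Mutations arrive on each lineage as a Poisson process, rate theta / 2.
        for lineage in lineages:
            elapsed = rng.expovariate(theta / 2.0)
            while elapsed < t:
                sites.append(lineage)
                elapsed += rng.expovariate(theta / 2.0)
        # Merge two lineages chosen uniformly at random.
        a, b = rng.sample(range(k), 2)
        merged = lineages[a] | lineages[b]
        lineages = [l for idx, l in enumerate(lineages) if idx not in (a, b)]
        lineages.append(merged)
    return [[1 if i in site else 0 for site in sites] for i in range(n)]


haplotypes = simulate_haplotypes(n=8, theta=5.0, rng=random.Random(42))
for hap in haplotypes:
    print("".join(str(allele) for allele in hap))
```

Repeated calls give independent replicate samples, which is what makes it possible to measure how much a statistic varies from one sample set to the next without further genotyping.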
Tools to test sample-to-sample variability
[Figure: r2 (4-site composite #2) plotted against r2 (data), both axes from 0 to 1; labels: experimental sample, computational sample]
(visit Dr. Eric Tsung’s poster)
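For readers unfamiliar with the r2 statistic on the plot axes, the standard pairwise measure of linkage disequilibrium between two biallelic sites can be computed as below. The "4-site composite" variant named on the axis is not defined in this transcript, so this sketch covers only the ordinary two-site r2; the function name `r_squared` and the 0/1 haplotype encoding are assumptions made for the example.

```python
def r_squared(site_a, site_b):
    """Pairwise r^2 between two biallelic sites, each given as a list of
    0/1 alleles observed on the same haplotypes in the same order."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    # Frequency of haplotypes carrying allele 1 at both sites.
    p_ab = sum(1 for x, y in zip(site_a, site_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d * d / denom


print(r_squared([0, 0, 1, 1, 1, 0], [0, 0, 1, 1, 0, 0]))
```

Computing the same statistic for matching marker pairs in two sample sets and plotting one against the other gives a comparison of the kind described on this slide.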
Tools to connect genotype and clinical outcome
• genetic marker (haplotype) in genome regions of drug-metabolizing enzyme (DME) genes
• computational prediction based on haplotype structure
• functional allele (known metabolic polymorphism)
• clinical endpoint (adverse drug reaction)
• molecular phenotype (drug concentration measured in blood plasma)
The Computational Genetics Lab
http://clavius.bc.edu/~marthlab/MarthLab