Computational Tools for Finding and Interpreting Genetic Variations

Gabor T. Marth
Department of Biology, Boston College
[email protected]
http://clavius.bc.edu/~marthlab/MarthLab

Sequence variations (polymorphisms)

A reference sequence of the human genome is available, but every individual is unique and differs from others at millions of nucleotide locations (genetic polymorphisms).

Our research interests

1. How to find genetic polymorphisms?
2. How to use variation data to track our pre-historic past?
3. How to utilize polymorphism data for medical research?

Tools for polymorphism discovery

SNP discovery in clonal sequences

\[
P(\mathrm{SNP}) =
\frac{\displaystyle\sum_{\text{all variable } S}
      \frac{P(S_1 \mid R_1)}{P_{\mathrm{Prior}}(S_1)} \cdots
      \frac{P(S_N \mid R_N)}{P_{\mathrm{Prior}}(S_N)} \,
      P_{\mathrm{Prior}}(S_1, \ldots, S_N)}
     {\displaystyle\sum_{S_{i_1} \in [A,C,G,T]} \cdots \sum_{S_{i_N} \in [A,C,G,T]}
      \frac{P(S_{i_1} \mid R_1)}{P_{\mathrm{Prior}}(S_{i_1})} \cdots
      \frac{P(S_{i_N} \mid R_N)}{P_{\mathrm{Prior}}(S_{i_N})} \,
      P_{\mathrm{Prior}}(S_{i_1}, \ldots, S_{i_N})}
\]

Redevelopment and expansion

• Automated detection of heterozygous positions in diploid individual samples (visit Aaron Quinlan’s poster)
  [Figure panels: Homozygous C, Heterozygous C/T, Homozygous T]
• Discovery of short deletions/insertions (both bi-allelic and micro-satellite repeats)
• Improve the detection of very rare alleles by taking into account recent results in Population Genetics (i.e. a priori, rare alleles are more frequent than common alleles)
• Develop a rigorous statistical framework for both heterozygote polymorphisms and INDELs
• Calculate the probability that a SNP found in one set of samples will also be present in another
• Complete software rewrite
• Graphical User Interface (GUI)
• Ease of use for small laboratories without UNIX expertise

Genetic and epigenetic changes in cancer

We want to develop tools for detecting inherited polymorphisms and somatic mutations in a variety of new data types, representing both genetic and epigenetic changes:

• nucleotide changes, short insertions/deletions
• copy number changes, chromosomal rearrangements
• changes in DNA methylation, histone modification

Human pre-history

Demographic history

• European data: bottleneck
• African data: modest but uninterrupted expansion

Tools for Medical Genetics

The polymorphism structure of individuals follows strong patterns.
http://pga.gs.washington.edu/
The international HapMap project

However, the variation structure observed in the reference DNA samples often does not match the structure in another set of samples, such as those used in a clinical case-control association study that aims to find disease genes and disease-causing genetic variants.

Tools to test sample-to-sample variability

Instead of genotyping additional sets of (clinical) samples with costly experimentation and comparing the variation structure of these consecutive sets directly, we generate additional samples by computational means, based on our Population Genetic models of demographic history. We then use these samples to test the efficacy of gene-mapping approaches for clinical research.

[Figure: r2 (4-site composite #2) plotted against r2 (data), both axes from 0 to 1; labels: experimental sample, computational sample] (visit Dr. Eric Tsung’s poster)

Tools to connect genotype and clinical outcome

• genetic marker (haplotype) in genome regions of drug metabolizing enzyme (DME) genes
• computational prediction based on haplotype structure
• functional allele (known metabolic polymorphism)
• molecular phenotype (drug concentration measured in blood plasma)
• clinical endpoint (adverse drug reaction)

The Computational Genetics Lab
http://clavius.bc.edu/~marthlab/MarthLab
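
As an illustration of the Bayesian calculation on the "SNP discovery in clonal sequences" slide, the sketch below evaluates P(SNP) for one alignment column by summing over every assignment of true bases to the N reads. It is a minimal sketch under stated assumptions, not the lab's software: the function names (posterior_from_call, config_prior, snp_probability), the uniform per-base prior of 0.25, the polymorphism rate p_poly, and the way the configuration prior splits its mass evenly among monomorphic and among polymorphic columns are all hypothetical choices made for the example. Because it enumerates all 4^N configurations, it is only practical for a handful of reads.

```python
from itertools import product

BASES = "ACGT"


def posterior_from_call(base, error_prob):
    """P(S_i = b | R_i) for one read, given its called base and error probability.

    Assumption: the error mass is spread evenly over the three other bases.
    """
    return {b: (1.0 - error_prob) if b == base else error_prob / 3.0 for b in BASES}


def config_prior(config, p_poly):
    """Hypothetical P_Prior(S_1, ..., S_N) for one column of true bases.

    Assumption: monomorphic columns share mass 1 - p_poly equally, and
    polymorphic columns share mass p_poly equally.
    """
    n_mono = len(BASES)
    n_poly = len(BASES) ** len(config) - n_mono
    if len(set(config)) == 1:
        return (1.0 - p_poly) / n_mono
    return p_poly / n_poly


def snp_probability(read_posteriors, base_prior=0.25, p_poly=0.001):
    """Ratio of the sum over variable configurations to the sum over all configurations."""
    n = len(read_posteriors)
    variable_sum = 0.0
    total_sum = 0.0
    for config in product(BASES, repeat=n):
        # Prior of the whole column times each read's posterior-to-prior ratio.
        weight = config_prior(config, p_poly)
        for base, posterior in zip(config, read_posteriors):
            weight *= posterior[base] / base_prior
        total_sum += weight
        if len(set(config)) > 1:  # at least two different true bases
            variable_sum += weight
    return variable_sum / total_sum


if __name__ == "__main__":
    agree = [posterior_from_call("C", 0.001) for _ in range(4)]
    split = [posterior_from_call(b, 0.001) for b in "CCTT"]
    print("four reads, all C:", snp_probability(agree))
    print("two C, two T:     ", snp_probability(split))
```

With high-quality reads that all agree, the result stays close to zero, while an even split between two bases pushes it toward one. The value is sensitive to p_poly and to the shape of the configuration prior, which is where allele-frequency expectations from Population Genetics would enter.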