Research for medical discovery at the Computational Genomics Laboratory at Boston College Biology

Gabor T. Marth
Department of Biology, Boston College
[email protected]
http://clavius.bc.edu/~marthlab/MarthLab

We study genetic variations because…
… they underlie phenotypic differences
… they cause heritable diseases and determine responses to drugs
… they allow tracking ancestral human history

We are interested in various aspects of genetic variations…
• how to discover inherited genetic polymorphisms and somatic mutations that lead to disease?
• how to model human polymorphism structure to inform medical research?
• how to select the best genetic markers for clinical case-control association studies?
• how to use genetic markers to predict individual responses to drugs, including adverse drug reactions?

1. We build computer tools for variation discovery

Inherited (germ line) polymorphisms predispose to disease. The most common types of human polymorphisms are single-nucleotide polymorphisms (SNPs) and short insertion-deletions (INDELs).

We have developed a computer package, PolyBayes©, for accurate discovery of DNA polymorphisms in clonal sequences (Marth et al. Nature Genetics 1999). The probability that a site is polymorphic is

$$
P(\mathrm{SNP}) = \frac{\displaystyle\sum_{\text{all variable } S} \frac{P(S_1 \mid R_1)}{P_{\mathrm{Prior}}(S_1)} \cdots \frac{P(S_N \mid R_N)}{P_{\mathrm{Prior}}(S_N)} \, P_{\mathrm{Prior}}(S_1, \ldots, S_N)}{\displaystyle\sum_{S_{i_1} \in [A,C,G,T]} \cdots \sum_{S_{i_N} \in [A,C,G,T]} \frac{P(S_{i_1} \mid R_1)}{P_{\mathrm{Prior}}(S_{i_1})} \cdots \frac{P(S_{i_N} \mid R_N)}{P_{\mathrm{Prior}}(S_{i_N})} \, P_{\mathrm{Prior}}(S_{i_1}, \ldots, S_{i_N})}
$$

where S_i is the nucleotide at the site in sequence i, R_i is the read data for that sequence, and the numerator sums over all variable nucleotide combinations (those in which the S_i are not all identical).

We recently received a 5-year research grant from the NIH to expand our SNP detection capabilities…
1. for automated detection of somatic mutations in diploid individual samples (medical re-sequencing data)
2. for new data types produced by the latest, super-high throughput sequencing technologies
3. to address the informatics needs of detecting genetic and epigenetic changes in somatic cells that lead to cancer and that occur during cancer proliferation:
• copy number changes, chromosomal rearrangements
• changes in DNA methylation, histone modifications

[Figure: example genotypes: homozygous C, heterozygous C/T, homozygous T]

2. We quantify the demographic history of human populations using DNA variation data…

[Figure: four demographic scenarios (stationary, collapse, expansion, bottleneck), each shown as a history from past to present, with the corresponding MD (simulation) and AFS (direct form) plots]

… and build computational models of the human ancestral demographic history that underlies present-day genome polymorphism structure.
• European data: genetic bottleneck
• African data: modest but uninterrupted expansion

3. A large NIH project aims to map out human polymorphism structure to aid gene mapping…

However, the variation structure observed in the reference DNA samples genotyped by the HapMap project…
… often does not match the structure in another set of samples, such as the clinical samples used to find disease genes and disease-causing genetic variants.
… we build computational tools to help the selection of optimal genetic markers for clinical studies.

Instead of genotyping additional sets of (clinical) samples with costly experimentation, and comparing the variation structure of these consecutive sets directly…
… we generate additional samples by computational means, based on our Population Genetic models of demographic history. We then use these samples to test the efficacy of gene-mapping approaches for clinical research.

4. We develop methods to connect genotype and clinical outcome in pharmaco-genetic systems

• genetic marker (haplotype) in genome regions of drug metabolizing enzyme (DME) genes
• computational prediction based on haplotype structure
• functional allele (known metabolic polymorphism)
• molecular phenotype (drug concentration measured in blood plasma)
• clinical endpoint (adverse drug reaction)
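The Bayesian SNP probability given under heading 1 can be illustrated with a small brute-force calculation. This is only a sketch under stated assumptions, not the PolyBayes implementation: the function names, the uniform base prior of 0.25, the prior polymorphism rate, and the way the joint prior is split evenly among variable and among monomorphic nucleotide combinations are all hypothetical choices made for illustration. It enumerates all 4^N combinations, so it is only practical for a handful of aligned sequences.

```python
from itertools import product

BASES = "ACGT"


def base_posterior(called_base, error_prob):
    """P(S_i | R_i) for one read: the called base gets 1 - e,
    the three other bases share e equally (an assumed error model)."""
    return {
        b: (1.0 - error_prob) if b == called_base else error_prob / 3.0
        for b in BASES
    }


def snp_probability(read_posteriors, prior_poly=0.001):
    """Evaluate P(SNP) for one alignment column by summing over all
    nucleotide combinations (S_1, ..., S_N).

    read_posteriors: list of dicts mapping each base to P(S_i | R_i).
    prior_poly: assumed prior probability that the site is polymorphic.
    """
    n = len(read_posteriors)
    if n < 2:
        return 0.0  # a single sequence cannot show a polymorphism

    base_prior = 0.25  # assumed uniform P_Prior(S_i)
    # Assumed joint prior: the 4 monomorphic combinations share
    # 1 - prior_poly, the 4^N - 4 variable combinations share prior_poly.
    prior_mono = (1.0 - prior_poly) / 4
    prior_var = prior_poly / (4 ** n - 4)

    numerator = 0.0    # sum over variable combinations only
    denominator = 0.0  # sum over all combinations
    for combo in product(BASES, repeat=n):
        is_variable = len(set(combo)) > 1
        term = prior_var if is_variable else prior_mono
        for posterior, base in zip(read_posteriors, combo):
            term *= posterior[base] / base_prior
        denominator += term
        if is_variable:
            numerator += term
    return numerator / denominator


if __name__ == "__main__":
    mixed = [base_posterior(b, 0.001) for b in "CCTT"]
    uniform = [base_posterior(b, 0.001) for b in "CCCC"]
    print("C/C/T/T column:", snp_probability(mixed))
    print("C/C/C/C column:", snp_probability(uniform))
```

With two confident C calls and two confident T calls the probability comes out close to 1, while four agreeing C calls give a value close to 0, which is the qualitative behaviour the formula is meant to capture.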