Department of Health Informatics
65 BERGEN STREET, 3RD. FL., 350 UNIVERSITY HEIGHTS
NEWARK, NJ 07107-3001
TELEPHONE: [973] 972-6871/6499 FAX: [973] 972-8540

Ph.D. Dissertation Defense

STATISTICAL STRATEGIES FOR GENE MAPPING STUDIES OF COMPLEX DISEASE

By Andrea Lynn Maes
Ph.D. Program in Biomedical Informatics
UMDNJ-School of Health Related Professions

Research Advisor/Committee Chair: Scott R. Diehl, Ph.D.
Professor of Oral Biology
Director, Center for Pharmacogenomics and Complex Disease Research
UMDNJ-New Jersey Dental School

Committee: Tara C. Matise, Ph.D.
Associate Professor of Genetics
Rutgers University, Piscataway, NJ

Committee: Masayuki Shibata, Ph.D.
Associate Professor of Biomedical Informatics
UMDNJ-School of Health Related Professions

Monday, September 10, 2007, 1:00 p.m.
D986, Oral Health Pavilion (NJDS)

The University is an affirmative action/equal opportunity employer

Abstract

Disease gene mapping is a powerful strategy for uncovering the genetic basis of complex human diseases. Various methodological and statistical approaches for linkage and association analyses have been implemented to identify the genes underlying oligogenic traits. Careful consideration needs to be given to the design aspects of such studies in order to maximize their potential for detecting disease-causing variants. These include subject ascertainment and DNA marker map selection, as well as their effects on the statistical analysis of the data. This thesis first examines effects of pedigree structure, ascertainment, map density, and genotyping error on linkage analyses of affected sibling pairs (ASPs) when maps of either SNPs or microsatellite markers are used. The predictive power of the entropy-based information content (IC) for two common measures of linkage is explored under varying conditions of the above design characteristics.
For genetic association studies, various study designs and statistical analysis methods that can handle both family-based and case-control data are compared. These approaches are contrasted with traditional family-based tests. Finally, a novel procedure for reducing the genotyping effort required for the analysis of pedigrees is explored. This method uses previously obtained linkage data in order to infer a subset of genotypes for a genetic association analysis. The primary conclusions are: i) unaffected siblings add as much or more power to a linkage study of ASPs as both parents; ii) the IC statistic is an insensitive predictor of linkage power; iii) clusters of tightly linked SNPs perform well for linkage; iv) a modest genotyping error rate of 1% is tolerable for linkage analysis of ASPs when additional family members are available for genotyping; v) statistical tests that accommodate both family and case-control data are powerful for detecting genetic association; vi) enriching cases and controls based on family history of disease can provide very substantial increases in power for gene mapping by association; and vii) only limited increases in power for genetic association are obtained when inferring genotypes for a subset of a family, and the reduced cost of genotyping may make this strategy an inefficient approach.