International Biometric Society

MIXED GRAPHICAL MARKOV MODELS AND HIGHER-ORDER CONDITIONING FOR INVESTIGATING THE GENETICS OF GENE EXPRESSION

Inma Tur (1), Alberto Roverato (2) and Robert Castelo (1)
(1) Universitat Pompeu Fabra, Barcelona, Spain
(2) Università di Bologna, Italy

The parallel measurement of the concentration, or expression, of RNA molecules for thousands of genes enables a large-scale unbiased profiling of a heritable cellular trait that mediates the genetic basis of complex phenotypes. The resulting data form a high-dimensional multivariate sample which, to a large extent, reflects the entire phenotypic state of cells, tissues and sometimes even whole organisms. Unfortunately, expression-profiling technology also introduces additional sources of non-biological variability into these measurements. Besides the heterogeneity produced by these sources of unwanted variation, indirect effects spread across genes as a result of genetic, molecular and environmental perturbations. From a multivariate perspective, one would like to adjust for the effect of each of these factors and end up with a network model of direct associations that connects genotype to phenotype through the intervening genes, in order to study the genetics of gene expression and of higher-level phenotypes. However, the number p of genes and genetic loci to analyse as random variables far exceeds the available number n of multivariate observations, precluding the direct application of classical multivariate techniques that start with a saturated model. Moreover, genetic effects emanating from discrete genotypes may act non-additively through allele dominance and/or mask each other between different loci, a phenomenon known as epistasis.
We use the framework of mixed graphical Markov models (GMMs) and conditional Gaussian distributions to analyse the genetics of gene expression, whose primary purpose is to identify genomic regions responsible for expression variability, also known as expression quantitative trait loci (eQTL). By simulating this type of model we can learn how additive genetic effects on mixed genotype-gene interactions (eQTL) propagate through genes as a function of the magnitude of the correlation in purely continuous gene-gene associations. Standard linear theory coupled with decomposability in mixed GMMs enables us to perform an exact likelihood ratio test for the presence of mixed interactions between genotypes and gene expression profiles. We show that testing these associations exactly is critical when using higher-order conditional independences, because the asymptotic chi-squared distribution of the classical deviance test statistic under the null hypothesis breaks down with decreasing sample sizes and increasing interaction orders and conditioning sizes. We exploit mixed GMMs and higher-order conditioning by means of limited-order correlations and marginal distributions of dimension (q+2) < n, which enable the analysis of this kind of data with p >> n. Applying these procedures to data from an experimental cross between two strains of yeast allows us to learn that the larger genetic effects, caused by the engineered deletions in the genome of one of the strains, act on genes with a high number of gene-gene associations in the resulting estimate of the mixed GMM.

International Biometric Conference, Florence, ITALY, 6 – 11 July 2014
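The limited-order idea mentioned in the abstract, working with marginal distributions of dimension (q+2) < n when p >> n, can be illustrated with a minimal sketch for a pair of continuous variables. This is not the authors' implementation and it does not cover the exact test for mixed genotype-gene interactions; the function name `partial_corr_given`, the simulated chain of three "genes" and the choice of conditioning set are all assumptions made for illustration only.

```python
import numpy as np

def partial_corr_given(X, i, j, Q):
    """Sample partial correlation of columns i and j of X given the columns in Q.

    Only the (q + 2)-dimensional marginal of (i, j, Q) is used, so the full
    p x p covariance matrix never has to be inverted.
    """
    idx = [i, j] + list(Q)
    n = X.shape[0]
    if len(idx) >= n:
        raise ValueError("need (q + 2) < n for an invertible marginal covariance")
    S = np.cov(X[:, idx], rowvar=False)   # (q+2) x (q+2) marginal sample covariance
    K = np.linalg.inv(S)                  # marginal concentration matrix
    return float(-K[0, 1] / np.sqrt(K[0, 0] * K[1, 1]))

# Hypothetical simulated data with p >> n: a chain 0 -> 1 -> 2, all other columns noise.
rng = np.random.default_rng(0)
n, p = 40, 200
X = rng.standard_normal((n, p))
X[:, 1] = 0.9 * X[:, 0] + 0.3 * rng.standard_normal(n)
X[:, 2] = 0.9 * X[:, 1] + 0.3 * rng.standard_normal(n)

# Order 0: plain marginal correlation between columns 0 and 2 (indirect effect included).
r_marginal = partial_corr_given(X, 0, 2, [])

# Order q = 5: condition on a subset that contains the intervening column 1.
Q = [1, 10, 11, 12, 13]
r_limited = partial_corr_given(X, 0, 2, Q)

print(f"marginal correlation:      {r_marginal:.3f}")
print(f"order-{len(Q)} partial correlation: {r_limited:.3f}")
```

In this sketch the indirect association between columns 0 and 2 shrinks once the intervening column is part of the conditioning set, even though only a 7-dimensional marginal is ever estimated from n = 40 observations.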
