Geuvadis RNAseq analysis at UNIGE
Analysis plans

Tuuli Lappalainen
University of Geneva
Geuvadis Analysis Group Meeting, April 16 2012, Geneva

What we will do: Overview
- Coordinate everything
- Get the data together: QC, normalization, data sharing
- Regulation quantitative trait loci (rQTL): common and rare cis-regulatory variants
- Participate in Loss-of-Function analyses
- Functional annotation of both common and rare regulatory variants
- Population and evolutionary genetic analyses

Genetic effects on regulatory variation
[Diagram labels: trans-eQTLs, independent effects, miRNA/mRNA interactions, common/rare cis-variants, splicing QTLs, eQTL analysis, ASE analysis, splicing QTL analyses]
Fine-mapping the causal regulatory variants

Finding many needles and little hay
- Technical variation reduces our power in eQTL analysis: correction of covariates such as library size, sequencing batches, GC content, % mapping reads…
  - Linear regression of covariates
  - Linear regression of ~10 PCs that are expected to be some sort of summaries of technical covariates
- Population stratification may lead to false genetic associations
  - analyze EUR & YRI separately and correct for population structure within EUR with Eigenstrat

Reference allele mapping bias
[Diagram labels: SNP, INDEL, cSNP]
- ALT reads map worse or not at all
- simulation results of biased reads & sites
- remove from ASE
- reference genome test: filter biased reads from sams, redo quantifications & eQTL analysis

eQTLs: genotype association to regulatory phenotypes
The classical cis-eQTL analysis:
- all genetic variants >5% MAF 1MB from transcription start site
- Spearman rank correlation with (normalized) exon read counts
- permutations to assess significance
Expect a few thousand genes with an eQTL

Taking the eQTL approach further
Other phenotypes:
- Gene expression levels: exon read counts or transcript quantifications?
- splicing variation: links between exons (Halit Ongen @ UNIGE), Barcelona’s transcript ratios
- miRNA quantifications
Variation QTLs: variation between independent measures of an individual’s gene expression levels = stochastic variation in gene expression
[Plot axes: expr variance, genotype]
Independent regulatory variants affecting the same gene
- Regress out the first eQTL effect and redo the analysis
How to integrate eQTLs – sQTLs – vQTLs – miQTLs?

[Diagram labels: trans-eQTLs, independent effects, miRNA/mRNA interactions, common/rare cis-variants, splicing QTLs, ASE analysis]

[Diagram labels: cis eQTL*, coding SNP, mRNA-sequencing]
Statistical testing for ASE
Is the allelic ratio different from 0.5 / 0.5?
- Thousands of data points per individual
- Less noisy than expression levels
- No direct information of the causal variant

ASE applications: population genetics of regulatory effects
- Clustering of individuals (and populations)
  - Expression distance
  - ASE distance
  - Genetic distance
- Epistasis between regulatory and coding variants
  - Deficiency of putatively deleterious coding variants with high expression of the derived allele (Lappalainen et al. 2011)

ASE applications: rare regulatory variants
[Figure labels: POOL OF INDIVIDUALS; individuals marked ASE or NO ASE]
- Sharing of rare ASE effect leads to excess of sharing of the haplotype
- We have developed a statistical method to look for ASE-genotype concordance to characterize rare regulatory variants (Montgomery et al. PLoS Genetics 2011)
Stephen Montgomery

Functional annotation of regulatory variants
- Functional annotation of the genome: 1000g annotations, ENCODE, conservation, etc -> overlap with rQTLs
Can we finally get the causal variants?
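
The classical cis-eQTL test named in the slides (Spearman rank correlation between genotype and normalized exon read counts, with permutations to assess significance) can be sketched for a single variant-exon pair as below. This is a minimal illustration, not the group's pipeline: the function names, the 0/1/2 genotype coding, the toy data, and the choice of 1000 permutations are all assumptions made for the example.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Return (rho, empirical two-sided p) by permuting expression ranks."""
    geno_ranks = average_ranks(genotypes)
    expr_ranks = average_ranks(expression)
    rho = pearson(geno_ranks, expr_ranks)
    rng = random.Random(seed)
    shuffled = list(expr_ranks)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-expression link
        if abs(pearson(geno_ranks, shuffled)) >= abs(rho) - 1e-12:
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return rho, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): alternative-allele counts and normalized read counts.
toy_genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
toy_expression = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2]
rho, p_value = permutation_pvalue(toy_genotypes, toy_expression)
print(f"rho={rho:.3f}, permutation p={p_value:.4f}")
```

In a genome-wide scan this test would be repeated for every variant within the cis window of every gene, so the per-test p-values would still need a multiple-testing step that this sketch does not attempt.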
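
The ASE question posed in the slides ("Is the allelic ratio different from 0.5 / 0.5?") can be illustrated with an exact two-sided binomial test on the read counts at one heterozygous coding SNP. The slides do not say which test the group uses, so the binomial test here is an assumption, as are the function name and the example counts. The sketch also takes 0.5 as the null ratio and does not model the reference allele mapping bias discussed earlier.

```python
from math import comb


def ase_binomial_pvalue(ref_reads, alt_reads):
    """Exact two-sided binomial p-value for a 0.5 / 0.5 allelic ratio.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one. Integer arithmetic keeps the result exact even for
    sites covered by thousands of reads.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    observed = comb(n, ref_reads)
    # Under p = 0.5 every outcome k has probability comb(n, k) / 2**n.
    tail = sum(c for c in (comb(n, k) for k in range(n + 1)) if c <= observed)
    return tail / 2 ** n


# Hypothetical read counts at three heterozygous sites: (reference, alternative).
for ref, alt in [(5, 5), (8, 2), (40, 12)]:
    print(ref, alt, ase_binomial_pvalue(ref, alt))
```

Because each individual contributes thousands of such sites, the resulting p-values would need multiple-testing correction, and sites flagged by the mapping-bias simulations would be removed before testing.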