Geuvadis RNAseq analysis at UNIGE
Analysis plans
Tuuli Lappalainen
University of Geneva
Geuvadis Analysis Group Meeting, April 16 2012, Geneva
What we will do: Overview

Coordinate everything

Get the data together: QC, normalization, data sharing

Regulatory quantitative trait loci (rQTLs): common and rare cis-regulatory variants

Participate in Loss-of-Function analyses

Functional annotation of both common and rare regulatory variants

Population and evolutionary genetic analyses
Genetic effects on regulatory variation
[Diagram labels: trans-eQTLs; independent effects; miRNA/mRNA interactions; common/rare cis-variants; splicing QTLs]
[Analyses shown: eQTL analysis; ASE analysis; splicing QTL analyses]
Fine-mapping the causal regulatory variants
Finding many needles and little hay


Technical variation reduces our power in eQTL analysis: correct for covariates such as library size, sequencing batch, GC content, % mapped reads…

Linear regression of covariates

Linear regression of ~10 PCs that are expected to summarize the technical covariates
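
As a rough illustration of what "linear regression of covariates" could mean in practice, the sketch below removes the linear effect of one covariate at a time from a gene's expression values and keeps the residuals. The function names and toy numbers are hypothetical and not taken from the Geuvadis pipeline. Removing covariates one by one matches a joint regression only when the covariates are mutually uncorrelated, which holds for PC scores but not necessarily for raw technical covariates.

```python
def residualize(values, covariate):
    """Residuals of a simple least-squares fit: values ~ covariate."""
    n = len(values)
    mean_x = sum(covariate) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        # A constant covariate carries no information; only centre the values.
        return [y - mean_y for y in values]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    return [(y - mean_y) - slope * (x - mean_x)
            for x, y in zip(covariate, values)]


def correct_for_covariates(values, covariates):
    """Remove several covariates in turn (hypothetical helper).

    Equivalent to a joint regression only if the covariates are
    mutually uncorrelated, e.g. principal component scores.
    """
    corrected = list(values)
    for covariate in covariates:
        corrected = residualize(corrected, covariate)
    return corrected


# Toy example: expression driven entirely by library size (made-up numbers).
library_size = [1.0, 2.0, 3.0, 4.0]
expression = [3.0, 5.0, 7.0, 9.0]
residuals = correct_for_covariates(expression, [library_size])
```

The residuals, rather than the raw values, would then be carried into the association tests.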
Population stratification may lead to false genetic associations

Analyze EUR & YRI separately and correct for population structure within EUR with Eigenstrat
Reference allele mapping bias
[Diagram: SNP, INDEL and cSNP relative to the reference genome; ALT reads map worse or not at all]
Simulation results of biased reads & sites -> remove from ASE
Test: filter biased reads from SAM files, redo quantifications & eQTL analysis
eQTLs: genotype association to regulatory phenotypes
The classical cis-eQTL analysis:


all genetic variants >5% MAF

within 1 Mb of the transcription start site

Spearman rank correlation with (normalized) exon read counts

permutations to assess significance
Expect a few thousand genes with an eQTL
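
The test for a single variant-exon pair in the list above can be sketched as follows: a Spearman rank correlation between genotype dosage and normalized read counts, with a permutation p-value obtained by shuffling the expression values. This is a minimal sketch under stated assumptions. The function names, the number of permutations, and the toy data are hypothetical, and the actual permutation scheme used in the project is not described in these slides.

```python
import random


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no measurable correlation
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Two-sided empirical p-value from shuffling expression labels."""
    rng = random.Random(seed)
    observed = abs(spearman(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one keeps p strictly above 0


# Toy data: genotype dosage (0/1/2) and made-up normalized read counts.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
expression = [1.0, 1.2, 0.9, 1.1, 2.0, 2.3, 1.9, 2.1, 3.2, 2.9, 3.1, 3.0]
rho = spearman(genotypes, expression)
p_value = permutation_pvalue(genotypes, expression)
```

In a genome-wide scan the same test would be repeated for every variant above the MAF threshold inside the cis window of each exon, which is why a permutation-based significance threshold is attractive.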
Taking the eQTL approach further

Other phenotypes:

Gene expression levels: exon read counts or transcript quantifications?

splicing variation: links between exons (Halit Ongen @ UNIGE), Barcelona’s transcript ratios

miRNA quantifications

Variation QTLs: variation between independent measures of an individual’s gene expression levels = stochastic variation in gene expression

[Plot: expression variance vs. genotype]
Independent regulatory variants affecting the same gene

Regress out the first eQTL effect and redo the analysis
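
One way to read "regress out the first eQTL effect and redo the analysis" is as a conditional scan: fit expression on the lead variant, keep the residuals, and test the remaining variants against those residuals. The sketch below assumes a simple linear fit and uses a Pearson correlation for the second pass to stay short. The names and genotypes are made up for illustration and do not come from the slides.

```python
def centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pearson(x, y):
    xc, yc = centered(x), centered(y)
    sxx = sum(a * a for a in xc)
    syy = sum(b * b for b in yc)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum(a * b for a, b in zip(xc, yc)) / (sxx * syy) ** 0.5


def regress_out(expression, genotypes):
    """Residual expression after a least-squares fit on one variant."""
    gc, ec = centered(genotypes), centered(expression)
    slope = sum(g * e for g, e in zip(gc, ec)) / sum(g * g for g in gc)
    return [e - slope * g for g, e in zip(gc, ec)]


# Made-up example: two variants that both act on the same gene.
lead_snp = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0, 1, 2]
second_snp = [0, 1, 0, 1, 0, 1, 2, 2, 2, 0, 1, 2]
expression = [2.0 * a + 1.0 * b for a, b in zip(lead_snp, second_snp)]

residual = regress_out(expression, lead_snp)
r_lead = pearson(lead_snp, residual)      # close to zero by construction
r_second = pearson(second_snp, residual)  # signal left for the second variant
```

A variant that remains associated with the residuals is a candidate independent effect, although in real data linkage disequilibrium with the lead variant complicates the interpretation.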

How to integrate eQTLs – sQTLs – vQTLs – miQTLs?
ASE analysis
[Diagram: cis eQTL*, coding SNP, mRNA-sequencing reads]
Statistical testing for ASE: is the allelic ratio different from 0.5 / 0.5?
Thousands of data points per individual
Less noisy than expression levels
No direct information on the causal variant
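
The question "is the allelic ratio different from 0.5 / 0.5?" can be illustrated with an exact two-sided binomial test on the reference and alternative read counts at one heterozygous coding SNP. The slides do not name the test actually used, so the binomial model, the function name, and the read counts below are assumptions. The expected ratio is left as a parameter because reference allele mapping bias can shift it away from 0.5.

```python
from math import comb


def ase_binomial_pvalue(ref_reads, alt_reads, expected_ref_ratio=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one. Intended for modest read counts; very deep
    coverage would need a log-space or library implementation.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence either way
    p = expected_ref_ratio

    def pmf(k):
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    observed = pmf(ref_reads)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    total = sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * tolerance)
    return min(1.0, total)


# Made-up read counts at one heterozygous site.
balanced = ase_binomial_pvalue(10, 10)
imbalanced = ase_binomial_pvalue(30, 10)
```

With thousands of heterozygous sites per individual, the resulting p-values would still need a multiple-testing correction before calling ASE.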
ASE applications: population genetics of regulatory effects
Clustering of individuals (and populations)
[Figure panels: Expression distance; ASE distance; Genetic distance]
Epistasis between regulatory and coding variants
Deficiency of putatively deleterious coding variants with high expression of the derived allele (Lappalainen et al. 2011)
ASE applications: rare regulatory variants
[Diagram: pool of individuals, each labelled ASE or NO ASE]
Sharing of a rare ASE effect leads to excess sharing of the haplotype
We have developed a statistical method to look for ASE-genotype concordance to characterize rare regulatory variants (Montgomery et al., PLoS Genetics 2011)
Stephen Montgomery
Functional annotation of regulatory variants

Functional annotation of the genome: 1000g annotations, ENCODE, conservation, etc. -> overlap with rQTLs

Can we finally get the causal variants?