Regulatory variation and its functional consequences
Chris Cotsapas
[email protected]
Motivating questions
• How do phenotypes vary across individuals?
– Regulatory changes drive cellular and organismal traits
– Likely also drive evolutionary differences
• How are genes (co)regulated?
– Pathways, processes, contexts
Regulatory variation
• What do “interesting” variants do?
• Genetic changes to:
– Coding sequence **
– Gene expression levels
– Splice isoform levels
– Methylation patterns
– Chromatin accessibility
– Transcription factor binding kinetics
– Cell signaling
– Protein-protein interactions
~88% of GWAS hits are regulatory
Genetic variation alters regulation
• Protein levels
– Maize (Damerval 94)
• Expression levels
– Yeast, maize, mouse, humans (Brem 02, Schadt 03, Stranger 05, Stranger 07)
• RNA splicing
– Humans (Pickrell 12, Lappalainen 13)
• Methylation and DNase I peak strength
– Humans (Degner 12; Gibbs 12)
Genetics of gene expression (eQTL)
• cis-eQTL
– The position of the eQTL maps near the physical position of the gene.
– Promoter polymorphism?
– Insertion/Deletion?
– Methylation, chromatin conformation?
• trans-eQTL
– The position of the eQTL does not map near the physical position of the gene.
– Regulator?
– Direct or indirect?
Modified from Cheung and Spielman 2009 Nat Gen
QT association
• Analysis of the relationship between a dependent or outcome variable (phenotype) and one or more independent or predictor variables (SNP genotype)
Linear Regression Equation
$Y_i = b_0 + b_1 X_i + e_i$
Logistic Regression Equation
$\ln\left(\frac{p_i}{1 - p_i}\right) = b_0 + b_1 X_i$
[Figure: continuous trait value (y-axis) against number of A1 alleles (0, 1, 2; x-axis), with a fitted line of intercept b0 and slope b1]
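As a minimal sketch of the linear model above, the following regresses a continuous trait on the number of A1 alleles. The function name `qt_association` and the simulated data (allele frequency, effect size, sample size) are illustrative assumptions, not values from the slides.

```python
import numpy as np
from scipy import stats

def qt_association(genotypes, trait):
    """Fit Y_i = b0 + b1 * X_i + e_i, where X_i is the A1 allele count (0, 1, 2)."""
    x = np.asarray(genotypes, dtype=float)
    y = np.asarray(trait, dtype=float)
    fit = stats.linregress(x, y)
    # Intercept (b0), slope (b1) and the p-value for the test b1 = 0.
    return fit.intercept, fit.slope, fit.pvalue

# Hypothetical simulated data: 200 individuals, allele frequency 0.3.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=200)
trait = 1.0 + 0.5 * genotypes + rng.normal(0.0, 1.0, size=200)

b0, b1, p = qt_association(genotypes, trait)
print(f"b0={b0:.2f}  b1={b1:.2f}  p={p:.2e}")
```

The slope b1 is the estimated change in trait value per additional copy of A1.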
eQTL analysis: a GWAS for every gene
[Figure: one association scan per gene, for gene 1 through gene N]
Cis-eQTL analysis:
Test SNPs within a pre-defined distance of gene
[Figure: SNPs tested within a 1Mb window on either side of the gene (probe)]
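The cis window can be sketched as a simple position filter. This assumes all SNPs are on the same chromosome as the gene and that the window is measured from the gene boundaries; the function name `cis_snps` and the coordinates are hypothetical.

```python
import numpy as np

def cis_snps(snp_positions, gene_start, gene_end, window=1_000_000):
    """Return indices of SNPs within `window` bp of the gene on either side."""
    pos = np.asarray(snp_positions)
    lower = max(gene_start - window, 0)  # do not run off the chromosome start
    upper = gene_end + window
    return np.flatnonzero((pos >= lower) & (pos <= upper))

# Hypothetical SNP positions (bp) and gene coordinates.
positions = [100, 1_500_000, 2_400_000, 3_600_000, 5_000_000]
selected = cis_snps(positions, gene_start=2_500_000, gene_end=2_600_000)
print(selected)
```

Only the selected SNPs would then be tested against that gene's expression, which keeps the number of tests per gene small compared with a genome-wide scan.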
cis-eQTLs are rather common
Nica et al PLoS Genet 2011
Cis-eQTLs cluster around TSS
Stranger et al PLoS Genet 2012
Open question
WHERE ARE THE TRANS eQTLS?
trans hotspots (yeast)
Brem et al Science 2002
Yvert et al Nat Genet 2003
Whole-genome eQTL analysis is an independent GWAS for expression of each gene
[Figure: one genome-wide scan per gene, for gene 1 through gene N]
Issues with trans mapping
• Power
– Genome-wide significance is 5e-8
– Multiple testing on ~20K genes
– Sample sizes clearly inadequate
• Data structure
– Bias corrections deflate variance
– Non-normal distributions
• Sample sizes
– Far too small
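To make the power problem concrete, one conservative reading of the bullets above is a Bonferroni correction of the genome-wide threshold across all genes tested. This is an illustrative assumption about how the two numbers combine, not a procedure stated on the slide.

```python
def trans_threshold(per_scan_alpha=5e-8, n_genes=20_000):
    """Per-test threshold when a genome-wide scan is repeated for every gene."""
    return per_scan_alpha / n_genes

threshold = trans_threshold()
print(f"{threshold:.1e}")  # prints 2.5e-12
```

A p-value near 2.5e-12 requires either a very large effect or a large sample, which is why typical eQTL sample sizes struggle to detect trans effects.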
But…
• Assume that trans eQTLs affect many genes…
• …and you can use multivariate methods!
Hore et al Nat Genet 2016
MHC class I; Hore et al Nat Genet 2016
Histone RNA processing; Hore et al Nat Genet 2016
trans-eQTL implies over-dispersion
Cross-phenotype meta-analysis
[Figure: three panels of −log(p) distributions, labelled λ=1, λ≠1, λ≠1]
$S_{CPMA} \sim \frac{L(\text{data} \mid \lambda \neq 1)}{L(\text{data} \mid \lambda = 1)}$
Cotsapas et al, PLoS Genetics 2011
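The likelihood ratio above can be sketched under one assumption: that −ln(p) across genes follows an exponential distribution with rate λ, with λ = 1 when the p-values are uniform (no trans effect). The function below is a hypothetical illustration of that idea, not the published CPMA implementation.

```python
import numpy as np
from scipy import stats

def cpma_like_statistic(pvalues):
    """Likelihood ratio test of exponential rate lambda = 1 on -ln(p)."""
    x = -np.log(np.asarray(pvalues, dtype=float))
    n = x.size
    lam_hat = 1.0 / x.mean()  # maximum-likelihood estimate of the rate
    loglik_alt = n * np.log(lam_hat) - lam_hat * x.sum()
    loglik_null = -x.sum()    # log-likelihood at lambda = 1
    stat = 2.0 * (loglik_alt - loglik_null)
    # Assumed reference distribution: chi-square with 1 degree of freedom.
    return lam_hat, stat, stats.chi2.sf(stat, df=1)

# Hypothetical p-values for one SNP tested against ten genes.
lam_hat, stat, p = cpma_like_statistic([1e-4] * 5 + [0.5] * 5)
print(lam_hat, stat, p)
```

An estimated rate below 1 means the −ln(p) values are larger than expected under uniformity, which is the over-dispersion a trans-eQTL affecting many genes would produce.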
[Figure: ROC curves (true positive rate against false positive rate), one panel per sample size N = 50, 100, 150, 200, 250, 300, 350, 400; curves drawn for NCP = 1, 2, 3, 4, 5]
Brynedal et al AJHG to appear
Comparing eQTLs from three African populations
Brynedal et al AJHG to appear
Prediction 1
• Allelic effects should be conserved between two populations
[Figure: direction of allelic effect (+/−) in YRI and LWK for genes with p1 < 0.05 and genes with p2 < 0.05; panels: L vs Y, M vs Y, L vs M]
Brynedal et al AJHG to appear
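One way to check Prediction 1 is a sign test: count the genes whose allelic effect has the same direction in both populations and compare that count with the 50% expected by chance. The sketch below assumes per-gene effect estimates are available for both populations; the function name and the effect values are hypothetical, and this is not necessarily the test used in the cited work.

```python
import numpy as np
from scipy import stats

def sign_concordance(beta_pop1, beta_pop2):
    """One-sided binomial test that effect directions agree more often than 50%."""
    b1 = np.asarray(beta_pop1, dtype=float)
    b2 = np.asarray(beta_pop2, dtype=float)
    keep = (b1 != 0) & (b2 != 0)  # a zero effect has no direction
    n = int(keep.sum())
    k = int((np.sign(b1[keep]) == np.sign(b2[keep])).sum())
    p = stats.binom.sf(k - 1, n, 0.5)  # P(X >= k) under chance agreement
    return k, n, p

# Hypothetical effects for five genes with directions + + - - + in both populations.
yri = [0.5, 0.2, -0.3, -0.1, 0.4]
lwk = [0.3, 0.1, -0.2, -0.4, 0.6]
k, n, p = sign_concordance(yri, lwk)
print(k, n, p)
```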
Prediction 2
• Target genes should overlap
[Figure: overlap between genes with p1 < 0.05 and genes with p2 < 0.05; panels: L vs Y, M vs Y, L vs M]
Brynedal et al AJHG to appear
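Prediction 2 can be illustrated with a hypergeometric test on the overlap between the two target gene lists. This is a sketch under the assumption that the same set of genes was tested in both populations; the function name and gene labels are hypothetical, and the cited work may use a different test.

```python
from scipy import stats

def overlap_pvalue(genes_pop1, genes_pop2, n_tested):
    """Probability of an overlap at least this large if the lists were unrelated."""
    set1, set2 = set(genes_pop1), set(genes_pop2)
    k = len(set1 & set2)
    # hypergeom.sf(k - 1, M, n, N) gives P(X >= k) for population size M,
    # n marked genes (list 1) and N draws (list 2).
    p = stats.hypergeom.sf(k - 1, n_tested, len(set1), len(set2))
    return k, p

# Hypothetical target genes (p < 0.05) in two populations, out of 10 genes tested.
shared, p_overlap = overlap_pvalue(["a", "b", "c"], ["b", "c", "d"], n_tested=10)
print(shared, p_overlap)
```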
RNAseq, GTEx
NEXT-GEN SEQUENCING DATA
GTEx – Genotype-Tissue EXpression
An NIH common fund project
Current: 35 tissues from 50 donors
Scale up: 20K tissues from 900 donors.
Novel methods groups: 5 current + RFA
How can we make RNAseq useful?
• Standard eQTLs
– Montgomery et al, Pickrell et al Nature 2010
• Isoform eQTLs
– Depth of sequence!
• Long genes are preferentially sequenced
• Abundant genes/isoforms ditto
• Power!?
• Mapping biases due to SNPs
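The length effect is commonly handled by normalising read counts by transcript length before comparing genes. The sketch below uses transcripts per million (TPM), a standard normalisation that the slides do not name; the counts and lengths are hypothetical.

```python
import numpy as np

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalised counts rescaled to sum to 1e6."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float) / 1000.0
    rate = counts / lengths_kb  # reads per kilobase of transcript
    return rate / rate.sum() * 1e6

# Two hypothetical genes: the longer one collects twice the reads
# at the same underlying abundance.
values = tpm(counts=[100, 200], lengths_bp=[1000, 2000])
print(values)
```

After normalisation both genes receive the same value, which removes the advantage a long gene has in raw read counts. It does not address the other issues listed, such as low power for rare isoforms or mapping bias at SNPs.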
RNAseq combined with other techs
• Regulons: TF gene sets via ChIP-seq
– Look for trans effects
• Open chromatin states (DNase I; methylation)
– Find active genes
– Changes in epigenetic marks correlated to RNA
– Genetic effects
• RNA/DNA comparisons
– Simultaneous SNP detection/genotyping
– RNA editing ???