Working with enriched gene sets in R
Peter Svensson
Micheline Giphart-Gassler
Harry Vrieling
P-values of genes
• Starting with a vector of p-values from
– t.test(irradiated, control)
– wilcox.test(irradiated, control)
– lm(formula, data)
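As a minimal sketch of how such a vector of p-values could be built, the code below runs one test per gene on simulated data. The matrix layout (genes in rows, arrays in columns), the group sizes and the simulated shift are assumptions made for illustration; only the names irradiated and control come from the slide.

```r
# Simulated expression data (assumption): rows are genes, columns are arrays.
set.seed(1)
n_genes <- 200
irradiated <- matrix(rnorm(n_genes * 4), nrow = n_genes)
control    <- matrix(rnorm(n_genes * 4), nrow = n_genes)
# Shift the first 20 genes so that some genes are truly changed.
irradiated[1:20, ] <- irradiated[1:20, ] + 2

# One two-sided t-test per gene; keep only the p-value.
pvals <- sapply(seq_len(n_genes), function(i) {
  t.test(irradiated[i, ], control[i, ])$p.value
})

# One Wilcoxon rank-sum test per gene (wilcox.test in base R).
pvals_w <- sapply(seq_len(n_genes), function(i) {
  wilcox.test(irradiated[i, ], control[i, ])$p.value
})
```

Either vector can then be passed on to the steps that follow.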
Distribution of p-values
• two-tailed
Distribution of p-values
• one-tailed
Distribution of p-values
• Proportion of unchanged genes, π0
library(qvalue)
• (Storey & Tibshirani 2001)
qvalue(pvals)$pi0
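To show the idea behind π0 without the qvalue package, here is a simplified base R sketch. It uses a single fixed cut-off lambda, whereas qvalue by default smooths over a range of cut-offs, so the two estimates need not agree exactly. The simulated p-values and the choice lambda = 0.5 are assumptions.

```r
# Simulated p-values (assumption): 800 unchanged genes give uniform p-values,
# 200 changed genes give p-values concentrated near 0.
set.seed(2)
pvals <- c(runif(800), rbeta(200, 0.5, 10))

# P-values above lambda come almost only from unchanged genes.
# Those are uniform, so their share divided by the interval width estimates pi0.
lambda <- 0.5
pi0_hat <- min(1, mean(pvals > lambda) / (1 - lambda))
pi0_hat
```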
Annotation
• Annotation of the genes is available from Bioconductor
– MetaData for commercial arrays
– AnnBuilder for homemade arrays
– UniGene name, code, symbol, Entrez Gene, GO terms, KEGG pathways, PubMed IDs...
Gene Set Enrichment Analysis
• Mootha et al., Nat Genet. 2003, 34:267
• Use gene sets defined by GO terms, KEGG terms, names containing 'kinase', or genes that cluster together
• Make a vector with one value per gene (N genes in total, G genes in the group)
– all not in group: -sqrt(G/(N-G))
– all in group: sqrt((N-G)/G)
Running sum
• The sum of the values in the vector will be 0
• Plot the running sum
• Here the peak is at p = 0.1
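The vector and its running sum can be sketched in a few lines of base R. The values of N and G and the randomly drawn group membership are assumptions; genes are taken to be already ordered by p-value.

```r
set.seed(3)
N <- 1000   # genes on the array, assumed to be ordered by p-value
G <- 50     # genes in the group
# Hypothetical group membership, in ranked order.
in_set <- seq_len(N) %in% sample(N, G)

# Step values: sqrt((N-G)/G) for genes in the group, -sqrt(G/(N-G)) otherwise.
steps <- ifelse(in_set, sqrt((N - G) / G), -sqrt(G / (N - G)))

# Running sum along the ranking; it returns to 0 at the last gene.
running <- cumsum(steps)

# Draw the curve only in an interactive session.
if (interactive()) plot(running, type = "l", xlab = "Rank", ylab = "Running sum")
```

The total is zero because the G positive steps and the N-G negative steps both add up to sqrt(G(N-G)) in absolute value.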
GSEA
• The enrichment score can be used to determine the importance of a gene set.
• Permutation technique to get significance.
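A rough sketch of the permutation step follows, taking the enrichment score to be the maximum of the running sum. Note the simplification: this shuffles group membership across the ranking, while permuting the sample labels and recomputing the ranking is the more common choice in the GSEA literature. The function name enrichment_score, the simulated gene set and the 1000 permutations are assumptions.

```r
set.seed(4)
N <- 1000
G <- 50
# Hypothetical gene set concentrated among the top-ranked genes.
in_set <- seq_len(N) %in% sample(N, G, prob = (N:1)^3)

# Enrichment score: maximum of the running sum over the ranked genes.
enrichment_score <- function(in_set) {
  N <- length(in_set)
  G <- sum(in_set)
  steps <- ifelse(in_set, sqrt((N - G) / G), -sqrt(G / (N - G)))
  max(cumsum(steps))
}

es_obs <- enrichment_score(in_set)

# Null distribution: shuffle group membership across the ranks.
es_null <- replicate(1000, enrichment_score(sample(in_set)))

# Permutation p-value; the +1 terms keep it away from exactly 0.
p_perm <- (sum(es_null >= es_obs) + 1) / (length(es_null) + 1)
```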
Hypergeometric probability
• Used in dChip and DAVID.
• Input is
– # genes in the gene set (n), # genes on array (n+m)
– # selected genes in the gene set (x), # selected genes (N)
• dhyper() gives the density
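The inputs above map onto dhyper() as sketched below. The counts are made up for illustration. The argument order of dhyper(x, m, n, k) differs from the slide's letters (R's m is the gene set size), so descriptive variable names are used instead. The upper tail from phyper() is added here as the usual way to turn the density into an enrichment p-value; the slide itself only mentions dhyper().

```r
# Illustrative counts (assumptions, not from the data).
set_size   <- 40      # genes in the gene set (n on the slide)
array_size <- 10000   # genes on the array (n + m on the slide)
selected   <- 500     # selected genes (N on the slide)
hits       <- 8       # selected genes in the gene set (x on the slide)

# Probability of exactly `hits` selected genes falling in the gene set.
p_exact <- dhyper(hits, set_size, array_size - set_size, selected)

# Probability of `hits` or more: the upper tail of the distribution.
p_enrich <- phyper(hits - 1, set_size, array_size - set_size, selected,
                   lower.tail = FALSE)
```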
Selecting genes
• A threshold, p0, has to be set for the p-values; genes with p < p0 are selected
• p0 = 0.001 is not informative
• p0 = 0.1
• at the maximum of the peak
• dissect(pvals)
– (BMC Bioinformatics, to appear)
• Will get a p-value
• Tested 4000 GO terms, so correction for multiple testing is needed
p.adjust(pvals, "fdr")
• Look at significant terms, p < 0.001
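A small sketch of the multiple-testing step with base R's p.adjust(). The 4000 simulated term p-values are an assumption, as is applying the 0.001 cut-off to the adjusted rather than the raw p-values.

```r
set.seed(5)
# Hypothetical p-values for 4000 GO terms: 3960 unchanged, 40 enriched.
term_pvals <- c(runif(3960), rbeta(40, 0.5, 5000))

# "fdr" is the Benjamini-Hochberg adjustment.
adj <- p.adjust(term_pvals, method = "fdr")

# Indices of the terms that pass the cut-off after adjustment.
significant <- which(adj < 0.001)
```

Adjusted values are never smaller than the raw p-values, so fewer terms pass the same cut-off after correction.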
Cisplatin data
• Mouse embryonic stem cells exposed to various doses (low, medium and high). Harvested at 0<t<24
• Low doses, early time points
– Few genes changed
– Few pathways changed
• Indications of what will come
Preprocessing
• For internal use at www.medgencentre.nl/pla
• Not updated
• Code for working with widgets, defining MIAME-compliant object, AffyBatch (exprSet), doing tests, building linear models, correlation tests, GSEA
• Updating together with Agata Meglicz. It will be improved soon.
Demonstration
cdf = "hgu133a"
source("gsea.R")
gsea()
dissectGUI()