Working with enriched gene sets in R
Peter Svensson, Micheline Giphart-Gassler, Harry Vrieling

P-values of genes
• Starting with a vector of p-values from
  – t.test(irradiated, control)
  – wilcox.test(irradiated, control)
  – lm(formula, data)

Distribution of p-values
• two-tailed

Distribution of p-values
• one-tailed

Distribution of p-values
• Proportion of unchanged genes, π0 (Storey & Tibshirani 2001)
  library(qvalue)
  qvalue(pvals)$pi0

Annotation
• Annotation of the genes is available from Bioconductor
  – MetaData for commercial arrays
  – AnnBuilder for homemade arrays
  – UniGene name, code, symbol, Entrez Gene, GO terms, KEGG pathways, PubMed IDs...

Gene Set Enrichment Analysis
• Mootha et al., Nat Genet. 2003, 34:267
• Use gene sets defined by GO terms, KEGG terms, names containing 'kinase', or genes that cluster together
• Make a vector (N genes in total, G genes in the group) with
  – all not in group: -sqrt(G/(N-G))
  – all in group: sqrt((N-G)/G)

Running sum
• The sum of the values in the vector will be 0
• Plot the running sum
• In the plot, the peak is at p = 0.1

GSEA
• The enrichment score can be used to determine the importance of a gene set.
• Permutation technique to get significance.

Hypergeometric probability
• Used in dChip and DAVID.
• Input is
  – # genes in the gene set (n), # genes on array (n+m)
  – # selected genes in the gene set (x), # selected genes (N)
• dhyper() gives the density

Selecting genes
• Have to set a threshold, p0, for the p-values; genes with p < p0 are selected
• p0 = 0.001 is not informative
• p0 = 0.1, at the maximum of the peak
• dissect(pvals) – (BMC Bioinformatics, to appear)
• Each tested gene set gets a p-value
• Tested 4000 GO terms, so correction for multiple testing is needed: p.adjust(pvals, "fdr")
• Look at significant terms, p < 0.001

Cisplatin data
• Mouse embryonic stem cells exposed to various doses (low, medium and high).
• Harvested at 0 < t < 24
• Low doses, early time points
  – Few genes changed
  – Few pathways changed
• Indications of what will come

Preprocessing
• For internal use at www.medgencentre.nl/pla
• Not updated
• Code for working with widgets, defining a MIAME-compliant object, AffyBatch (exprSet), doing tests, building linear models, correlation tests, GSEA
• Updating together with Agata Meglicz. It will be improved soon.

Demonstration
cdf = "hgu133a"
source("gsea.R")
gsea()
dissectGUI()
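
The running-sum construction from the Gene Set Enrichment Analysis slides can be sketched in base R roughly as follows. This is a minimal illustration, not the code in gsea.R: the function name running_sum, the simulated p-values and the set membership are all hypothetical, and ordering the genes by increasing p-value is an assumption suggested by the "peak at p = 0.1" slide.

```r
# Hypothetical example data: 1000 genes, 50 of them in the gene set.
set.seed(1)
N <- 1000                       # number of genes on the array
in_set <- rep(FALSE, N)
in_set[sample(N, 50)] <- TRUE   # membership of each gene in the gene set
pvals <- runif(N)               # placeholder p-values, one per gene

# Running sum over genes ordered by p-value (assumed ordering).
running_sum <- function(pvals, in_set) {
  N <- length(pvals)
  G <- sum(in_set)
  # Step sizes from the slide: positive for members, negative for non-members,
  # chosen so that the steps sum to 0 over all genes.
  steps <- ifelse(in_set, sqrt((N - G) / G), -sqrt(G / (N - G)))
  ord <- order(pvals)
  list(p = pvals[ord], rs = cumsum(steps[ord]))
}

res <- running_sum(pvals, in_set)
es <- max(res$rs)                        # enrichment score: peak of the running sum
p_at_peak <- res$p[which.max(res$rs)]    # p-value at which the peak occurs

# Permutation technique: shuffle set membership and recompute the peak.
perm_es <- replicate(1000, max(running_sum(pvals, sample(in_set))$rs))
perm_p <- mean(perm_es >= es)            # fraction of permutations at least as extreme

plot(res$p, res$rs, type = "l", xlab = "p-value", ylab = "running sum")
```

With real data, pvals would come from the tests on the first slide and in_set from the annotation (for example, all genes carrying one GO term).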

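The hypergeometric slide names dhyper() for the density; an enrichment p-value is usually taken from the upper tail, which base R provides as phyper(). The sketch below uses the slide's symbols (n, m, x, N) with made-up counts, and the extra gene-set p-values passed to p.adjust() are likewise hypothetical. It is not the dissect() method.

```r
# Slide notation: n genes in the gene set, n + m genes on the array,
# N selected genes, x of which are in the gene set.
# All counts below are made up for illustration.
n <- 40      # genes in the gene set
m <- 9960    # genes on the array that are not in the set
N <- 500     # selected genes (p < p0)
x <- 8       # selected genes that are in the gene set

# Probability of exactly x set members among the N selected genes.
dens <- dhyper(x, n, m, N)

# Enrichment p-value: probability of x or more set members.
# phyper(q, ..., lower.tail = FALSE) gives P(X > q), hence q = x - 1.
p_enrich <- phyper(x - 1, n, m, N, lower.tail = FALSE)

# With many gene sets tested, adjust the per-set p-values.
set_pvals <- c(p_enrich, 0.20, 0.0004, 0.76)   # hypothetical p-values
adj <- p.adjust(set_pvals, method = "fdr")
```

Note that N on this slide is the number of selected genes, not the total number of genes as on the running-sum slide.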