EGAN: Exploratory Gene Association Networks
by Jesse Paquette
Biostatistics and Computational Biology Core
Helen Diller Family Comprehensive Cancer Center
University of California, San Francisco
(AKA BCBC HDFCCC UCSF)

EGAN
• http://akt.ucsf.edu/EGAN/
• Features
  – Downloadable Java application
    • but could be re-composed as components for a web service architecture
  – Graphics provided by Cytoscape; graph layout algorithms imported from open source
  – Data pre-loaded for analysis. Each data set must include an assay ID, a measure (e.g., correlation coefficient, expression level) and a significance value (e.g., p-value)
  – Currently for Human and Rat Genome, but other model species in August (including Arabidopsis)
• Key focus: interactive analysis of sets of genes
  – User identifies the sets interactively
  – Enrichment: uses Fisher's exact test to see whether genes in a pathway are “overrepresented” relative to chance selection. Based on the hypergeometric distribution, an n-choose-k sampling distribution
  – Gene sets graphed based on relationships
    • Counts (simply connect each gene to the others in the set; can graph multiple sets)
    • Protein-protein interaction
    • Co-occurrence in literature
  – Access to PubMed literature and external links
• For demos, slides, presentations: http://akt.ucsf.edu/EGAN/documentation.php

Producing insight from clusters and gene lists
• Summarize: find enriched pathways (and other gene sets)
  – Hypergeometric over-representation
    • DAVID
  – Global trends
    • GSEA
• Visualize: gene relationships in a graph
  – Protein-protein interactions
    • Cytoscape
  – Network module discovery
    • Ingenuity IPA
  – Literature co-occurrence
    • PubGene
• Contextualize: pertinent literature
    • PubMed
    • Google
    • iHOP

High-throughput experiments
• EGAN applies to
  – Expression microarrays
  – aCGH
  – SNP/CNV arrays
  – MS/MS Proteomics
  – DNA methylation
  – ChIP-Seq
  – RNA-Seq
  – In-silico experiments
• If parts of the output can be mapped to gene IDs
  – You can use EGAN

Gene sets
• EGAN contains a database of gene sets
  – You can also add your own
  – Download from MSigDB (Broad)
• A gene set defines a semantically-meaningful subset of genes
  – Signaling or metabolic pathway
  – Gene Ontology (GO) term
  – Previously-reported gene list (“signature”)
  – Cytoband
  – Transcription factor targets
  – miRNA targets
  – Conserved domain
  – Drug targets
  – etc.

Gene-gene relationships
• EGAN contains
  – Protein-protein interactions (PPI)
  – Literature co-occurrence
  – Chromosomal adjacency
  – Kinase-target relationships

The article will be shown in your default web browser.

Finding Counts

EGAN Summary: Exploratory Gene Association Networks
• Methods: state-of-the-art analysis of clusters and gene lists
  – Hypergeometric enrichment of gene sets
  – Global trends of gene sets
  – Graph visualization
  – Literature identification
  – Network module discovery
• User Interface: responds quickly to new queries from the biologist
  – Fluid adjustment of p-value cutoffs
  – Point-and-click interface
  – All data in-memory for immediate access
  – Links to external websites
• Modular: integrates as a flexible plug-and-play cog
  – All data is customizable
  – Proprietary data can be restricted to the client location
  – Java runs on almost every OS (PC, Mac, Linux)
  – Can be configured and launched from a different application (e.g., GenePattern)
  – Analyses can be scripted for automation

Keys to getting the most out of EGAN
• Don’t panic!
• Load as much data as possible
  – Assay results for every gene
  – Multiple experiments
  – Pathways and gene sets
    • MSigDB
  – Previously-published gene lists and clusters
    • Supplementary data
    • Oncomine
• Think about the context of the experiment
  – Show appropriate genes on graph
• Think about the semantic meaning of the enriched gene sets
  – Show appropriate gene sets on graph
• Follow links to literature
• Use appropriate Google/PubMed search queries
• Create high-quality reports
  – Save your custom gene sets
  – Export graph screenshots to PDF
  – Export tables with enrichment scores to Excel
  – Record details in your lab notebook

Where to find EGAN
• Website
  – http://akt.ucsf.edu/EGAN/
• 2010 paper in Bioinformatics
  – http://www.ncbi.nlm.nih.gov/pubmed/19933825
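
Worked example: hypergeometric over-representation

The slides describe enrichment as a Fisher's exact test based on the hypergeometric distribution. The sketch below shows one common way to compute such a one-sided p-value in Python. It is an illustration only, not EGAN's own code (EGAN is a Java application and its implementation is not shown in these slides). The function name, parameter names, and toy gene IDs are all assumptions made for this example.

```python
from math import comb


def enrichment_p_value(universe_size, set_size, selected_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size: number of genes that could have been selected
    set_size:      genes in the gene set (e.g. a pathway) within the universe
    selected_size: genes the user selected (e.g. a cluster or gene list)
    overlap:       selected genes that also belong to the gene set
    """
    if not 0 <= set_size <= universe_size:
        raise ValueError("set_size must lie between 0 and universe_size")
    if not 0 <= selected_size <= universe_size:
        raise ValueError("selected_size must lie between 0 and universe_size")

    max_overlap = min(set_size, selected_size)
    if overlap > max_overlap:
        return 0.0  # an overlap this large is impossible

    # Smallest overlap that is possible or of interest.
    lowest = max(overlap, 0, selected_size - (universe_size - set_size))

    # Count the ways to draw exactly k set members and (selected_size - k)
    # non-members, summed over every k at least as large as the observed one.
    favourable = sum(
        comb(set_size, k) * comb(universe_size - set_size, selected_size - k)
        for k in range(lowest, max_overlap + 1)
    )
    return favourable / comb(universe_size, selected_size)


if __name__ == "__main__":
    # Hypothetical toy data: 20 genes in total, a 5-gene pathway,
    # and a 4-gene user selection that shares 3 genes with the pathway.
    universe = {f"gene{i}" for i in range(20)}
    pathway = {"gene0", "gene1", "gene2", "gene3", "gene4"}
    selected = {"gene0", "gene1", "gene2", "gene10"}

    p = enrichment_p_value(
        len(universe), len(pathway), len(selected), len(pathway & selected)
    )
    print(f"enrichment p-value: {p:.4f}")
```

With these toy numbers the tail sums to (10 × 15 + 5 × 1) / 4845 = 155 / 4845, roughly 0.032. A small p-value suggests the pathway is over-represented in the selection relative to chance. In practice the universe should contain only the genes the assay could have measured, and p-values computed over many gene sets would normally be corrected for multiple testing.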