EGAN – Basic Ideas and Terminology
Jesse Paquette
2010-08-23
Biostatistics and Computational Biology Core
Helen Diller Family Comprehensive Cancer Center
University of California, San Francisco
(AKA BCBC HDFCCC UCSF)
Nodes
• A node is an item on a graph
• EGAN contains two types of nodes
– Entrez Gene nodes
• Each represents a single gene backed by an Entrez Gene ID
– Association nodes
• Each represents a semantically-related set of Entrez Gene nodes
Nodes are backed by web references
• Right-click on a gene node
– Description/summary
– Links to web references
Edges
• An edge connects two nodes (as a line)
• The node NRAS has an edge connecting it to MAPK1, BRAF and MAPK signaling pathway
– All those nodes are “neighbors” of NRAS
• EGAN contains many types of edges
– Edges between gene nodes
• Protein-protein interactions (PPI)
– BRAF has a PPI with MAPK1
• PubMed co-occurrence
– NRAS and BRAF are mentioned in the same article(s)
• Chromosomal adjacency
– Genes are adjacently located on the chromosome
– Edges between gene and association nodes
• Show which genes belong to which gene sets
• All genes shown are members of the MAPK signaling pathway
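The node, edge and neighbor terms above can be pictured with a small adjacency structure. This is only an illustrative sketch, not EGAN's internal data model: the function name `build_adjacency` and the edge-type labels are hypothetical, and the type of the NRAS to MAPK1 edge is left as "unspecified" because the slide does not state it.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Map each node to the set of nodes it shares an edge with."""
    adjacency = defaultdict(set)
    for node_a, node_b, _edge_type in edges:
        adjacency[node_a].add(node_b)
        adjacency[node_b].add(node_a)
    return adjacency

# Edges from the slide's example; the type labels are illustrative only.
EDGES = [
    ("NRAS", "MAPK1", "unspecified"),
    ("NRAS", "BRAF", "PubMed co-occurrence"),
    ("BRAF", "MAPK1", "PPI"),
    ("NRAS", "MAPK signaling pathway", "gene set membership"),
    ("BRAF", "MAPK signaling pathway", "gene set membership"),
    ("MAPK1", "MAPK signaling pathway", "gene set membership"),
]

adjacency = build_adjacency(EDGES)
nras_neighbors = adjacency["NRAS"]
print(sorted(nras_neighbors))
```

Here the neighbors of NRAS include both gene nodes and an association node, which matches the way the slide uses the word.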
Most edges are backed by literature
• You can link out to each article and to pre-defined search queries by right-clicking on each edge
• Reference counts can be displayed on edges
– e.g. 31 articles are available that discuss NRAS and BRAF
How to use EGAN
• Load your experiment results using the Launch EGAN Wizard
• Your data must be in the proper 3-column format
– ID, statistic (e.g. fold-change), p-value (or q-value/FDR estimate)
• You should include all genes/proteins from your assay
– i.e. don’t do a p/q value cutoff beforehand!
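As a rough illustration of the 3-column layout described above, the sketch below parses and sanity-checks such a table. The slides do not specify the delimiter or whether a header row is expected, so the tab delimiter, the header row, the function name `read_results` and the example values are all assumptions made for illustration.

```python
import csv
import io

def read_results(handle):
    """Parse rows of (ID, statistic, p-value) from a tab-delimited table.

    Assumes one header row; both the header and the delimiter are
    assumptions, not documented EGAN requirements.
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the assumed header row
    rows = []
    for line_no, row in enumerate(reader, start=2):
        if len(row) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(row)}")
        gene_id, statistic, p_value = row[0], float(row[1]), float(row[2])
        if not 0.0 <= p_value <= 1.0:
            raise ValueError(f"line {line_no}: p-value {p_value} outside [0, 1]")
        rows.append((gene_id, statistic, p_value))
    return rows

# Made-up values; note that the non-significant gene is kept in the file.
example = (
    "ID\tstatistic\tp_value\n"
    "NRAS\t2.1\t0.0004\n"
    "BRAF\t-1.3\t0.03\n"
    "MAPK1\t0.1\t0.81\n"
)
rows = read_results(io.StringIO(example))
print(rows)
```

The example keeps MAPK1 despite its large p-value, in line with the advice not to apply a p/q-value cutoff before loading.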
How to use EGAN
• Remember to specify the proper background of genes
– Chip-based experiments
• Keep all genes that were available on your chip
– RNA-Seq experiments
• Keep all genes that have transcript IDs
– Proteomics experiments
• Keep all with Protein IDs
– Multiple experiments
• Keep all genes that could have been discovered as significant by all experiments
How to use EGAN
• Find gene nodes of interest
– Use the Entrez Gene Node Table
Click a column header to sort, then click-and-drag to select the top gene rows
How to use EGAN
• Show selected genes on the Network View
• Using information from the “focused” experiment
– Gene node border color is relative to its statistic
– Gene node border width is relative to the –log(p-value)
• Run layout algorithms, investigate gene information
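To make the border-width rule above concrete, here is a small sketch of how a p-value could be mapped to a width through –log(p-value). The slides do not give the log base or the scaling EGAN actually uses, so base 10, the function name `border_width` and the `min_width`, `scale` and `max_width` parameters are all hypothetical.

```python
import math

def border_width(p_value, min_width=1.0, scale=1.0, max_width=10.0):
    """Map a p-value to a border width that grows with -log10(p).

    The base-10 log and the linear scaling are assumptions for
    illustration, not EGAN's documented behaviour.
    """
    clamped = max(p_value, 1e-300)  # avoid log(0) for p-values reported as 0
    width = min_width + scale * -math.log10(clamped)
    return min(max_width, width)

for p in (0.5, 0.05, 0.001):
    print(p, round(border_width(p), 2))
```

Under this mapping smaller p-values always give thicker borders, up to the cap, which is the qualitative behaviour the slide describes.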
How to use EGAN
• Calculate hypergeometric enrichment for association nodes
– Lower p-values indicate association nodes that have a high degree of overlap with the set of visible gene nodes
• Selectively show enriched association nodes
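The hypergeometric enrichment mentioned above can be sketched as a one-sided (upper-tail) test, which is a common formulation for gene set over-representation. The slides do not spell out EGAN's exact computation, so treat this as an assumption; the function name `hypergeom_enrichment_p`, its parameter names and the example counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(background, set_size, visible, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    background: number of genes in the background
    set_size:   number of background genes in the association node's gene set
    visible:    number of visible (selected) gene nodes
    overlap:    number of visible genes that belong to the gene set
    """
    upper = min(set_size, visible)
    total = comb(background, visible)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, visible - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Made-up counts: 8 of 50 visible genes fall in a 250-gene set,
# against a background of 20000 genes.
p = hypergeom_enrichment_p(background=20000, set_size=250, visible=50, overlap=8)
print(p)
```

Because `background` enters the calculation directly, the resulting p-value depends on the background gene list, which is why the earlier slide stresses specifying it properly.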
How to use EGAN
• Think about how the gene nodes, edges and enriched association nodes relate to your experiment
• Remember to follow links to web references and literature
• Consider different gene sets from your experiment
– Change the p-value cutoff and see how the network and enrichments change
– Investigate the up-regulated genes, the down-regulated genes and the combined set
• Perform GSEA-like (AKA global, rank-based) enrichment
– Must be specified beforehand in the Launch EGAN Wizard – see 10) SEED Enrichment
– Note how the different enrichment algorithms compare/contrast
• Construct a module that adds non-significant connecting genes to the network
• Perform a combined analysis using the results of multiple experiments
– Load multiple assay results
– Compare your genes to gene lists from previous publications
• Remember to save your gene lists as groups in EGAN and save snapshots of interesting networks
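The suggestion above to vary the p-value cutoff and to look at up-regulated, down-regulated and combined gene sets can be sketched outside EGAN as follows. This assumes the statistic is signed (for example a log fold-change), so that its sign indicates direction; the function name `split_gene_sets` and the example values are hypothetical.

```python
def split_gene_sets(rows, p_cutoff=0.05):
    """Split (ID, statistic, p-value) rows into up, down and combined sets.

    Assumes a signed statistic: positive means up-regulated,
    negative means down-regulated.
    """
    up = {gene for gene, stat, p in rows if p <= p_cutoff and stat > 0}
    down = {gene for gene, stat, p in rows if p <= p_cutoff and stat < 0}
    return {"up": up, "down": down, "combined": up | down}

# Made-up example values.
rows = [("NRAS", 2.1, 0.0004), ("BRAF", -1.3, 0.03), ("MAPK1", 0.1, 0.81)]

strict = split_gene_sets(rows, p_cutoff=0.001)
loose = split_gene_sets(rows, p_cutoff=0.05)
print(strict)
print(loose)
```

Comparing `strict` and `loose` shows how the selected gene sets, and therefore any downstream enrichment, shift with the cutoff.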
More information
• See http://akt.ucsf.edu/EGAN/
– Post questions to the discussion forum
– Send an email to the developers