Knowledge Integration for Gene Target Selection
Graciela Gonzalez, PhD
Juan C. Uribe
Contact: [email protected]

GeneRanker in a Nutshell
• Integration of knowledge from
  – biomedical literature,
  – curated PPI databases, and
  – protein network topology
• Seeks to prioritize lists of genes by their association with specific diseases and phenotypes [1]
• Such associations may or may not have been published (thus, not text mining)

[1] Gonzalez G, Uribe JC, Tari L, Brophy C, Baral C. Mining Gene-Disease relationships from Biomedical Literature: Incorporating Interactions, Connectivity, Confidence, and Context Measures. Pacific Symposium on Biocomputing; 2007; Maui, Hawaii.

GeneRanker Interface
1. The user types a disease or biological process to be searched.
2. Genes found to be associated with the disease are extracted from the literature.
3. Protein-protein interactions involving those genes are then pulled from the literature and curated sources.
4. The protein network is built and each gene is ranked.
• Each gene is scored and can be annotated (count of co-occurrences and statistical representation).

Collaboration: Application of GeneRanker to a biological context, with Dr. Michael Berens, Director of the Brain Tumor Unit at the Translational Genomics Institute (TGen).
GeneRanker is available as an online application at http://www.generanker.org.

Evaluation of GeneRanker
[Figure: Mining genes related to glioma: Precision by Method. Bars: Ranked list (top 50), Ranked list (top 100), Ranked list (top 200), Gene-disease search, Random list. Axis: 0% to 100%. Categories: Related (>10 articles); Possibly Related (1 to 10 articles); No evidence of relation or not a gene.]
• Contextual (PubMed search) based extraction shows a > 20% jump in precision over NLP-based extraction.
• Synthetic network results show AUC > 0.984.
• Empirical validation against a glioma dataset shows consistent results (118 vs. 22 differentially expressed probes from the top vs. the bottom of the list).

Complementary Work
• CBioC: www.cbioc.org shows PPIs, gene-disease, and gene-bioprocess associations extracted from abstracts.
• BANNER: sourceforge.banner.org (presenting a poster on this one). An open source entity recognizer, available now.
• Gene normalization: a similar open source system, soon to be available.
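
Illustrative sketch
As a rough illustration of the idea behind the interface steps (extract gene-disease co-occurrences, pull protein-protein interactions, build the network, rank each gene) and of the precision measure used in the evaluation, the Python sketch below scores each gene from its own literature co-occurrence count plus a share of its interaction partners' counts, then computes precision over the top k of the ranked list. This is not GeneRanker's actual scoring method, which is described in [1]. The function names, the `damping` weight, the gene identifiers, and all counts are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def rank_genes(cooccurrence, interactions, damping=0.5):
    """Rank genes by literature evidence combined with network neighbourhood.

    cooccurrence: dict mapping gene -> number of articles in which the gene
                  co-occurs with the disease term (made-up counts here)
    interactions: iterable of (gene_a, gene_b) protein-protein interactions
    damping:      hypothetical weight for evidence inherited from neighbours
    """
    # Build an undirected adjacency map, ignoring self-interactions.
    neighbours = defaultdict(set)
    for a, b in interactions:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)

    genes = set(cooccurrence) | set(neighbours)
    scores = {}
    for gene in genes:
        own = cooccurrence.get(gene, 0)
        inherited = sum(cooccurrence.get(n, 0) for n in neighbours.get(gene, ()))
        scores[gene] = own + damping * inherited

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(genes, key=lambda g: (-scores[g], g))
    return ranked, scores


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked genes that appear in the relevant set."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for g in top if g in relevant) / len(top)


# Toy input: GENE_C has no direct literature evidence but interacts with
# two genes that do, so the network lifts it up the list.
cooccurrence = {"GENE_A": 12, "GENE_B": 3, "GENE_C": 0, "GENE_D": 1}
interactions = [
    ("GENE_A", "GENE_C"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]

ranked, scores = rank_genes(cooccurrence, interactions)
print(ranked)
print(precision_at_k(ranked, {"GENE_A", "GENE_B"}, k=2))
```

In this toy input, GENE_C ranks second despite having no published co-occurrence with the disease, which mirrors the point that prioritized associations may or may not have been published.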