BeeSpace: An Interactive Environment for Functional Analysis of Social Behavior

Bruce Schatz, Principal Investigator
Graduate School of Library & Information Science (GSLIS)
Department of Computer Science, Program in Neuroscience
[email protected], www.canis.uiuc.edu

Theme for Genomics of Neural and Behavioral Plasticity
www.beespace.uiuc.edu

IGB Thematic Research Seminar, November 2, 2004
INSTITUTE FOR GENOMIC BIOLOGY
University of Illinois at Urbana-Champaign

Bee Counted – Vote Today!

BeeSpace FIBR Project

BeeSpace project is NSF FIBR flagship
- Frontiers in Integrative Biological Research, $5M for 5 years at University of Illinois
- Nature-Nurture using honey bee as model
- Genome technologies in wet lab and dry lab biology

Localized Gene Expression for Normal Social Behavior
- Gene Robinson, Entomology (behavioral expressions)
- Susan Fahrbach, Entomology (anatomical localization)
- Sandra Rodriguez-Zas, Animal Sciences (data analysis)

Interactive Information System for Functional Analysis
- Bruce Schatz, Library & Information Science (info systems)
- ChengXiang Zhai, Computer Science (text analysis)
- Chip Bruce, Library & Information Science (user support)

Post-Genome Informatics

Classical Organisms have extensive Genetic Descriptions.
There will be NO more classical organisms beyond Mice and Men, other than Worms and Flies, Yeasts and Weeds.

So must use comparative genomics to classical organisms, via sequence homologies and literature analysis.
- Automatic annotation of genes to standard classifications, such as Gene Ontology via sequence homology.
- Automatic analysis of functions to scientific literature, such as concept spaces via text mining.

Descriptions in Literature MUST be used for future interactive environments for functional analysis!
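The idea of a concept space built by text mining can be illustrated with a minimal sketch. The slides do not say which statistics are used, so this example assumes the simplest possible stand-in: counting how often two concepts appear in the same abstract. The function names and the toy abstracts are hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_concept_space(abstracts):
    """Map each concept to a Counter of the concepts that co-occur with it.

    `abstracts` is an iterable of concept lists, one list per abstract.
    Plain co-occurrence counting is an assumption made for this sketch,
    not the method described in the talk.
    """
    space = defaultdict(Counter)
    for concepts in abstracts:
        # Count each unordered pair of distinct concepts once per abstract.
        for a, b in combinations(sorted(set(concepts)), 2):
            space[a][b] += 1
            space[b][a] += 1
    return space


def related_concepts(space, concept, top_n=3):
    """Return up to `top_n` concepts that co-occur most often with `concept`."""
    neighbors = space.get(concept, Counter())
    return [name for name, _ in neighbors.most_common(top_n)]


# Hypothetical toy data: each inner list holds the concepts of one abstract.
toy_abstracts = [
    ["foraging", "gene expression", "honey bee"],
    ["foraging", "honey bee", "mushroom body"],
    ["gene expression", "mushroom body"],
]
toy_space = build_concept_space(toy_abstracts)
print(related_concepts(toy_space, "foraging", top_n=1))
```

Navigating from one concept to its strongest neighbors, and then onward from those, is one simple way to picture conceptual navigation over a literature collection.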
Informational Science

Computational Science is widely accepted as the Third Branch of Science (beyond Experimental and Theoretical)
- Genes are Computed, Proteins are Computed, Sequence "equivalences" are Computed.

Informational Science is coming to be accepted as the Fourth Branch of Science
- Based on Information Science technologies for Functional Mining of Information Sources
- Comparative Analysis within the Dry Lab of Biological Knowledge

Conceptual Navigation in BeeSpace

[Diagram labels. Users: Behavioral Biologist, Neuroscientist, Molecular Biologist. Sources: Bee Literature, Molecular Biology Literature, Neuroscience Literature, Brain Gene Expression Profiles, Brain Region Localization, Bee Genome, FlyBase, WormBase]

Biology: The Model Organism

The Western Honey Bee, Apis mellifera, has become a primary model for social behavior

Complex social behavior in controllable urban environment
- Normal Behavior – honey bees live in the wild
- Controllable Environment – hives can be modified

Small size manageable with current genomic technology
- Capture bees on-the-fly during normal behavior
- Record gene expressions for whole-brain or brain-region

Informatics: From Bases to Spaces

data Bases support genome data
- e.g. FlyBase has sequences and maps
- Genes annotated by GeneOntology and linked to literature
- BeeBase (Christine Elsik, Texas A&M) uses computed homologies to annotate genes

information Spaces support biomedical literature, e.g.
BeeSpace uses automatically generated conceptual relationships to navigate functions

BeeSpace Software Environment

Will build a Concept Space of Biomedical Literature for Functional Analysis of Bee Genes
- Partition Literature into Community Collections
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Links from Documents into Databases

Locate Candidate Genes in Related Literatures, then follow links into Genome Databases

BeeSpace Software Implementation

Natural Language Processing
- Identify noun phrases
- Recognize biological entities

Statistical Information Retrieval
- Compute statistical contexts
- Support conceptual navigation

Network Information System
- Concept switch across community collections
- Semantic Links into biological databases

BeeSpace Information Sources

Biomedical Literature
- Medline (medicine)
- Biosis (biology)
- Agricola, CAB Abstracts, Agris (agriculture)

Model Organisms (heredity)
- Gene Descriptions (FlyBase, WormBase)

Natural Histories (environment)
- BeeKeeping Books (Cornell Library, Harvard Press)

Worm Community System (1991)

WCS Information Sources
- Literature: Biosis, Medline, newsletters, meetings
- Data: Genes, Maps, Sequences, strains, cells

WCS Interactive Environment
- Browsing: search, navigation
- Filtering: selection, analysis
- Sharing: linking, publishing

WCS: 250 users at 50 labs across Internet (1991)
Flagship in NSF National Collaboratory program

[Slides with titles only, no text content: WCS Molecular; WCS Cellular; WCS PPCS demo]

Medical Concept Spaces (1998)

Obtain discipline-scale collection
- Medline from NLM, 10M bibliographic abstracts
- human classification: Medical Subject Headings

Partition discipline into Community Repositories
- 4 core terms per abstract for MeSH classification
- 32K nodes with core terms (classification tree)

Community is all abstracts classified by core term
- 40M abstracts containing 280M concepts
- computation took 2 days on NCSA Origin 2000

Simulating World of Medical Communities
- 10K repositories with > 1K abstracts (1K w/ > 10K)

Navigation in MedSpace

For a patient with Rheumatoid Arthritis
- Find a drug that reduces the pain (analgesic)
- but does not cause stomach (gastrointestinal) bleeding

[Slides with titles only, no text content: Choose Domain; Concept Search; Concept Navigation; Retrieve Document; Biomedical Session; Categories and Concepts; Concept Switching; Document Retrieval]

Biological Concept Spaces (2005)

Compute concept spaces for All of Biology
- BioSpace across entire biomedical literature
- 50M abstracts across 50K repositories

Use Gene Ontology to partition literature into biological communities for functional analysis
- GO same scale as MeSH but adequate coverage?
- GO light on social behavior (biological process)

Interactive Functional Analysis

BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system that goes beyond a searchable database, using statistical literature analyses to discover functional relationships between genes and behavior.
- Genes to Behaviors
- Behaviors to Genes
- Concepts to Concepts
- Clusters to Clusters
- Navigation across Sources

BeeSpace Information Sources

General for All Spaces:
Scientific Literature
- Medline, Biosis, Agricola, Agris, CAB Abstracts
- partitioned by organisms and by functions
Model Organisms
- Gene Descriptions (FlyBase, WormBase, MGI, SGD, TAIR)

Special Sources for BeeSpace:
- Natural History Books (Cornell Library, Harvard Press)

XSpace Information Sources

Organize Genome Databases (XBase)
Compute Gene Descriptions from Model Organisms
Partition Scientific Literature for Organism X
Compute XSpace using Semantic Indexing Technology

Boost the Functional Analysis from Special Sources
- Collecting Useful Data about Natural Histories
- e.g. CowSpace Leverage in AIPL Databases

Beyond BeeSpace

The Analysis Environment technology is GENERAL!

BirdSpace? BehaviorSpace? BrainSpace? SoySpace? CowSpace? IGBSpace?
BioSpace

Internet will evolve into Interspace…
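The partition step that recurs throughout the talk (community repositories keyed by core term, literature partitioned by organism or function) can be sketched as a simple grouping. This is only an illustration under stated assumptions: the function names, the abstract identifiers and the toy terms are hypothetical, and the size threshold mirrors the "repositories with > 1K abstracts" idea at a toy scale.

```python
from collections import defaultdict


def partition_into_communities(abstracts):
    """Group abstract IDs by each classification term assigned to them.

    `abstracts` maps an abstract ID to its list of core terms. An abstract
    with several core terms lands in several communities, which matches
    "community is all abstracts classified by core term".
    """
    communities = defaultdict(list)
    for abstract_id, core_terms in abstracts.items():
        for term in core_terms:
            communities[term].append(abstract_id)
    return dict(communities)


def large_communities(communities, min_size):
    """Keep only communities holding more than `min_size` abstracts."""
    return {
        term: ids for term, ids in communities.items() if len(ids) > min_size
    }


# Hypothetical toy data: abstract ID -> core terms.
toy_abstracts = {
    "a1": ["rheumatoid arthritis", "analgesics"],
    "a2": ["analgesics"],
    "a3": ["gastrointestinal bleeding", "analgesics"],
}
toy_communities = partition_into_communities(toy_abstracts)
print(sorted(toy_communities))
print(large_communities(toy_communities, min_size=1))
```

Because one abstract can carry several terms, the total size of all communities exceeds the number of source abstracts, which is consistent with 10M abstracts at 4 core terms each yielding 40M community entries.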