Selection of Resources for the Development of an Information Service Program in Molecular Biology and Genetics

Ansuman Chattopadhyay, PhD
Information Specialist in Molecular Biology and Genetics
Health Sciences Library System, University of Pittsburgh

Topics
• Multi-Step Life Sciences Research
• Literature Retrieval
• Sequence Analysis
• Laboratory Resources
• University of Pittsburgh HSLS Molecular Biology Information Service Program

Life Sciences Research: A Multi-Step Process
• Hypothesis Generation
• Knowledge Mining
• Sequence Analysis
• Laboratory Bench Work
Mol Biol Information Service

Literature Retrieval Resources
• PubMed
• CellSpace Knowledge Miner
• PubGene
• Genomatix BiblioSphere

Too much information
83,130    31,596

Literature Retrieval Resources
• CellSpace: www.cellomics.cellspace.com
• PubGene: http://www.pubgene.org
• Genomatix BiblioSphere: http://www.genomatix.de/

What is CellSpace?
www.cellomics.cellspace.com
CellSpace is a bioinformatics tool: a knowledge mining system that automatically detects, analyzes, and reports the logical relationships between four types of terms found in the research literature:
1. molecule: proteins, genes, drugs
2. function: biological processes and disease states
3. cell type
4. organism

CellSpace Knowledge Miner
What is CellSpace?
Literature Association
Molecules:        + + + +
Functions:        + + + +
Cells & Systems:  + + _ _
Organisms:        + + _ _

Cells & Systems: Cells, Sub-cellular Components, Tissues and Organs
Molecules: Molecules, Drugs, Genes, Proteins
Functions: Biological Functions, Disease States

What you can do with CellSpace?
• Start with a single protein (or other molecule) and find its functions, the diseases in which it is implicated, and related molecules.
• Start with a disease or biological function and find related molecules, or related functions.
• Start with two or more functions, and find the related molecules that they have in common.

What you can do with CellSpace?
• Start with results from a high-throughput experiment (such as a cluster of co-regulated genes from microarray analysis), and easily find the functions that they share.
• Start with the results of proteomics experiments, and quickly screen the data to distinguish published interactions from novel ones.
• View the literature that supports the connections found in CellSpace.

CellSpace Knowledge Miner
Start with a disease or biological function and find related molecules, or related functions
• Find molecules related to apoptosis
[Screenshot with numbered steps 1 to 5. Callouts: Drag and drop; Click to select; Find molecules associated with “apoptosis”; Get references]
Results are presented with a statistical likelihood value.

CellSpace Knowledge Miner
How Does CellSpace Work?
CellSpace computers analyze the National Library of Medicine's MEDLINE database, performing proprietary statistical correlation analyses of the organisms, cell types, biological processes, and molecules reported in 655 selected life science research journals. The molecular relationships extracted from the literature are then stored in the CellSpace database, which can be queried via the CellSpace user interface. The information is updated every two weeks.

PubGene
• The Network Browser tool displays literature association networks for a gene.
• The Set Cover Article Search tool lets you search the literature using a set covering algorithm, which is particularly useful when searching for literature references for large sets of terms.
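Set covering, sketched
The slides do not say how PubGene implements its Set Cover Article Search, so the following is only a minimal sketch of the standard greedy approximation to set cover, not PubGene's method. The function name, the article identifiers, and the term sets are hypothetical and chosen for illustration. The idea: given many query terms, repeatedly pick the article that mentions the most terms not yet covered, until every term is covered or no article helps.

```python
def greedy_set_cover(terms, articles):
    """Pick a small list of articles that together mention all query terms.

    terms    -- iterable of query terms (e.g. gene names)
    articles -- dict mapping a (hypothetical) article id to the set of
                terms that article mentions
    Returns (chosen_ids, uncovered_terms).
    """
    uncovered = set(terms)
    chosen = []
    while uncovered:
        best_id, best_hits = None, set()
        # Iterate in sorted order so that ties are broken deterministically.
        for article_id in sorted(articles):
            hits = articles[article_id] & uncovered
            if len(hits) > len(best_hits):
                best_id, best_hits = article_id, hits
        if best_id is None:
            # No remaining article mentions any uncovered term.
            break
        chosen.append(best_id)
        uncovered -= best_hits
    return chosen, uncovered


# Illustrative data only: made-up article ids and term sets.
example_terms = {"BRCA1", "TP53", "ATM", "CHEK2"}
example_articles = {
    "a1": {"BRCA1", "TP53"},
    "a2": {"TP53"},
    "a3": {"ATM", "CHEK2", "BRCA1"},
    "a4": {"CHEK2"},
}
chosen, uncovered = greedy_set_cover(example_terms, example_articles)
print(chosen, uncovered)
```

The greedy choice does not guarantee the smallest possible set of articles, but it is simple and scales to large term lists, which is the situation the slide describes.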
PubGene
The query gene is shown in bright red font in the graph, its direct neighbors are shown in darker red font, and neighbors of neighbors are shown in black font.

BiblioSphere

Resources comparison
Resource      | Availability                 | Coverage                                           | Update frequency
CellSpace     | Commercial; 2-week free trial | 655 Medline journals                               | Every 2 weeks
PubGene       | V2.1 free; V2.3 commercial   | All Medline journals; SP: H,M,R                    | V2.1: once a year; V2.3: every 2 weeks
BiblioSphere  | 20 uses/month free           | All Medline journals, abstract only; SP: H,M,R     | Continuous

Resources comparison: Search Terms
• CellSpace: Mol: gene, protein, drugs; Func: biological function, disease state, cell and tissue type
• PubGene: gene name
• BiblioSphere: gene name

Information Hubs
Information hubs are molecular biology and genetics resources that serve as access points for retrieving a broad range of information through a small number of selected web-based public databases.

Information Hubs
• UCSC Genome Bioinformatics Resources: Gene's detail page, Genome Browser, Family Browser, Proteome Browser
• SwissProt
• LocusLink / Entrez Gene
• GeneCards
• GeneLynx
• Incyte Proteome BioKnowledge Library
• Human Protein Reference Database
• Organism Genome Consortium sites

Information Hubs
[Diagram linking information types to resources]
Information types: Gene Expression Data; RNA Structure; Other Species; Sequence (Genomic, mRNA, Protein); Protein Structure; GO Annotations; Molecular function; Bio pathways; Cellular component
Resources: UCSC Family browser, LocusLink, SwissProt, OMIM, GeneCards, CGAP, UCSC Gene's Detail Page, GeneLynx, PubMed, AceView, UCSC genome browser, Mouse Genome Informatics, UCSC Proteome browser

Information Hubs
UCSC Gene's Detail Page: http://genome.ucsc.edu/cgi-bin/hgGene?hgsid=31408663&db=hg16&hgg_gene=U14680&hgg_chrom=chr17&hgg_start=41570859&hgg_end=41650551
LocusLink: http://www.ncbi.nlm.nih.gov/LocusLink/LocRpt.cgi?l=672
GeneCards: http://bioinfo.weizmann.ac.il/cards-bin/carddisp?BRCA1&search=BRCA1&suff=txt
HPRD: http://www.hprd.org/protein/00218

Information Hubs
Proteome BioKnowledge Library
• Sequence
• Expression in Organ/Tissue, Cell Type, Tumor Type
• Protein Interactions
• Literature Excerpts
• Disease
• Gene Ontology terms
• Gene Regulation
• Protein Modifications

Resources Comparison
Resource           | Availability | Type    | SP Coverage                      | Noteworthy Features
UCSC               | Free         | -       | H,M,R, etc                       | Expression, Proteome/Family Browser
SwissProt/UniProt  | Free         | -       | ALL                              | Protein information
LocusLink          | Free         | -       | H,M,R,N,P etc                    | Link to NCBI resources
GeneCards          | Free         | -       | H                                | Expression
GeneLynx           | Free         | -       | H,M,R                            | -
Proteome BKL       | Commercial   | Curated | H,M,R,Y,N, Pathogenic Fungi      | Literature excerpts
HPRD               | Free         | Curated | H                                | Protein interaction
Curated Genome Browsers:

Molecular Database Catalog
http://nar.oupjournals.org/
• Nucleic Acids Research Database Issue
[Figure: Growth of molecular databases, 1996 to 2004; y-axis 0 to 600; series: Articles, Databases]

Database Catalog
http://www.infobiogen.fr/services/dbcat/

Sequence Analysis
• Sequence Search
• Sequence Alignment
• MolBiol Tools: restriction mapping, PCR primer design
• Sequence Manipulation

Web Server Catalog
http://nar.oupjournals.org/
• Nucleic Acids Research Database Issue

Sequence Analysis
http://www.bioinformatics.vg/
http://healthlinks.washington.edu/index.cfm?id=210BCCB7-511A-4C6B-8B40-DFC47AABEA7F
http://www.hsls.pitt.edu/guides/genetics

Sequence Analysis
DNAStar LaserGene (PC/Mac)

Sequence Analysis
Vector NTI
Database: DNA/RNA, Protein, Oligo, Enzyme, Gel Marker, Blast Result, Analysis Result
Software: Vector NTI core, AlignX, ContigExpress, GenomBench, BioAnnotator

Sequence Analysis
The Vector NTI Advanced software suite consists of five independent yet interconnected components:
• Vector NTI core: the cornerstone application of the Vector NTI suite, which provides tools for sequence analysis and molecule manipulation
• AlignX: a multiple sequence alignment tool
• ContigExpress: a DNA sequence assembly and sequencing project management tool
• GenomBench: a tool for genomic DNA sequence analysis and annotation
• BioAnnotator: a tool for functional annotation of DNAs and proteins

Sequence Analysis
Using Vector NTI, molecular biologists can:
• Perform routine sequence analysis tasks such as restriction mapping, identifying protein coding regions, finding sequence motifs, and carrying out sequence similarity searches
• Generate recombinant cloning strategies and protocols
• Design and analyze PCR primers
• Catalog a growing number of plasmids and PCR primers, in order to track the origin and lineage of recombinant molecules
• Run in silico gel electrophoresis
• Perform and edit multiple sequence alignments on proteins and nucleic acids
• Create publication-quality graphics, and more

Laboratory Resources
• Protocols
• Useful Laboratory Resources

Laboratory Resources
http://www.interscience.wiley.com/c_p/index.htm
• Basic Protocol
• Alternate Protocol
• Commentary
• Critical Parameters
• Troubleshooting
• Time Considerations
• Key References
• Internet Resources

Laboratory Resources
http://researchlink.labvelocity.com/

HSLS Mol Biol Information Service
http://www.hsls.pitt.edu/guides/genetics

Website Usage Report
http://www.hsls.pitt.edu/guides/genetics

Workshops, May 2003-April 2004
[Figure: bar chart of # Times Offered and # Workshop Attendees for each workshop]
1: Information Hubs
2: Sequence Similarity Searching
3: DNA Protein Analysis Tools
4: CellSpace Knowledge Miner
5: VectorNTI

One-on-one Consultation
[Figure: Number of Consultations by month, 2003 to 2004]
Total: 70

“… only half of biomedical researchers using genome databases are familiar with the tools that can be used to actually access the data.”
“… all scientists on the planet must be empowered to use these powerful databases to unravel longstanding scientific mysteries.”
Andreas D. Baxevanis & Francis S. Collins, Nature Genetics, September 2002, Vol 32