Beyond PubMed and BLAST: Exploring NCBI tools and databases
Kate Bronstad
David Flynn
Alumni Medical Library

Alumni Medical Library
• Location
  − 12th Floor Instructional Bldg
  − www.medlib.bu.edu
• Services
  − Electronic resources: full text access through PubMed, Google Scholar, Web of Science
  − Reference: drop in or by reservation
  − Instruction: request class sessions or creation of web tutorial
  − Learning resource center: lab space, hands-on instruction

NCBI
• National Center for Biotechnology Information
• Built on Entrez System
• Original database was Nucleotide
• PubMed built upon this original structure.
• PubMed, GENE, other molecular databases interconnected
• Gene discovery, related data options in PubMed
• MyNCBI works with multiple databases

GENE
• Gives sequence, expression, information about protein structure and function.
• Doesn't list all known and predicted genes
• Focuses on completely sequenced genomes or ones where research communities are actively contributing genetic information.
• Information from RefSeq and collaborating model organism databases.
• Mix of curated and automatically updated information.
• Pulls in, links out to resources outside of NCBI.
• 4.6 million records for 5,588 taxa

GENE Record
• Summary: official full name, gene type, lineage, summary, AKA
• Genomic regions, transcripts: structure, exon-intron boundaries.
  − Gene table for fuller display.
• Bibliography: GeneRIF.
  − Summary of gene functions with specific references to related articles about function of gene/proteins in PubMed. Put together by people at NCBI.
  − Not comprehensive, but will give you the most relevant papers regarding function.
  − Authors can contact the NCBI to submit their citations

RefSeq
• Reference Sequences
  − Nucleotide sequences and protein translation
  − Curated by NCBI or NCBI-approved programs.
• Difference between GenBank and RefSeq
  − GenBank has raw data and duplicated records
  − Metadata in GenBank can be incomplete
  − RefSeq is annotated, curated and non-redundant.
  − NCBI takes best sequences from GenBank and curates for RefSeq records

RefSeq Record Numbers
mRNAs and Proteins
  NM_123456    Curated mRNA
  NP_123456    Curated protein
  NR_123456    Curated non-coding RNA
  XM_123456    Predicted mRNA
  XP_123456    Predicted protein
  XR_123456    Predicted non-coding RNA
Gene Records
  NG_123456    Reference genomic sequence
Chromosome
  NC_123455    Microbial replicons, organelle genomes, human chromosomes
  AC_123455    Alternate assemblies
Assemblies
  NT_123456    Contig
  NW_123456    WGS supercontig

OMIM
• Online Mendelian Inheritance in Man
• Previously in print, 10 volumes, updated every 2 years.
• Contains all the known genes in humans.
• Gives referenced explanations of cloning, allelic variations, inheritance, mapping, molecular genetics
• Links to clinical and testing information
• OMIA (Online Mendelian Inheritance in Animals) is a separate database for information in animals.

Databases for Evidence
• GEO Profiles: public microarray data repository
  − Archives and freely distributes microarray, next-generation sequencing, and other high-throughput functional genomic data.
  − Submitted by researchers. Offers data storage, web-based interfaces and applications to query and download content
• Evidence Viewer: graphical display of evidence supporting a gene model

Genome
• Sequence and map data from the whole genomes of over 1000 organisms
  − Represents organisms that are completely sequenced and those that are in progress.
• Graphical overviews of complete genomes/chromosomes
• Specialized genome BLAST search to see alignments in context of genome
• Good for microbial genomes.

HomoloGene
• May want to use instead of BLAST if looking for a model organism with same function or if looking at an evolutionary comparison.
• Allows downloads of genomic information.
  − Can capture regulatory region by including bases upstream or downstream.
• Multiple and pairwise alignment
• Protein alignment scores
  − Substitution rates, synonymous vs. nonsynonymous, conservative vs. radical
• Polymorphisms in GeneView dbSNP link

Structure and Models
• Structure, MMDB (Molecular Modeling Database)
  − Access from Protein link, Related Structure
• Cn3D application to view at different angles, highlight sequence in structure.
• VAST (Vector Alignment Search Tool) searches by geometric criteria

BLink
• BLAST Link
  − Pre-run BLAST results
  − NCBI runs weekly searches for every new protein sequence.
• Can use instead of running BLAST search
  − More information than in default BLAST: taxonomy report, view multiple alignments, search data against different

Links to Outside Databases
• MGI
• Ensembl
• KEGG: Kyoto Encyclopedia of Genes and Genomes
  − Integrated databases
  − Pathway, disease, drug
  − Good for quick pathway and protein graphics
• UCSC Genome Browser
  − Visualize tracks to compare information like gene predictions, ESTs, conserved regions.
  − BLAT (BLAST-Like Alignment Tool): quicker but not as sensitive as BLAST.

Gene Information from GO
• Gene expression information from Gene Ontology (GO)
  − Lists what has been assigned to the gene in: Molecular Function, Biological Processes, Cellular Component
• Level of evidence and references linked when available.
• Links into AmiGO browser for more ontology or evidence information
• Can search GENE for GO information by placing suffix at end of search
  Ex: “vasodilation [GO]”

BU Resources
• Biostatistics
  − Dr. Mayetri Gupta: created statistical software for discovering transcription factor binding sites (motifs) and regulatory modules, gene regulatory networks, and phylogenetic inference.
  − Dr. Paola Sebastiani: created software for network modeling called Bayesware Discoverer, also CAGED, BAGED for analysis of gene expression data.
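
The `[GO]` search suffix from the Gene Information from GO slide can also be used outside the web interface. The following is a minimal sketch, assuming NCBI's E-utilities ESearch endpoint and the `gene` database name; it only builds the query URL and does not contact NCBI. The helper name `build_gene_go_search_url` and the `retmax` default are illustrative choices, not part of the presentation.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service (assumed here;
# check the current E-utilities documentation before relying on it).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_go_search_url(go_term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that queries GENE for a Gene Ontology term.

    go_term is the plain term (e.g. "vasodilation"); the [GO] field
    suffix from the slide is appended automatically.
    """
    params = {
        "db": "gene",                      # search the GENE database
        "term": f"{go_term.strip()}[GO]",  # restrict the term to the GO field
        "retmax": retmax,                  # hypothetical default result cap
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_gene_go_search_url("vasodilation"))
```

Opening the printed URL in a browser should return the matching GENE record IDs as XML; the same search can be typed directly into the GENE search box as shown on the slide.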

Library Support
• Contact the library with any suggestions or recommendations that we can list or promote for the BU community
• Software and datasets can be archived in BU’s Digital Common
• If there are resources we don’t have, we may be able to procure them for you.
• Hands-on BLAST workshop offered.
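
As a supplement to the RefSeq Record Numbers slide, the accession prefixes can be expressed as a simple lookup. This is a minimal sketch that covers only the prefixes listed on that slide; the function name `describe_refseq_accession` is hypothetical, and RefSeq defines further prefixes that are not handled here.

```python
# Prefix meanings exactly as listed on the "RefSeq Record Numbers" slide.
REFSEQ_PREFIXES = {
    "NM": "Curated mRNA",
    "NP": "Curated protein",
    "NR": "Curated non-coding RNA",
    "XM": "Predicted mRNA",
    "XP": "Predicted protein",
    "XR": "Predicted non-coding RNA",
    "NG": "Reference genomic sequence",
    "NC": "Microbial replicons, organelle genomes, human chromosomes",
    "AC": "Alternate assemblies",
    "NT": "Contig",
    "NW": "WGS supercontig",
}


def describe_refseq_accession(accession: str) -> str:
    """Return the record type for a RefSeq accession such as 'NM_123456'."""
    # RefSeq accessions separate the prefix from the number with an underscore.
    prefix, sep, number = accession.strip().partition("_")
    if not sep or not number:
        raise ValueError(f"not a RefSeq-style accession: {accession!r}")
    try:
        return REFSEQ_PREFIXES[prefix.upper()]
    except KeyError:
        raise ValueError(f"prefix not in the slide's table: {prefix!r}") from None


if __name__ == "__main__":
    for acc in ("NM_123456", "XP_123456", "NW_123456"):
        print(acc, "->", describe_refseq_accession(acc))
```

An accession written with a hyphen instead of an underscore is rejected with a `ValueError`, which is a quick way to catch mistyped identifiers before searching.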