Professional Development Course 1 – Molecular Medicine
Gene/Protein Knowledge Bases
June 14, 2012

Ansuman Chattopadhyay, PhD
Head, Molecular Biology Information Services
Health Sciences Library System
University of Pittsburgh
[email protected]
http://www.hsls.pitt.edu/guides/genetics

Objectives
• Gene-centric information gateways
• Protein-centric information hubs

http://www.hsls.pitt.edu/molbio

Topics
Gold Standards:
• NCBI Gene
• EBI UniProt
Noteworthy Databases:
• UCSC Gene Detail Page
• GeneCards
Commercial – HSLS Licensed Knowledge Bases:
• NextBio
• BioBase Knowledge Library
• Ingenuity IPA
• Metacore

Gene/Protein Information
• Chromosomal location, mRNA, genomic seq, orthologs, paralogs, regulatory elements
• Amino acid seq, domain architecture, protein structure, post-translational modifications
• Gene expression, biological pathways, protein interaction map, disease association, biomarkers

Common gene questions
• What diseases are associated with it?
• What is its function?
• Which tissues express it?
• What are its neighboring genes?
• What is its genomic seq?
• How many splice variants are there?
• What is its intron-exon architecture?
• How can I get its cDNA clone?

Common protein questions
• What is its function?
• Amino acid sequence? … molecular wt? … isoelectric point (pI)? … post-translational modifications? … presence of domain/pattern/profile? … hydrophobicity? … homologous orthologs? Etc.
• Structure? … secondary and tertiary?
• Interaction partner?

Bioinformatics Databases & Software Providers
National Center for Biotechnology Information (NCBI)
- Home page
- Site map
- Resource Guide
European Bioinformatics Institute (EBI)
- Home page
- Databases
- Software

Entrez Gene
Each record represents a single gene from a given organism.
Statistics
- Gene: 7974 organisms
- GenBank: 160,000 organisms

NCBI: Entrez Gene
[Figure: Entrez Gene record overview, linking chromosomal localization, genomic sequence, mRNA sequence, amino acid sequence, homologous sequences, expression profile, disease, 3D structure, SNP, and interacting partners]

Entrez Gene
Find:
- gene symbols and aliases
- sequences: genomic, mRNA, protein
- intron-exon architecture
- genomic context: neighboring and antisense genes
- interacting partners
- associated gene ontology terms: function, cellular component and biological process

Sequence Information
Find sequence information for a gene
- genomic
- mRNA
- promoter
- protein
- intron-exon coordinates
Resources
NCBI Entrez Gene: http://www.ncbi.nlm.nih.gov/gene
Link to the video tutorial:
http://media.hsls.pitt.edu/media/clres2705/sequence.swf
http://media.hsls.pitt.edu/media/clres2705/sequence_2.swf

NCBI Sequence Databases
- GenBank: archival database of nucleotide sequences from >160,000 organisms
- GenPept: conceptual translation of GenBank CDS
- RefSeq: based on GenBank records; non-redundant, expert-verified database of reference sequences

International Nucleotide Sequence Database Collaboration

Primary vs. Derivative databases

RefSeq Scope & Accessions
Genomic DNA
- NC_123456: complete genome, complete chromosome, complete plasmid
- NG_123456: genomic region
- NT_123456: genomic contig
mRNA
- NM_123456
Protein
- NP_123456
more about RefSeq scope and accessions...
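
The RefSeq accession prefixes listed above can be turned into a small lookup. The following is a minimal sketch, not an NCBI tool: the function name `classify_refseq` and the table `REFSEQ_PREFIXES` are hypothetical, and the table covers only the five prefixes named on the slide (RefSeq defines more). The optional `.version` suffix is handled on the assumption that accessions may be given with or without it.

```python
import re

# Prefix meanings copied from the "RefSeq Scope & Accessions" slide.
# Only the prefixes shown there are included; other RefSeq prefixes exist.
REFSEQ_PREFIXES = {
    "NC": "genomic DNA: complete genome, complete chromosome, or complete plasmid",
    "NG": "genomic DNA: genomic region",
    "NT": "genomic DNA: genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}

# Two-letter prefix, underscore, digits, optional ".version" suffix.
_ACCESSION = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def classify_refseq(accession):
    """Return the molecule type for an accession, or None if it is not recognized."""
    match = _ACCESSION.match(accession.strip().upper())
    if match is None:
        return None
    return REFSEQ_PREFIXES.get(match.group(1))


if __name__ == "__main__":
    # Placeholder accessions in the same style as the slide.
    for acc in ("NC_123456", "NM_123456", "NP_123456.2", "not-an-accession"):
        print(acc, "->", classify_refseq(acc))
```

A lookup like this is handy when a results table mixes genomic, transcript, and protein identifiers and each needs to be routed to the matching database.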

UniProt: Protein Knowledge Base (UniProtKB)
Universal Protein Resource: a comprehensive, centralized protein information resource
Developed by a consortium:
- European Bioinformatics Institute (EBI)
- Swiss Institute of Bioinformatics (SIB)
- Protein Information Resource (PIR)
Comprises:
-- Swiss-Prot: biologist-curated annotation data
-- TrEMBL: computationally annotated data
-- PIR-International Protein Sequence Database (PIR-PSD)
Funded by: NIH, NSF, the European Union and the Swiss Federal government
Tutorial video: http://www.youtube.com/watch?v=TCF3qWn7siI&feature=youtube_gdata

PROTEIN sequences, domains, post-translational modifications & structures
Start with a protein sequence and find the following:
- domains
- post-translational modifications
- secondary structures
- calculated molecular wt and isoelectric point
- hydrophobicity plot
- peptide digestion
Resources
UniProt: http://www.uniprot.org/
Link to the video tutorial:
http://media.hsls.pitt.edu/media/clres2705/uniprot.swf
http://media.hsls.pitt.edu/media/clres2705/uniprot2.swf

Protein Domains
Wikipedia: A protein domain is a part of protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain. Each domain forms a compact three-dimensional structure and often can be independently stable and folded. Many proteins consist of several structural domains. One domain may appear in a variety of evolutionarily related proteins. Domains vary in length from between about 25 amino acids up to 500 amino acids in length. The shortest domains such as zinc fingers are stabilized by metal ions or disulfide bridges. Domains often form functional units, such as the calcium-binding EF hand domain of calmodulin.
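
Two of the sequence-derived properties listed on the "PROTEIN sequences" slide, the hydrophobicity plot and peptide digestion, can be approximated directly from an amino acid sequence. The sketch below is an illustration only and is not how UniProt or its linked tools compute these values: it assumes the Kyte-Doolittle hydropathy scale with a sliding-window mean, and a simplified trypsin rule (cleave after K or R unless the next residue is P). The function names and the example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def tryptic_peptides(seq):
    """Cleave after K or R, except when the following residue is P."""
    seq = seq.upper()
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Trailing peptide that does not end in K or R.
        peptides.append(seq[start:])
    return peptides


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # made-up sequence, for illustration only
    print(tryptic_peptides(example))
    print(hydropathy_profile(example, window=9))
```

Plotting the values returned by `hydropathy_profile` against window position gives the kind of hydrophobicity plot the slide refers to; sustained positive stretches suggest hydrophobic regions.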

Protein Domain: SH3
Src homology 3 domains; SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, in changing the subcellular localization of signal pathway components, and in mediating multiprotein complex assemblies.

Homologous Sequences: HomoloGene
What are its homologous genes?
Entrez Gene page: Link, HomoloGene, change Display settings

Find homologous sequence information of a gene
- What % identity does a human gene share with its mouse homologue?
Resources
NCBI Entrez Gene: http://www.ncbi.nlm.nih.gov/gene
Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/homologene.swf

NCBI Entrez Gene

Published Probe Sequences
Retrieve probe sequences published in literature for a gene
• gene silencing (siRNA)
• real-time PCR
• genotyping
Resources
NCBI Probe Database: http://www.ncbi.nlm.nih.gov/probe
Ingenuity IPA:
Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/probe.swf

Functions
Gene Ontology (GO): http://www.geneontology.org/

Levels of abstraction: Gene Ontology (GO)
Khatri, P. et al. Bioinformatics 2005 21:3587-3595; doi:10.1093/bioinformatics/bti565
Copyright restrictions may apply.

General gene / protein information
Functions, mutations, disease associations, biomarkers & drug interactions
HSLS Licensed Databases
- Metacore from GeneGO: portal.genego.com
- BKL from Biobase: http://goo.gl/9wpwG
- NextBio: http://goo.gl/bpuUC

Metacore: Search and Browse
- retrieve information for your gene of interest
- find drugs inhibiting kinases involved in apoptosis
- find common genes for breast cancer and colorectal cancer
- find common target for erlotinib and gefitinib
Resource
Metacore: http://portal.genego.com/
Link to the video tutorials:
http://media.hsls.pitt.edu/media/molbiovideos/metacore1.swf
http://media.hsls.pitt.edu/media/molbiovideos/metacore2.swf

Gene Expression Information
Retrieve gene expression information
- What is the expression level of EGFR in human liver tissue, in the HeLa cell line, or in colon cancer?
Resources
EBI Gene Expression Atlas: http://www.ebi.ac.uk/gxa/
BioGPS: http://biogps.gnf.org/#goto=welcome
GeneCards: http://www.genecards.org
Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/expression.swf

GeneCards
http://www.genecards.org/

Protein Interactions & Biological Pathways
Signaling Pathway Map

Biological Pathways & PPI Databases
BioGrid: http://thebiogrid.org/
STRING: http://string-db.org/

Retrieve interacting partners of a protein of interest
- What proteins interact with human EGFR?
Resources
BioGrid: http://thebiogrid.org/
STRING: http://string-db.org/
Link to the video tutorial: http://media.hsls.pitt.edu/media/molbiovideos/ppi.swf

BioGrid

BioGrid Search Result Page

Thank you! Any questions?
Carrie Iwema [email protected] 412-383-6887
Ansuman Chattopadhyay [email protected] 412-648-1297