From Functional Genomics to Physiological Model: Using the Gene Ontology

Fiona McCarthy, Shane Burgess, Susan Bridges
The AgBase Databases, Institute of Digital Biology, Mississippi State University

From Functional Genomics to Physiological Model
1. A user’s guide to the Gene Ontology (GO)
2. Finding GO for farm animal species
3. Adding GO to your dataset
4. GO based tools for biological modeling
5. Examples: using GO for biological modeling
• Presentation available at AgBase
• Websites available as handout

1. A User’s Guide to GO

What is the Gene Ontology?
Emily Dimmer, GOA EBI: “a controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing”
• assigns functions to gene products at different levels, depending on how much is known about a gene product
• is used for a diverse range of species
• is structured to be queried at different levels, e.g.: find all the chicken gene products in the genome that are involved in signal transduction, then zoom in on all the receptor tyrosine kinases
• each human-readable GO function has a digital tag to allow computational analysis of large datasets

GO Mapping Example
NDUFAB1 (UniProt P52505): Bovine NADH dehydrogenase (ubiquinone) 1, alpha/beta subcomplex, 1, 8kDa
Each annotation consists of a GO:ID (unique), a GO term name, an aspect or ontology, and a GO evidence code.

Biological Process (BP or P)
GO:0006633   fatty acid biosynthetic process   TAS
GO:0006120   mitochondrial electron transport, NADH to ubiquinone   TAS
GO:0008610   lipid biosynthetic process   IEA

Molecular Function (MF or F)
GO:0005504   fatty acid binding   IDA
GO:0008137   NADH dehydrogenase (ubiquinone) activity   TAS
GO:0016491   oxidoreductase activity   TAS
GO:0000036   acyl carrier activity   IEA

Cellular Component (CC or C)
GO:0005759   mitochondrial matrix   IDA
GO:0005747   mitochondrial respiratory chain complex I   IDA
GO:0005739   mitochondrion   IEA

GO EVIDENCE CODES

Direct Evidence Codes
IDA - inferred from direct assay
IEP - inferred from expression pattern
IGI - inferred from genetic interaction
IMP - inferred from mutant phenotype
IPI - inferred from physical interaction

Indirect Evidence Codes
inferred from literature
IGC - inferred from genomic context
TAS - traceable author statement
NAS - non-traceable author statement
IC - inferred by curator
inferred by computational analysis
RCA - inferred from reviewed computational analysis
ISS - inferred from sequence or structural similarity
IEA - inferred from electronic annotation

Other
NR - not recorded (historical)
ND - no biological data available

Unknown Function vs No GO
• ND (no data): biocurators have tried to add GO but there is no functional data available
  Previously: “process_unknown”, “function_unknown”, “component_unknown”
  Now: “biological process”, “molecular function”, “cellular component”
• No annotations (including no “ND”): biocurators have not annotated

2. Finding GO for Farm Animals

GO Browsers
QuickGO Browser (EBI GOA Project) http://www.ebi.ac.uk/ego/
• Can search by GO Term or by UniProt ID
• Includes IEA annotations
AmiGO Browser (GO Consortium Project) http://amigo.geneontology.org/cgi-bin/amigo/go.cgi
• Can search by GO Term or by UniProt ID
• Does not include IEA annotations

Getting GO
http://www.ebi.ac.uk/GOA/downloads.html (includes farm animals)
http://www.geneontology.org/GO.current.annotations.shtml#filter
http://www.agbase.msstate.edu/

3. Adding GO to your dataset

GO analysis of array data
• Probe data is linked to gene product data (gene, cDNA, EST IDs)
• For some arrays, gene product data has corresponding GO data available from the vendor (updated?)
• Not all gene products will have GO annotation; those without will not be included in modeling
• Need to get the maximum amount of GO data to do biological modeling
Example: Netaffx
Secondary source of GO annotation

GORetriever + many more
GORetriever Results: save as text file for GOSlimViewer
But what about IDs not supported by GORetriever?

GOanna
GOanna Results
• query IDs are hyperlinked to BLAST data (files must be in the same directory)
• If there is a good alignment* to a protein with GO: transfer GO to your record
• If there is not a good alignment or the record doesn’t have GO: literature
*WHAT IS A GOOD ALIGNMENT?
• good alignment: add to GO summary file (tab-delimited text file containing ID, GO:ID, aspect)
Contact AgBase to request GO annotation of specific gene products.
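The GO summary file described above (tab-delimited: ID, GO:ID, aspect) is simple enough to check with a few lines of Python before it goes to a downstream tool. The following is a minimal sketch, not an AgBase tool: the function names, the single-letter aspect codes P/F/C, and the placeholder ID are assumptions made for illustration. It reads a summary file and lists the query IDs that still have no GO.

```python
import csv
import io
from collections import defaultdict

# Assumed single-letter aspect codes: Process, Function, Component.
ASPECTS = {"P", "F", "C"}


def read_go_summary(handle):
    """Map each ID to its set of (GO:ID, aspect) pairs from a tab-delimited file."""
    annotations = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        gene_id, go_id, aspect = (field.strip() for field in row[:3])
        if go_id.startswith("GO:") and aspect in ASPECTS:
            annotations[gene_id].add((go_id, aspect))
    return annotations


def ids_without_go(query_ids, annotations):
    """Return the query IDs that have no GO annotation at all."""
    return [qid for qid in query_ids if qid not in annotations]


# Three of the NDUFAB1 (P52505) annotations from the mapping example.
summary_text = (
    "P52505\tGO:0006633\tP\n"
    "P52505\tGO:0008137\tF\n"
    "P52505\tGO:0005759\tC\n"
)
annotations = read_go_summary(io.StringIO(summary_text))
# "HYPOTHETICAL_ID_1" is a made-up ID standing in for an unannotated record.
missing = ids_without_go(["P52505", "HYPOTHETICAL_ID_1"], annotations)
print(len(annotations["P52505"]), missing)
```

Listing the IDs without GO makes explicit which gene products would drop out of any later modeling step.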
GOSlimViewer: summarizing results

GOSlimViewer results
[Chart categories: response to stimulus; amino acid and derivative metabolic process; transport; behavior; cell differentiation; metabolic process; regulation of biological process; cell communication; nucleobase, nucleoside, nucleotide and nucleic acid metabolic process; cell death; cell motility; macromolecule metabolic process; multicellular organismal development; catabolic process; biological_process]
?? “process unknown”, “function unknown”, “component unknown”

Looking at function, not genes
Pie Graphs: relative proportions
[Figure labels: B-cells; Stroma; apoptosis; immune response; cell-cell signaling]

GOModeler: quantitative, hypothesis-driven modeling. Coming soon (contact AgBase)

McCarthy et al. “AgBase: a functional genomics resource for agriculture.” BMC Genomics. 2006 Sep 8;7:229.

4. GO based tools for biological modeling

http://www.geneontology.org/
However…
• many of these tools do not support farm animal species
• the tools have different computing requirements
• it may be difficult to determine how up-to-date the GO annotations are
Need to evaluate tools for your system.

Evaluating GO tools
Some criteria for evaluating GO Tools:
1. Does it include my species of interest (or do I have to “humanize” my list)?
2. What does it require to set up (computer usage/online)?
3. What was the source for the GO (primary or secondary) and when was it last updated?
4. Does it report the GO evidence codes (and is IEA included)?
5. Does it report which of my gene products has no GO?
6. Does it report both over/under represented GO groups and how does it evaluate this?
7. Does it allow me to add my own GO annotations?
8. Does it represent my results in a way that facilitates discovery?

5. Using GO for biological modeling

Using GO for biological modeling:
• hypothesis generating
• hypothesis driven
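Criterion 6 above asks how a tool evaluates over- and under-represented GO groups. The slides do not name a statistical test; one common choice, assumed here purely for illustration, is the hypergeometric test. The sketch below uses only the Python standard library (Python 3.8+ for `math.comb`), and the function names and numbers are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(exactly k annotated genes in a list of n), given K annotated among N."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def representation_p_values(k, N, K, n):
    """Return (p_over, p_under) for observing k annotated genes.

    N: gene products in the background (e.g. on the array) that have GO
    K: background gene products annotated to the GO group of interest
    n: gene products in the study list (e.g. differentially expressed)
    k: study-list gene products annotated to the GO group
    """
    lowest = max(0, n - (N - K))  # fewest annotated genes the list can hold
    highest = min(n, K)           # most annotated genes the list can hold
    p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))
    p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
    return p_over, p_under


# Made-up numbers: 1000 background genes, 50 in the GO group,
# 40 genes in the study list, 8 of them in the GO group.
p_over, p_under = representation_p_values(k=8, N=1000, K=50, n=40)
print(f"over-representation p = {p_over:.3g}, under-representation p = {p_under:.3g}")
```

When many GO groups are tested at once, the resulting p-values would normally need a multiple-testing correction, which is another point to check when comparing tools.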