Annotating with GO: an overview
http://www.geneontology.org/
What is a Gene Ontology (GO) annotation?
Databases external to GO make cross-links between GO terms and objects in their databases (typically, gene products, or their surrogates, genes), and then provide tables of these links to GO. The GO itself contains no information about genes or gene products. The GO annotation (‘gene association’) files are all publicly available:
http://www.geneontology.org/#annotations
Abbreviations used by GO, including the database name abbreviations, are described here:
http://www.geneontology.org/doc/GO.xrf_abbs
A gene product is annotated to one or more terms in each of the three ontologies: biological process, cellular component and molecular function.
Gene products are annotated to the most specific GO term possible for the information available.
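The "most specific term" rule can be illustrated with a small sketch. The ontology fragment below is a simplified toy (the parent links and the helper names `ancestors` and `most_specific` are assumptions made for illustration, not the real GO structure or any GO tool): given all terms that the evidence supports, only those that are not an ancestor of another supported term are kept.

```python
# Toy ontology fragment: child term -> set of parent terms.
# Simplified for illustration; the real GO graph has more levels and links.
PARENTS = {
    "acid phosphatase activity": {"phosphatase activity"},
    "phosphatase activity": {"hydrolase activity"},
    "hydrolase activity": {"molecular function"},
    "molecular function": set(),
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def most_specific(supported_terms, parents=PARENTS):
    """Keep only supported terms that are not an ancestor of another one."""
    supported = set(supported_terms)
    implied = set()
    for term in supported:
        implied |= ancestors(term, parents)
    return supported - implied


chosen = most_specific(
    ["hydrolase activity", "phosphatase activity", "acid phosphatase activity"]
)
print(chosen)
```

In this toy case the two broader terms are dropped because they are implied by the more specific one.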
Example annotation:

Field                           Annotation 1             Annotation 2
DB                              SGD                      SGD
DB_Object_ID                    S0000296                 S0000296
DB_Object_Symbol                PHO3                     PHO3
[NOT]                           -                        -
go_id                           GO:0015888               GO:0003993
DB:Reference (|DB:Reference)    SGD:8789|PMID:2676709    SGD:8789|PMID:2676709
Evidence                        IMP                      IMP
With                            -                        -
Aspect                          P                        F
DB_Object_Name (|Name)          -                        -
DB_Object_Synonym (|Synonym)    YBR092C                  YBR092C
DB_Object_Type                  gene                     gene
Taxon (|taxon)                  taxon:4932               taxon:4932
Date                            20001122                 20001122

(- = empty field)
Fields highlighted in grey are mandatory.

DB: Database name abbreviation.
DB_Object_ID: Database Object identifier. A Database Object is usually a gene product, but can also be a gene or a transcript.
[NOT]: Used when it is specified in the source that a gene product is NOT associated with a particular GO term, e.g. “we have found that protein Z is not involved in the X cascade”.
go_id: Gene Ontology term identifier.
Aspect: P = biological process, F = molecular function and C = cellular component.
DB_Object_Type: Object type: gene, transcript or protein.
Taxon: Taxonomic identifier for the gene product.

A gene product is annotated with terms reflecting only its normal activities, locations and processes.
When there is no information regarding one or more aspects of a gene product, the gene product is annotated to the GO term ‘unknown’.
Annotation of a gene product to one ontology is independent of its annotation to the other two ontologies.
The annotation of gene products to GO terms is performed according to two main principles: the recording of the source of the annotation and the type of evidence on which the annotation was based.
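As a rough illustration of how one line of the example annotation above might be read into a record, here is a minimal sketch. It assumes the fourteen fields appear in the order shown in the example and are separated by tabs (the separator is not stated in the text); the key names and the helper `parse_annotation` are hypothetical, and the pairing of GO:0003993 with aspect F is read from the example.

```python
# Field order as in the example annotation; key names are illustrative only.
FIELDS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "references", "evidence", "with", "aspect", "db_object_name",
    "db_object_synonym", "db_object_type", "taxon", "date",
]


def parse_annotation(line):
    """Split one annotation line into a dict keyed by field name.

    Assumes tab-separated columns (an assumption, not stated in the text).
    """
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    record = dict(zip(FIELDS, values))
    # The [NOT] column is empty unless the source reports a negative finding.
    record["is_negated"] = record["qualifier"] == "NOT"
    return record


example_line = "\t".join([
    "SGD", "S0000296", "PHO3", "", "GO:0003993", "SGD:8789|PMID:2676709",
    "IMP", "", "F", "", "YBR092C", "gene", "taxon:4932", "20001122",
])
record = parse_annotation(example_line)
print(record["db_object_symbol"], record["go_id"], record["aspect"])
```

Keeping empty columns as empty strings preserves the column positions, so optional fields such as With do not shift the mandatory ones.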
The source of an annotation may be a literature reference, a database record or the type of computational analysis. Literature references are entered as an accession number, either from the database in question and/or from PubMed. Annotations based on computational analysis include a reference to the method of analysis.
The evidence describes how the annotation was created, and provides a way of measuring its strength or reliability. GO has developed a set of standard evidence codes which form a loose hierarchy, with ‘inferred from electronic annotation’ (IEA) being the least reliable type of evidence, followed by ‘inferred from sequence similarity’ (ISS).
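The reference field in the example annotation, SGD:8789|PMID:2676709, shows both kinds of accession side by side. A minimal sketch of splitting such a field follows, assuming "|" separates alternative references and ":" separates the database abbreviation from the accession, as the example suggests; the helper name `parse_references` is hypothetical.

```python
def parse_references(field):
    """Split a DB:Reference field into (database, accession) pairs."""
    pairs = []
    for item in field.split("|"):
        # partition splits on the first ":" only, leaving the accession intact.
        database, _, accession = item.partition(":")
        pairs.append((database, accession))
    return pairs


refs = parse_references("SGD:8789|PMID:2676709")
pubmed_ids = [accession for database, accession in refs if database == "PMID"]
print(refs, pubmed_ids)
```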
Evidence codes
IC    inferred by curator
IDA   inferred from direct assay
IEA   inferred from electronic annotation
IEP   inferred from expression pattern
IGI   inferred from genetic interaction
IMP   inferred from mutant phenotype
IPI   inferred from physical interaction
ISS   inferred from sequence similarity
NAS   non-traceable author statement
ND    no biological data available
TAS   traceable author statement
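Because the evidence codes indicate how reliable an annotation is, a common use is to look up a code or to set aside the least reliable annotations. The sketch below is one way to do that, assuming each annotation is a dict with an "evidence" key; the helper names `describe` and `drop_codes` are hypothetical, and the choice of which codes to exclude is left to the caller.

```python
# Evidence codes and their meanings, as listed above.
EVIDENCE_CODES = {
    "IC": "inferred by curator",
    "IDA": "inferred from direct assay",
    "IEA": "inferred from electronic annotation",
    "IEP": "inferred from expression pattern",
    "IGI": "inferred from genetic interaction",
    "IMP": "inferred from mutant phenotype",
    "IPI": "inferred from physical interaction",
    "ISS": "inferred from sequence similarity",
    "NAS": "non-traceable author statement",
    "ND": "no biological data available",
    "TAS": "traceable author statement",
}


def describe(code):
    """Return the meaning of an evidence code, rejecting unknown codes."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code!r}")
    return EVIDENCE_CODES[code]


def drop_codes(annotations, excluded=("IEA",)):
    """Return the annotations whose evidence code is not in `excluded`."""
    return [a for a in annotations if a["evidence"] not in excluded]


annotations = [
    {"go_id": "GO:0003993", "evidence": "IMP"},
    {"go_id": "GO:0003993", "evidence": "IEA"},  # made-up row for illustration
]
kept = drop_codes(annotations)
print(describe("IMP"), len(kept))
```

Passing `excluded=("IEA", "ISS")` would also set aside the second least reliable type named in the text.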
Collaborating databases
Many important databases produce GO annotations and contribute to the development of the GO. These include:
FlyBase (database for the fruit fly Drosophila melanogaster)
Berkeley Drosophila Genome Project (Drosophila informatics; GO database & software)
Saccharomyces Genome Database (SGD) (database for the budding yeast Saccharomyces cerevisiae)
Mouse Genome Database (MGD) & Gene Expression Database (GXD) (databases for the mouse Mus musculus)
The Arabidopsis Information Resource (TAIR) (database for the brassica family plant Arabidopsis thaliana)
WormBase (database for the nematode Caenorhabditis elegans)
PomBase (database for the fission yeast Schizosaccharomyces pombe)
Rat Genome Database (RGD) (database for the rat Rattus norvegicus)
DictyBase (informatics resource for the slime mold Dictyostelium discoideum)
The Pathogen Sequencing Unit (The Wellcome Trust Sanger Institute)
Genome Knowledge Base (GKB) (Cold Spring Harbor Laboratory)
EBI: InterPro, SWISS-PROT and TrEMBL groups
The Institute for Genomic Research (TIGR)
Gramene (A Comparative Mapping Resource for Monocots)
Compugen (with its Internet Research Engine)