Download Amsterdam 2004

Formalizations of Function & Literature Databases
Protein function prediction
• What is function?
• Various levels of description
What is function?
• Contextual / philosophical point
• An operational dichotomy I often use: biochemical function vs biological role:
– Biochemical function: enolase (2-phospho-D-glycerate hydro-lyase) catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate
– Biological role: part of the glycolysis pathway
But ...
• α-enolase also functions as a lens structural protein, τ-crystallin, in ducks
• Protein multi-functionality
• Molecular function? Nice crystallization and refractive properties?
Gene Ontology
• Historically: nothing … except Swiss-Prot keywords and specific systems for metabolic enzymes
• This is somewhat problematic for automated gene function prediction (e.g. BLAST and/or co-expression) and for the study of the evolution of gene function.
• Despite everything that we know as written down in the public literature!?
• One (example) solution: Gene Ontology
Gene Ontology
• In computer science, an ontology is a data model that represents a domain and is used to reason about the objects in that domain and the relations between them.
• The three root terms of GO:
– GO:0008150 : biological_process
– GO:0005575 : cellular_component
– GO:0003674 : molecular_function
Gene Ontology: Molecular function
• Molecular function describes activities, such as catalytic or
binding activities, at the molecular level. GO molecular
function terms represent activities rather than the entities
(molecules or complexes) that perform the actions, and do
not specify where or when, or in what context, the action
takes place. Molecular functions generally correspond to
activities that can be performed by individual gene
products, but some activities are performed by assembled
complexes of gene products. Examples of broad
functional terms are catalytic activity, transporter
activity, or binding; examples of narrower functional
terms are adenylate cyclase activity or Toll receptor
binding.
• DNA-directed DNA polymerase activity
• Accession: GO:0003887
• Ontology: molecular_function
• Synonyms: alt_id: GO:0003888
• Definition:
– Catalysis of the reaction: deoxynucleoside triphosphate + DNA(n) = diphosphate + DNA(n+1); the synthesis of DNA from deoxyribonucleotide triphosphates in the presence of a DNA template or primer.
• Comment: None
• Term Lineage
• all : all ( 228266 )
  – GO:0003674 : molecular_function ( 172339 )
    • GO:0003824 : catalytic activity ( 68591 )
      – GO:0016740 : transferase activity ( 22363 )
        » GO:0016772 : transferase activity, transferring phosphorus-containing groups ( 13535 )
          » GO:0016779 : nucleotidyltransferase activity ( 3400 )
            » GO:0003887 : DNA-directed DNA polymerase activity ( 519 )
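The term lineage above can be read as a chain of child-to-parent links. The following is a minimal sketch of how such a chain might be walked in code. The dictionaries are typed in by hand from the lineage shown on the slide, the function names are made up for illustration, and the sketch assumes one parent per term, whereas real GO terms can have several parents.

```python
# Child -> parent links copied from the term lineage shown above.
# Assumption: a single parent per term (real GO is a graph with multiple parents).
PARENT = {
    "GO:0003887": "GO:0016779",
    "GO:0016779": "GO:0016772",
    "GO:0016772": "GO:0016740",
    "GO:0016740": "GO:0003824",
    "GO:0003824": "GO:0003674",
}

NAME = {
    "GO:0003674": "molecular_function",
    "GO:0003824": "catalytic activity",
    "GO:0016740": "transferase activity",
    "GO:0016772": "transferase activity, transferring phosphorus-containing groups",
    "GO:0016779": "nucleotidyltransferase activity",
    "GO:0003887": "DNA-directed DNA polymerase activity",
}


def lineage(term):
    """Return the path from the root term down to `term`."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def format_lineage(term):
    """Render the lineage as an indented outline, one term per line."""
    return "\n".join(
        "  " * depth + f"{go_id} : {NAME[go_id]}"
        for depth, go_id in enumerate(lineage(term))
    )


if __name__ == "__main__":
    print(format_lineage("GO:0003887"))
```

Walking upward like this is also what makes it possible to count a gene annotated with a specific term towards every broader term above it.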
Gene Ontology: Biological Process
• A biological process is a series of events
accomplished by one or more ordered
assemblies of molecular functions. Examples of
broad biological process terms are cellular
physiological process or signal transduction.
Examples of more specific terms are pyrimidine
metabolism or alpha-glucoside transport. It
can be difficult to distinguish between a biological
process and a molecular function, but the general
rule is that a process must have more than one
distinct step.
• DNA replication
• Accession: GO:0006260
• Ontology: biological_process
• Synonyms:
– related: DNA biosynthesis
– related: DNA synthesis
• Definition:
– The process whereby new strands of DNA are synthesized. The template for replication can either be DNA or RNA.
• Comment:
– See also the biological process terms 'DNA-dependent DNA replication ; GO:0006261' and 'RNA-dependent DNA replication ; GO:0006278'.
• Term Lineage
• all : all ( 228266 )
  – GO:0008150 : biological_process ( 166476 )
    • GO:0009987 : cellular process ( 111929 )
      – GO:0050875 : cellular physiological process ( 103960 )
        » GO:0044237 : cellular metabolism ( 71681 )
          » GO:0006139 : nucleobase, nucleoside, nucleotide and nucleic acid metabolism ( 27559 )
            » GO:0006259 : DNA metabolism ( 8807 )
              » GO:0006260 : DNA replication ( 3202 )
Gene Ontology: Cellular Component
• A cellular component is just that, a component of
a cell, but with the proviso that it is part of some
larger object; this may be an anatomical structure
(e.g. rough endoplasmic reticulum or nucleus) or
a gene product group (e.g. ribosome,
proteasome or a protein dimer).
cellular_component
• DNA-directed RNA polymerase II, core complex
• Accession: GO:0005665
• Ontology: cellular_component
• Synonyms: related: DNA-directed RNA polymerase II activity
• Definition:
– RNA polymerase II, one of three eukaryotic nuclear RNA polymerases, is a multisubunit complex; it produces mRNAs, snoRNAs, and some of the snRNAs. Two large subunits comprise the most conserved portion including the catalytic site and share similarity with other eukaryotic and bacterial multisubunit RNA polymerases. The largest subunit of RNA polymerase II contains an essential carboxyl-terminal domain (CTD) composed of a variable number of heptapeptide repeats (YSPTSPS). The remainder of the complex is composed of smaller subunits (generally ten or more), some of which are also found in RNA polymerases I and III. Although the core is competent to mediate ribonucleic acid synthesis, it requires additional factors to select the appropriate template.
GO:0005575 : cellular_component ( 116994 )
  GO:0005623 : cell ( 86438 )
    GO:0044464 : cell part ( 86397 )
      GO:0005622 : intracellular ( 70018 )
        GO:0044424 : intracellular part ( 69369 )
          GO:0043229 : intracellular organelle ( 63194 )
            GO:0043231 : intracellular membrane-bound organelle ( 58868 )
              GO:0005634 : nucleus ( 12609 )
                GO:0044428 : nuclear part ( 5000 )
                  GO:0031981 : nuclear lumen ( 3017 )
                    GO:0005654 : nucleoplasm ( 1990 )
                      GO:0044451 : nucleoplasm part ( 1791 )
                        GO:0016591 : DNA-directed RNA polymerase II, holoenzyme ( 462 )
                          GO:0005665 : DNA-directed RNA polymerase II, core complex ( 85 )
go or no go
• Used frequently for questions such as: is there any functional pattern to my set of co-expressed genes? (overrepresentation of a particular process and/or complex)
• Better than nothing.
• How are the GO terms assigned (e.g. TAS, traceable author statement, vs IEA, inferred from electronic annotation)
• GO slim …
• A framework / starting point
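The overrepresentation question in the first bullet is often scored with a one-sided hypergeometric test. The slides do not name a test, so that choice is an assumption here, and the function name and the numbers in the usage example are hypothetical. This is a minimal sketch using only the standard library.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in a sample.

    population: total number of genes considered
    annotated:  genes in the population carrying the GO term
    sample:     size of the gene set of interest (e.g. a co-expression cluster)
    hits:       genes in the sample carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution: P(X >= hits).
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 6000 genes, 120 annotated with the term,
    # a cluster of 50 genes of which 8 carry the term.
    p = hypergeom_enrichment_p(6000, 120, 50, 8)
    print(f"p = {p:.3g}")
```

When many GO terms are tested for the same gene set, the resulting p-values need a multiple-testing correction, and the outcome depends on which evidence codes (e.g. TAS vs IEA) are admitted as annotations.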
Use for questions like: what portion of its proteins does human devote to transcription regulation?
• GO allows for such questions
• Controlled vocabulary
• Conceptual framework for thinking about our knowledge of cellular mechanisms
E(nzyme) C(ommission) number: a hierarchical system to describe enzymatic function
• EC 1 Oxidoreductases
• EC 2 Transferases
• EC 3 Hydrolases
• EC 4 Lyases
• EC 5 Isomerases
• EC 6 Ligases
• EC 2.7 Transferring phosphorus-containing groups
• EC 2.7.7 Nucleotidyltransferases
• EC 2.7.7.6 DNA-directed RNA polymerase
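Because each EC number encodes its own position in the hierarchy, the broader classes can be recovered by splitting on the dots. The sketch below illustrates this with the drill-down shown above. The function names are made up for illustration, and the name table holds only the entries from the slide.

```python
# Names taken from the slide; only the levels shown there are included.
EC_NAMES = {
    "2": "Transferases",
    "2.7": "Transferring phosphorus-containing groups",
    "2.7.7": "Nucleotidyltransferases",
    "2.7.7.6": "DNA-directed RNA polymerase",
}


def ec_levels(ec_number):
    """Expand an EC number into its prefixes, broadest class first."""
    parts = ec_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def shared_depth(ec_a, ec_b):
    """Count how many leading hierarchy levels two EC numbers have in common."""
    depth = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        if a != b:
            break
        depth += 1
    return depth


if __name__ == "__main__":
    for level in ec_levels("2.7.7.6"):
        print(f"EC {level} {EC_NAMES[level]}")
    # EC 2.7.7.7 is DNA-directed DNA polymerase: same first three levels.
    print(shared_depth("2.7.7.6", "2.7.7.7"))
```

The shared depth gives a rough measure of how similar two enzymatic functions are: the two polymerases differ only at the fourth level, mirroring their common GO parent, nucleotidyltransferase activity.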
Homology ~ molecular function
• In other words, regarding metabolic pathways: homologs
are observed to catalyze similar reactions, but
often in different pathways.
Homology ~ molecular function
So if we do function prediction using sequence (i.e. BLAST, trees, etc.), then?
• If we think we see an ortholog, we can transfer a lot of aspects of function and role
• If we see only a homolog, we can transfer only some aspects of molecular function, but not process / role
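The rule of thumb above can be written down as a small filter over the annotations of a database hit. This is a deliberately crude sketch: the rule table and function name are hypothetical, and it treats "some aspects of molecular function" as all of the hit's molecular function terms, which is more generous than the slide suggests.

```python
# Hypothetical rule table: which GO aspects to copy from a hit,
# depending on its inferred relationship to the query gene.
TRANSFERABLE = {
    "ortholog": {"molecular_function", "biological_process"},
    "homolog": {"molecular_function"},
}


def transfer_annotations(hit_annotations, relationship):
    """Keep only the hit's (aspect, term) pairs whose aspect may be transferred."""
    allowed = TRANSFERABLE.get(relationship, set())
    return [(aspect, term) for aspect, term in hit_annotations if aspect in allowed]


if __name__ == "__main__":
    hit = [
        ("molecular_function", "GO:0003887"),  # DNA-directed DNA polymerase activity
        ("biological_process", "GO:0006260"),  # DNA replication
    ]
    print(transfer_annotations(hit, "ortholog"))
    print(transfer_annotations(hit, "homolog"))
```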
Examples
• Fringe as glycosyltransferase
• ATPase family associated with various cellular activities (AAA)
AAA family proteins often perform chaperone-like functions that assist in the assembly, operation, or disassembly of protein complexes
• … so how to place query genes in a process/role
then?