Systems Biology through
Pathway Statistics
Chris Evelo
BiGCaT Bioinformatics Group – BMT-TU/e & UM
Diepenbeek; May 14 2004
BiGCaT Bioinformatics
Where the cat hunts
BiGCaT Bioinformatics,
bridge between two universities
Universiteit Maastricht: patients, experiments, arrays and loads of data
TU/e: ideas & experience in data handling
LUC Diepenbeek: statistical foundations
BiGCaT Bioinformatics,
between two research fields
Cardiovascular Research
Nutritional & Environmental Research
Our usual prey:
gene expression arrays
Microarrays: relative fluorescence signals. Identification.
Macroarrays: absolute radioactive signal. Validation.
Transcriptomics:
The study of genome wide gene expression on the transcriptional level
Where genome wide means: >20K genes.
And transcriptional level means that somehow >20K mRNA sequences have to be analyzed.
And >20K expression values have to be filtered, normalized, replicate treated, clustered and understood.
Thus no transcriptomics without bioinformatics.
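The filter, normalize and replicate steps named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two-channel arrays, a hypothetical minimum-signal cutoff of 50, median centring as the normalization, and plain averaging as the replicate treatment. None of these choices or names come from the slides.

```python
import math
from statistics import mean, median


def normalize_array(array, min_signal=50.0):
    """Filter weak spots, then median-centre the log2 ratios of one array.

    `array` maps gene -> (sample_intensity, reference_intensity).
    The cutoff `min_signal` is a hypothetical value chosen for illustration.
    """
    log_ratios = {
        gene: math.log2(sample / reference)
        for gene, (sample, reference) in array.items()
        if sample >= min_signal and reference >= min_signal
    }
    centre = median(log_ratios.values())
    return {gene: value - centre for gene, value in log_ratios.items()}


def combine_replicates(arrays, min_signal=50.0):
    """Average normalized log2 ratios over genes that pass the filter on every array."""
    normalized = [normalize_array(a, min_signal) for a in arrays]
    genes = set.intersection(*(set(n) for n in normalized))
    return {gene: mean(n[gene] for n in normalized) for gene in sorted(genes)}


# Made-up intensities for two replicate arrays (gene names are placeholders).
arrays = [
    {"geneA": (400, 100), "geneB": (100, 100), "geneC": (200, 200), "geneD": (10, 300)},
    {"geneA": (800, 100), "geneB": (200, 100), "geneC": (200, 100), "geneD": (500, 500)},
]
result = combine_replicates(arrays)
print(result)
```

On a real array the same steps run over more than 20K genes, which is the slide's point: the work is not feasible by hand.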
No separate statistics?
Previous slide: “…have to be filtered, normalized, replicate treated, clustered and understood”
Don’t we have to know which genes really changed?
Changed?
We need statistical proof of genes changing because…
Scientists ask for it.
Journals ask for it.

But do we really need it?
No, we don’t!

Biologists will double-check anyway.

The largest problem is false positives:
1 in 1000 means 20 false positives on a 20K-gene array!
Replicate filtering gets rid of that, losing very little power
(of course, that needed statistical proof).
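The arithmetic behind the false-positive claim can be checked with a short Python sketch. It assumes that replicate arrays err independently and that "replicate filtering" means a gene must be called changed on every replicate. The per-array sensitivity of 0.95 is a hypothetical figure used only to illustrate the small loss of power; it does not come from the slides.

```python
def expected_false_positives(n_genes, false_positive_rate, n_replicates=1):
    """Expected number of unchanged genes called 'changed' on every replicate.

    Assumes independent replicates, so the per-array rates multiply.
    """
    return n_genes * false_positive_rate ** n_replicates


def power_after_filtering(sensitivity, n_replicates=1):
    """Chance that a truly changed gene is called on every replicate (independence assumed)."""
    return sensitivity ** n_replicates


single_array = expected_false_positives(20_000, 0.001)                      # about 20
duplicate_arrays = expected_false_positives(20_000, 0.001, n_replicates=2)  # about 0.02
power = power_after_filtering(0.95, n_replicates=2)  # 0.95 is a hypothetical sensitivity

print(f"one array: {single_array:.2f} expected false positives")
print(f"two arrays: {duplicate_arrays:.2f} expected false positives")
print(f"power with two arrays: {power:.4f}")
```

Under these assumptions a second array cuts the expected false positives a thousandfold, while the chance of detecting a real change drops only from 0.95 to about 0.90.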

To understand we need pathways not
single genes (or proteins)
Two types of arrays
Single longer (>60 mer) cDNA reporters
Agilent, Incyte, custom
1 value per reporter
Reference variability or multi array stats

Multi short (25 mer) oligo reporters
Affymetrix
16-20 values per reporter
Single array statistics
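Why 16-20 values per reporter allow statistics on a single array can be illustrated with an exact sign test over the probe values of one reporter. This is a sketch only, not the algorithm used by Affymetrix software. It assumes each probe yields a perfect-match (PM) and a mismatch (MM) intensity, and the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(pm_values, mm_values):
    """One-sided exact sign test: is PM above MM more often than chance would give?

    Tied pairs are dropped. Returns P(X >= wins) for X ~ Binomial(n, 0.5).
    """
    pairs = [(pm, mm) for pm, mm in zip(pm_values, mm_values) if pm != mm]
    n = len(pairs)
    wins = sum(pm > mm for pm, mm in pairs)
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


# Made-up intensities for one reporter with 16 probe pairs, PM always above MM.
pm = [300 + 10 * i for i in range(16)]
mm = [100 + 5 * i for i in range(16)]
p_value = sign_test_p_value(pm, mm)
print(p_value)
```

With a single value per reporter no such within-array test is possible, so the variability has to come from the reference channel or from several arrays.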
Systems Biology Triangle
Transcriptomics: microarrays, 20 k (available)
Proteomics: 2D-gels, antibody techniques (developing inside)
Metabolomics: large scale analytical chemistry (developing outside)
Systems Biology (centre of the triangle)
Proteomics would be:
The study of genome wide gene expression on the translational level

Where genome wide would mean: >20K proteins.
Then proteomics does not yet exist!
Protein variants derived from single genes
Alternative splicing?
Phosphorylation?
Modification?
Two types of omics
Transcriptomics
Microarrays
Values for 20 K genes
Annotation difficult

Proteomics
Currently only 2D+MS
Only 20-50 identified proteins
Annotation is identification
Plus modifications
Gene Ontology (GO) levels (I)
The Gene Ontology (GO) project gives consistent descriptions
of gene products from different databases.
AmiGO browser: http://www.godatabase.org/cgi-bin/go.cgi
GO consortium: http://www.geneontology.org
Gene Ontology (GO) levels (II)
Use of GO classification
-GenMAPP-
GenMAPP = Gene MicroArray Pathway Profiler
Program to visualize gene expression data on MAPPs
representing biological pathways and groupings of genes
* Local MAPPs
contain pathways made by specific research institutes
* Gene Ontology (GO) MAPPs
contain pathways with functionally related genes from the public
Gene Ontology Project
Example Local MAPP
Example GO MAPP
Local MAPP
GO MAPP
Understanding changes


Map changed genes/proteins (quantitatively or qualitatively) to known pathways.
Or use information from the Gene Ontology (GO) database.
Steal and smartly adapt a transcriptomics tool: GenMAPP/MAPPFinder
Rachel will show some examples
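As a rough illustration of what mapping changed genes to known pathways involves (this is not one of the examples from the talk), the Python sketch below counts the changed genes in each pathway and scores the overlap with a hypergeometric test. It is not the GenMAPP/MAPPFinder implementation. The pathway names, gene identifiers and function names are hypothetical.

```python
from math import comb


def overrepresentation_p(n_measured, n_changed, n_in_pathway, n_hits):
    """Hypergeometric P(X >= n_hits): chance of at least this many changed genes
    in the pathway if the changed genes were spread at random over the array."""
    upper = min(n_changed, n_in_pathway)
    favourable = sum(
        comb(n_in_pathway, k) * comb(n_measured - n_in_pathway, n_changed - k)
        for k in range(n_hits, upper + 1)
    )
    return favourable / comb(n_measured, n_changed)


def map_to_pathways(changed, measured, pathways):
    """Return {pathway: (changed genes in the pathway, over-representation p-value)}."""
    changed = changed & measured
    results = {}
    for name, members in pathways.items():
        on_array = members & measured  # ignore pathway genes that were not measured
        hits = on_array & changed
        p = overrepresentation_p(len(measured), len(changed), len(on_array), len(hits))
        results[name] = (sorted(hits), p)
    return results


# Hypothetical example: 10 measured genes, 3 of them changed.
measured = {f"g{i}" for i in range(1, 11)}
changed = {"g1", "g2", "g3"}
pathways = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g5", "g6", "g7", "gX"},
}
results = map_to_pathways(changed, measured, pathways)
for name, (hits, p) in results.items():
    print(name, hits, round(p, 4))
```

Swapping the pathway dictionary for GO terms gives the GO-based variant of the same idea: the unit of interpretation becomes a group of related genes rather than a single gene.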