Systems Biology through Pathway Statistics
Chris Evelo
BiGCaT Bioinformatics Group – BMT-TU/e & UM
Diepenbeek; May 14 2004

BiGCaT Bioinformatics: where the cat hunts

BiGCaT Bioinformatics, bridge between two universities
* Universiteit Maastricht: patients, experiments, arrays and loads of data
* TU/e: ideas and experience in data handling
* LUC Diepenbeek: statistical foundations

BiGCaT Bioinformatics, between two research fields
* Cardiovascular research
* Nutritional and environmental research

Our usual prey: gene expression arrays
* Microarrays: relative fluorescence signals. Identification.
* Macroarrays: absolute radioactive signal. Validation.

Transcriptomics
The study of genome-wide gene expression at the transcriptional level.
* Genome-wide means >20K genes.
* Transcriptional level means that somehow >20K mRNA sequences have to be analyzed.
* And >20K expression values have to be filtered, normalized, replicate treated, clustered and understood.
Thus: no transcriptomics without bioinformatics.

No separate statistics?
Previous slide: “…have to be: filtered, normalized, replicate treated, clustered and understood”
Don’t we have to know which genes really changed?

Changed?
We need statistical proof of genes changing because…
* Scientists ask for it.
* Journals ask for it.
But do we really need it? No, we don’t!
* Biologists will double-check anyway.
* The largest problem is false positives: a rate of 1 in 1000 means about 20 false hits on a 20K-gene array!
* Replicate filtering gets rid of that, losing very little power.
* Of course, that needed statistical proof.
To understand, we need pathways, not single genes (or proteins).

Two types of arrays
Single longer (>60-mer) cDNA reporters
* Agilent, Incyte, custom
* 1 value per reporter
* Reference variability or multi-array statistics
Multiple short (25-mer) oligo reporters
* Affymetrix
* 16-20 values per reporter
* Single-array statistics

Systems Biology Triangle
* Transcriptomics: microarrays, 20K (available)
* Proteomics: 2D-gels, antibody techniques (developing inside)
* Metabolomics: large-scale analytical chemistry (developing outside)

Proteomics would be:
The study of genome-wide gene expression at the translational level, where genome-wide would mean >20K proteins.
Then proteomics does not yet exist!

Protein variants derived from single genes
* Alternative splicing?
* Phosphorylation?
* Modification?

Two types of omics
Transcriptomics
* Microarrays
* Values for 20K genes
* Annotation difficult
Proteomics
* Currently only 2D + MS
* Only 20-50 identified proteins
* Annotation is identification
* Plus modifications

Gene Ontology (GO) levels (I)
The Gene Ontology (GO) project gives consistent descriptions of gene products from different databases.
* AmiGO browser: http://www.godatabase.org/cgi-bin/go.cgi
* GO consortium: http://www.geneontology.org

Gene Ontology (GO) levels (II)

Use of GO classification -GenMAPP-
GenMAPP = Gene MicroArray Pathway Profiler
Program to visualize gene expression data on MAPPs representing biological pathways and groupings of genes
* Local MAPPs contain pathways made by specific research institutes
* Gene Ontology (GO) MAPPs contain pathways with functionally related genes from the public Gene Ontology Project

Example Local MAPP

Example GO MAPP

Local MAPP and GO MAPP

Understanding changes
* Map changed genes/proteins (quantitatively or qualitatively) to known pathways.
* Or use information from the Gene Ontology (GO) database.
* Steal and smartly adapt a transcriptomics tool: GenMAPP/MAPPFinder.
Rachel will show some examples.
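
The idea of mapping changed genes to known pathways can be illustrated with a small over-representation calculation. The sketch below is not taken from the slides and is not presented as the statistic GenMAPP/MAPPFinder computes. It assumes a plain hypergeometric tail probability, a common choice for this kind of question. The gene identifiers, pathway names and the functions `hypergeom_tail` and `rank_pathways` are hypothetical.

```python
from math import comb


def hypergeom_tail(n_total, n_changed, n_pathway, n_overlap):
    """P(X >= n_overlap) when n_pathway genes are drawn without replacement
    from n_total measured genes, of which n_changed are flagged as changed."""
    upper = min(n_changed, n_pathway)
    favourable = sum(
        comb(n_changed, k) * comb(n_total - n_changed, n_pathway - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_total, n_pathway)


def rank_pathways(measured, changed, pathways):
    """Return (name, hits, pathway size on array, p-value), smallest p first."""
    changed_on_array = changed & measured
    results = []
    for name, genes in pathways.items():
        on_array = genes & measured        # ignore genes not on the array
        hits = on_array & changed_on_array
        p = hypergeom_tail(
            len(measured), len(changed_on_array), len(on_array), len(hits)
        )
        results.append((name, len(hits), len(on_array), p))
    return sorted(results, key=lambda row: row[3])


# Hypothetical toy data: 20 measured genes, 4 of them changed.
measured = {f"g{i}" for i in range(20)}
changed = {"g0", "g1", "g2", "g3"}
pathways = {
    "pathway_A": {"g0", "g1", "g2", "g10"},
    "pathway_B": {"g3", "g11", "g12", "g13"},
}

ranked = rank_pathways(measured, changed, pathways)
for name, hits, size, p in ranked:
    print(f"{name}: {hits}/{size} changed, p = {p:.4f}")
```

In the toy data, three of the four changed genes fall in `pathway_A`, so it ranks ahead of `pathway_B`, which contains only one. On a real array the same calculation would be repeated for every MAPP or GO term, so the resulting p-values would themselves need correction for multiple testing.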