Bioinfo primer - part 6/6

Christophe Roos - MediCel ltd [email protected]
High throughput data acquisition
New issues in storage and analysis
Annotating genomes with functional information: automatic but without errors?
Christophe Roos - 6/6 Functional genomics Spring 2002

Genome annotation
• Annotation is the sum of all non-sequence information that can be connected to any sequence
[Figure: annotations connected to a gene sequence. Labels: phylogenetic inference; metabolic profiles; sequence homologs in other genomes; connectors to other maps; cofactors and metabolites; metabolic map locator; functional chemistry; experimental data; genome location; expression info; raw images; numerical values; cluster genes; structure; raw data; electron density; structure annotation; SS assignments]

Genome annotation
• The primary sources of information about what genes do are laboratory experiments. It may take several experiments to obtain one data point.
• All that data should ideally be associated, that is, hyperlinked among databases.
– Magpie is an environment for genome annotation
• Compare genomes to learn how their structure affects function
– Bacteria have modules of genes that function together, organised in ‘operons’
– Higher organisms need to pack the DNA to fit it into the nucleus. Activating a gene means unpacking, which is not efficient if it is done for each gene separately

Functional genomics
• High throughput technologies give us long lists of the parts of systems (chromosomes, genomes, cells, etc). We can now analyse how these parts work together to produce the complexity of organisms.
• The functions of the genome are
– Metabolism: metabolic pathways convert chemical energy derived from food into useful work in the cell.
– Regulation: regulatory pathways are biochemical mechanisms that control what genomic DNA does. They switch genes on and off in a controlled way.
– Signalling: signalling pathways control the movement of information (chemicals) from one component to another on many levels
– Construction
• Functional genomics tries to map these pathways

Analysing the activity of the genome
• Genomics: look at the transcriptional activity of genes
– Transcription: when a gene is transcriptionally active, messenger RNA (mRNA) is synthesised. The amount of mRNA from each active gene varies over time.
– Turnover: different mRNA species have different half-lives.
– Translation: the production of an mRNA does not imply that the corresponding protein is translated. Transcripts can also be produced for storage and later use.
– Technically feasible: it is possible to isolate all mRNAs from cells and to quantify them within certain limits.
• Proteomics: look at proteins instead of transcripts
– Limited: at present, acceptable efficiency comes at the expense of insufficient quality
– Closer to ’reality’, since the proteins are the players

EST: Expressed sequence tags
• ESTs are partial sequences of cDNA clones. cDNA clones are DNA synthesised in vitro using mRNA as template.
– Why? cDNA is more stable than mRNA
– How? cDNA can be made ‘en masse’ starting from total cellular mRNA isolates. cDNA libraries are specific for tissue, developmental time, stimulation etc.
– Therefore, looking at cDNA is looking at mRNA, which is looking at active genes.
– To look at cDNA means sequencing (part of) it.
• Clones are picked at random (10’000-200’000)
• Each clone is sequenced once from one or both ends (no proofreading)
• The sequences are entered into EST sequence databases

EST: Expressed sequence tags
• Construct a clone by inserting a piece of DNA into a ’vector’.
• The vector and its insert behave as an independent unit (’plasmid’) in the bacterial host. The vector carries some additional genes to allow for selection (only those bacteria with the vector will survive on antibiotics)
• Amplify and sequence
• Iterate (in parallel)

DNA hybridisation
• DNA is a double helix and can be separated into two strands by denaturing treatment. Each strand becomes ’sticky’ and attempts to renature with homologous single-strand sequences to form hybrids.
• Single-strand DNA from all known genes of a given species can be attached to a matrix, then probed with labelled cDNA molecules from a given sample. Only complementary probes will hybridise, and they can be detected if they have been labelled beforehand (radioactivity, fluorescent stain, ...)
• The technique can be multiplexed:
– High density arrays carrying sticky probes from a full genome
– Parallel hybridisation with cDNA from various sources

The process of using microarrays
[Flowchart with three stages: Building the chip; Preparing RNA; Hybridising the chip. Steps shown: massive PCR; PCR purification and preparation; preparing slides; printing; post processing; cell culture and harvest; RNA isolation; cDNA production; probe labeling; array hybridization; data analysis]

The output: the image raw data
cDNA is prepared from two samples (in this example) and labelled, each sample with a distinct color. Then the array is hybridised with the double probe and the signal is recorded as images.
[Figure labels: scanning; laser 1; laser 2; emission; overlay images and normalise; analysis]

Problems in image analysis
• Noise
• Spot detection and intensity
• Alignment if overlay

A set of experiments on yeast...
• Each row represents one gene
• Each column represents one experiment
– The columns have been organised into related sets of experiments (ALPH, ELU,...)
• The colors indicate gene activity (from high to absent)

Clustering the resulting data
• Looking at 10’000 genes is not easy
• Group genes into clusters of genes that behave the same way over a set of several experiments
– Hierarchical clustering
– K-means clustering
– Self-organising maps (SOM)
– Etc.

The overall process with microarrays
• Microarray data has to be used in a larger frame of experimentation

Making a model of the data
[Figure: three chains, Sequence -> Structure -> Function; Interaction -> Network -> Function; Genome -> Transcriptome -> Proteome]
1. Elements
2. Binary relations
3. Networks
[Figure labels: assembly; neighbour; cluster; pathway; genome; hierarchical tree]

Comparing networks
• Gain new biological information by comparison of networks
• What is the metric?
• How is it done? Is it simply a problem of graph isomorphism?
[Figure labels: Pathway vs. Pathway; Pathway vs. Genome; Genome vs. Genome; Cluster vs. Pathway]

Biological graph comparison
• Search heuristically for clusters of correspondence
[Figure: Graph 1 (nodes A to K) and Graph 2 (nodes a to k), a table of correspondences between their nodes (A-a, B-b, C-c, D-d, ...), and the clusters of corresponding nodes found by the clustering algorithm]

Example: genomic, metabolic, structural
Genome-pathway comparison, which reveals the correlation between (a) the physical coupling of genes in the genome (operon structure) and (b) the functional coupling of gene products in the pathway
[Figure: E. coli genome, genes shown: hisL, hisG, hisD, yefM, hisC, hisB, hisH, hisA, hisF, hisI, yzzB]

Example: genomic, metabolic, structural
[Figure: pathway map HISTIDINE METABOLISM, with enzymes labelled by EC number (for example 2.4.2.17, 3.6.1.31, 3.5.4.19, 5.3.1.16, 4.2.1.19, 2.6.1.9, 3.1.3.15, 1.1.1.23) and compounds from PRPP through L-Histidinol-P and L-Histidinal to L-Histidine, plus Histamine, Carnosine and 1-Methyl-L-histidine; links to the pentose phosphate cycle and purine metabolism]

Example: genomic, metabolic, structural
SCOP hierarchical tree
1. All alpha
2. All beta
3. Alpha and beta (a/b)
   3.1 beta/alpha (TIM)-barrel
   3.2 Cellulases
   . . .
   3.74 Thiolase
   3.75 Cytidine deaminase
4. Alpha and beta (a+b)
5. Multi-domain (alpha and beta)
6. Membrane and cell surface pro…
7. Small proteins
8. Peptides
9. Designed proteins
10. Non-protein
[Figure: pathway map "…NE, TYROSINE AND TRYPTOPHAN BIOSYNTHESIS" (title partly hidden by the SCOP tree), with enzymes labelled by EC number and compounds including 3-Dehydroquinate, 3-Dehydroshikimate, Shikimate, Chorismate, Prephenate, Pretyrosine, Phenylpyruvate, 4-Hydroxyphenylpyruvate, Anthranilate, Indole, Phenylalanine, Tyrosine and L-Tryptophan; links to tryptophan metabolism, tyrosine metabolism, ubiquinone biosynthesis, folate biosynthesis and alkaloid biosynthesis I]

More challenges?
The list of genes that are activated, inactivated or unaffected when two samples are compared becomes more informative if the genes can be mapped onto maps from which functions can be deduced.
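As a rough illustration of that last point, the sketch below groups genes by the map they belong to and reports, per map, which genes are activated, inactivated or unaffected. It is only a minimal sketch under stated assumptions: the log2 ratios are invented, the gene-to-map table is a small hand-written stand-in for a real pathway database, and the names classify, summarise_by_map and the threshold of 1.0 are hypothetical choices, not part of the original slides. The gene names are taken from the E. coli example above.

```python
from collections import defaultdict

# Hypothetical log2 expression ratios (sample 2 vs. sample 1).
# The values are invented for illustration only.
log_ratios = {
    "hisG": 1.8, "hisD": 1.5, "hisC": 1.2, "hisB": 0.1,
    "yefM": -1.4, "yzzB": 0.0,
}

# Illustrative gene-to-map table; a real one would come from a pathway database.
gene_to_maps = {
    "hisG": ["Histidine metabolism"],
    "hisD": ["Histidine metabolism"],
    "hisC": ["Histidine metabolism"],
    "hisB": ["Histidine metabolism"],
}


def classify(ratio, threshold=1.0):
    """Label a gene by its log2 ratio; the threshold is an arbitrary assumption."""
    if ratio >= threshold:
        return "activated"
    if ratio <= -threshold:
        return "inactivated"
    return "unaffected"


def summarise_by_map(log_ratios, gene_to_maps, threshold=1.0):
    """Return {map name: {state: [genes]}}; genes without a map go to '(unmapped)'."""
    summary = defaultdict(
        lambda: {"activated": [], "inactivated": [], "unaffected": []}
    )
    for gene, ratio in sorted(log_ratios.items()):
        state = classify(ratio, threshold)
        for map_name in gene_to_maps.get(gene, ["(unmapped)"]):
            summary[map_name][state].append(gene)
    return dict(summary)


summary = summarise_by_map(log_ratios, gene_to_maps)
for map_name, states in summary.items():
    print(map_name, states)
```

Several activated genes landing on the same map is the kind of pattern a flat gene list hides. Genes that fall under "(unmapped)" are the ones for which no function can yet be deduced this way.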
