Analyzing transcription modules in the pathogenic yeast Candida albicans
Elik Chapnik, Yoav Amiram
Supervisor: Dr. Naama Barkai

Background (1) – C. albicans
• Opportunistic fungal pathogen
• Genome was recently sequenced
• Lack of sufficient annotation of genes
• Distant cousins: S. cerevisiae
  – SC is the yeast model organism
  – SC is used as a model to study CA
  – comparative genomics: what are the tools?

Background (2) – Tools
• BLAST
• DNA Microarrays
  – monitors 1000's of genes simultaneously
  – co-expression patterns can provide functional links
• Cluster Analysis, SVD
  – limited size of data sets
  – mutually exclusive clusters
  – expression analyzed under all conditions
[Figure: expression matrix; axes: Genes, Conditions]

Background (2) – Tools
• "Transcription Modules" (TMs):
  – a self-consistent regulatory unit
  – co-regulated genes and their regulating conditions
• Signature Algorithm
  – global decomposition into TMs
  – robust, fast
  – integration of external data
  – if no a priori information exists, can be applied iteratively (ISA)

Better understanding of CA via SC data
• Expression levels of SC have been measured for over 1000 conditions
• Emerging quantities of CA microarray experiments
• Genomes are both fully sequenced
What can be done with all this?
1. Large scale expression analysis of CA (Dr. Barkai's group and Prof. Judith Berman)
2. Use the homology between SC and CA
  – focus on selected annotated SC transcription modules
  – use the information from SC TMs to study CA

Main goal of the project (1)
Annotating C. albicans ORFs with unknown functions
Measures:
1. computing pair-wise correlations between genes in TMs (Pearson correlation coefficient)

Main goal of the project (2)
Measures (cont.):
2.
Search for cis-regulatory elements (CREs) in the upstream region of genes
  – find over-represented sequences in the upstream region of genes in the SC modules, using computational DNA pattern recognition methods
  – search for previously identified cis-regulatory elements in the CA homologue modules

Tools and methods
• Programming software: MATLAB 6.5
• Cluster analysis tools: GeneHopping
• Sequence data: Stanford Genome Technology Center
• Expression data: C. albicans expression data was provided by Prof. Berman's lab
• Software for CRE prediction: MEME, TESS, EPD, CONSENSUS

Generating modules
[Diagram: Yeast Module → BLAST → Candida Homologue Module → signature algorithm → Candida Refined Module]
And the modules are:
[Figure: plot not recoverable; axis ticks run from -0.2 to 1; panel label: Candida Refined Module]

Identifying co-regulation
• Find all pair-wise correlations between the module genes using the Pearson correlation coefficient
• Apply statistical significance tests: generate random modules to compute Z-scores
[Diagram: Candida Refined Module, Candida Homologue Module, Yeast Module; each is labelled "Average Correlation + Z-score", and the three are compared with ">" and "<"]

Statistical analysis
1. Generate random modules by reshuffling genes in the whole-genome database
2. Compute average correlations for the random and "real" modules
3. Calculate the mean and standard deviation from the set of random modules
4. Calculate Z-scores of the "real" modules
5. A high Z-score (>2) represents a statistically significant correlated module

Two slides ago…
[Diagram: Yeast Module → BLAST → Candida Homologue Module → signature algorithm → Candida Refined Module]

Identification of cis-regulatory elements
• Find common CRE in Yeast Module
[Diagram: Yeast Module, Candida Homologue Module, Candida Refined Module; gene groups labelled Rejected, Overlapped, Included]

Identification of cis-regulatory elements
[Diagram labels: Rejected; Yeast, CRE; Candida Homologue Module, CRE ?]
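The five "Statistical analysis" steps listed earlier can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB code used in the project: the function names, the toy data set, the number of random modules (200), and the choice to treat a flat expression profile as uncorrelated are all assumptions made for the example.

```python
import random
import statistics
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat profile counts as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def average_correlation(expression, genes):
    """Mean Pearson correlation over all gene pairs of a module."""
    values = [pearson(expression[g], expression[h])
              for g, h in combinations(genes, 2)]
    return statistics.fmean(values)


def module_z_score(expression, module, n_random=200, seed=0):
    """Return (average correlation, Z-score) of a module, where the Z-score
    is measured against random modules of the same size drawn from all genes."""
    rng = random.Random(seed)
    all_genes = sorted(expression)
    observed = average_correlation(expression, module)
    random_means = [
        average_correlation(expression, rng.sample(all_genes, len(module)))
        for _ in range(n_random)
    ]
    mu = statistics.fmean(random_means)
    sigma = statistics.stdev(random_means)
    return observed, (observed - mu) / sigma


def toy_expression(seed=1, conditions=30):
    """Hypothetical data: 10 genes sharing one signal plus 90 noise genes."""
    rng = random.Random(seed)
    signal = [rng.gauss(0, 1) for _ in range(conditions)]
    expression = {f"mod{i}": [s + rng.gauss(0, 0.5) for s in signal]
                  for i in range(10)}
    expression.update({f"bg{i}": [rng.gauss(0, 1) for _ in range(conditions)]
                       for i in range(90)})
    return expression


if __name__ == "__main__":
    data = toy_expression()
    avg, z = module_z_score(data, [f"mod{i}" for i in range(10)])
    print(f"average correlation = {avg:.3f}, Z-score = {z:.1f}")
```

On the toy data the co-regulated genes should score far above the Z > 2 threshold, while a module of randomly chosen genes should stay close to zero.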
[Diagram labels, continued: Overlapped; Included; CRE Rejected; CRE Included; CRE Overlapped; Candida Refined Module]
Our prediction for CRE % and mean CRE in each module

Results – co-regulation of SC aa Module
Average Correlation 0.34816, Z-Score = 106.9

Results – co-regulation of modules
Each cell: Mean Correlation ± Standard Deviation [Z-Score]
Module name | S. cerevisiae | C. albicans homologue module | C. albicans refined module
Amino acid Biosynthesis | 0.34816 ± 0.0029 [106.9] | 0.043325 ± 0.0038 [7.5693] | 0.26942 ± 0.0082 [31.038]
Cell Cycle G1 | 0.2921 ± 0.0028 [90.0693] | 0.0475 ± 0.0047 [7.0945] | 0.18 ± 0.0079 [20.926]
rRNA Processing | 0.674 ± 0.0045 [142.113] | 0.3216 ± 0.0051 [60.2796] | 0.3097 ± 0.0023 [127.507]
Proteasome Subunits | 0.4211 ± 0.0054 [71.2679] | 0.1611 ± 0.0078 [18.8772] | 0.2342 ± 0.0045 [48.9743]
[Colour scale: correlation bins from 0.0-0.1 up to 0.9-1.0]

Results – co-regulation between SC modules
Each cell: Mean Correlation ± Standard Deviation [Z-Score]
 | Amino acid Biosynthesis (13.7) | Cell Cycle G1 (12.9) | rRNA Processing (12.6) | Proteasome subunits (11.31)
Amino acid Biosynthesis (13.7) | --- | -0.0216 ± 0.0017 [-35.0476] | 0.0042 ± 0.0025 [-13.9315] | 0.0337 ± 0.0031 [-1.6166]
Cell Cycle G1 (12.9) | -0.0216 ± 0.0017 [-35.0476] | --- | 0.0779 ± 0.0024 [16.1595] | 0.0203 ± 0.0025 [-7.2475]
rRNA Processing (12.6) | 0.0042 ± 0.0025 [-13.9315] | 0.0779 ± 0.0024 [16.1595] | --- | -0.1241 ± 0.0033 [-48.9049]
Proteasome subunits (11.31) | 0.0337 ± 0.0031 [-1.6166] | 0.0203 ± 0.0025 [-7.2475] | -0.1241 ± 0.0033 [-48.9049] | ---
[Colour legend: modules are anti-regulated / modules are co-regulated]

Results – co-regulation between CA modules
Each cell: Mean Correlation ± Standard Deviation [Z-Score]
 | Amino acid Biosynthesis (13.7) | Cell Cycle G1 (12.9) | rRNA Processing (12.6) | Proteasome subunits (11.31)
Amino acid Biosynthesis (13.7) | --- | -0.0078 ± 0.0051 [-4.5555] | 0.0622 ± 0.0032 [14.8978] | -2.02E-04 ± 0.0041 [-3.5271]
Cell Cycle G1 (12.9) | -0.0078 ± 0.0051 [-4.5555] | --- | 0.0117 ± 0.0034 [-0.9320] | 0.0341 ± 0.0041 [4.7324]
rRNA Processing (12.6) | 0.0622 ± 0.0032 [14.8978] | 0.0117 ± 0.0034 [-0.9320] | --- | -0.0028 ± 0.0026 [-6.6787]
Proteasome subunits (11.31) | -2.02E-04 ± 0.0041 [-3.5271] |
0.0341 ± 0.0041 [4.7324] | -0.0028 ± 0.0026 [-6.6787] | ---
[Colour legend: modules are anti-regulated / modules are co-regulated]

Results – cis-regulatory elements in the aa modules
CRE: TGACTC. Each value pair: CRE %, mean CRE
• Yeast Module: 46%, 1.25
• Candida Homologue Module: 34%, 1.06
• Rejected: 29%, 1.00
• Overlapped: 52%, 1.18
• Included: 54%, 1.29
• Candida Refined Module: 53%, 1.22

Results – cis-regulatory elements chart
Each cell: # of genes, CRE %, mean CRE
Module name | S. cerevisiae | C. albicans homologue module | Rejected genes | Included genes | Overlapped genes | C. albicans refined module
Amino acid Biosynthesis | 156, 46%, 1.25 | 98, 34%, 1.06 | 77, 29%, 1 | 13, 54%, 1.285 | 21, 52%, 1.181 | 34, 53%, 1.222
rRNA Processing (12.6) | 61, 67%, 1.585 | 55, 42%, 1.304 | 9, 44%, 1.25 | 219, 32%, 1.225 | 46, 41%, 1.315 | 265, 34%, 1.24
Proteasome subunits (10.14) | 41, 37%, 1 | 37, 19%, 1.428 | 11, 18%, 1 | 38, 16%, 1.166 | 26, 19%, 1.6 | 64, 17%, 1.363
Proteasome subunits (11.31) | 45, 62%, 1.071 | 39, 23%, 1 | 13, 23%, 1 | 38, 13%, 1 | 26, 23%, 1 | 64, 17%, 1
Cell Cycle G1 (12.9) | 124, 59%, 1.41 | 71, 46%, 1 | 52, 42%, 1 | 14, 29%, 1 | 19, 58%, 1 | 33, 45%, 1
Cell Cycle G1 (16.4) | 158, 52%, 1.378 | 88, 45%, 1.025 | 67, 40%, 1.037 | 13, 23%, 1 | 21, 62%, 1 | 34, 47%, 1

Conclusions
• Co-regulation:
  – Different co-regulation schemes can point out alternative gene function between SC and CA
  – Investigate the relations between "real" CA modules and refined CA modules with a similar annotation
• cis-regulatory elements:
  – CRE as a function of homology
  – CRE as a function of co-regulation
  – Low expression of SC CRE as an indicator for biological importance
  – Not all CREs are conserved between the organisms: GCN4 vs. GAL4

Future research tasks
• Experimental validation of functional assignment:
  – verify if the cis-regulatory elements found in C. albicans are biologically active
  – test the conservation of function across homologue modules of S. cerevisiae and C. albicans

Acknowledgements
• Naama Barkai – Weizmann Institute
• Judith Berman – University of Minnesota
• Sven Bergmann – Barkai's group
• Jan Ihmels – Barkai's group
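The "CRE %" and "Mean CRE" columns of the cis-regulatory elements chart can be illustrated with a small Python sketch. The slides do not define the two measures, so the definitions here are assumptions: CRE % is taken as the percentage of genes whose upstream sequence contains the motif at least once, and mean CRE as the average number of occurrences among those genes (which would explain why the reported values never drop below 1). The gene names and sequences are invented for the example, and only the forward strand is searched.

```python
def count_motif(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps."""
    sequence, motif = sequence.upper(), motif.upper()
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


def cre_statistics(upstream, motif):
    """Return (CRE %, mean CRE) for one module.

    upstream maps gene name -> upstream sequence (hypothetical input format).
    Assumed definitions: CRE % is the percentage of genes with at least one
    motif occurrence; mean CRE is the average count among those genes.
    """
    counts = [count_motif(seq, motif) for seq in upstream.values()]
    hits = [c for c in counts if c > 0]
    if not hits:
        return 0.0, 0.0
    return 100.0 * len(hits) / len(counts), sum(hits) / len(hits)


if __name__ == "__main__":
    # Invented upstream sequences, for illustration only.
    module = {
        "geneA": "ATTGACTCGGATGACTCA",  # two occurrences of TGACTC
        "geneB": "CCCTGACTCTTT",        # one occurrence
        "geneC": "ACGTACGTACGT",        # none
        "geneD": "GGGGCCCCAAAA",        # none
    }
    percent, mean_cre = cre_statistics(module, "TGACTC")
    print(f"CRE % = {percent:.0f}%, mean CRE = {mean_cre:.2f}")
```

Applying the same function to the Rejected, Included, and Overlapped gene sets separately would give one value pair per group, as in the chart.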