Analyzing transcription modules in the pathogenic yeast Candida albicans
Elik Chapnik
Yoav Amiram
Supervisor: Dr. Naama Barkai
Background (1) – C. albicans
• Opportunistic fungal pathogen
• Genome was recently sequenced
• Lack of sufficient annotation of genes
• Distant cousins: S. cerevisiae
  – SC is the yeast model organism
  – SC is used as a model to study CA
  – comparative genomics: what are the tools?
Background (2) – Tools
• BLAST
• DNA Microarrays
  – monitors 1000s of genes simultaneously
  – co-expression patterns can provide functional links
• Cluster Analysis, SVD
  – limited size of data sets
  – mutually exclusive clusters
  – expression analyzed under all conditions
[Figure with axes labeled Genes and Conditions]
Background (2) – Tools
• “Transcription Modules” (TMs):
  – a self-consistent regulatory unit
  – co-regulated genes and their regulating conditions
• Signature Algorithm
  – global decomposition into TMs
  – robust, fast
  – integration of external data
  – if no a priori information exists, can be applied iteratively (ISA)
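The two-step idea behind the Signature Algorithm can be illustrated with a heavily simplified Python sketch: score the conditions by the average expression of the input genes, keep the conditions that stand out, then score every gene over those conditions and keep the genes that stand out. The function names, the toy expression values, the z-score thresholds, and the omission of any data normalization are all assumptions made for illustration; this is not the published procedure or the authors' code. Repeating the step until the gene set stops changing mimics the iterative use (ISA).

```python
from statistics import mean, pstdev


def above_threshold(scores, threshold):
    """Keys whose score is more than `threshold` standard deviations above the mean."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        return set()
    return {key for key, s in scores.items() if (s - mu) / sigma > threshold}


def signature_step(expression, input_genes, t_cond=1.0, t_gene=1.0):
    """One simplified pass: input genes -> conditions -> refined gene set."""
    n_cond = len(next(iter(expression.values())))
    # Step 1: score each condition by the mean expression of the input genes.
    cond_scores = {c: mean(expression[g][c] for g in input_genes) for c in range(n_cond)}
    conditions = above_threshold(cond_scores, t_cond)
    if not conditions:
        return set(), set()
    # Step 2: score every gene over the selected conditions, weighted by condition score.
    gene_scores = {
        g: mean(cond_scores[c] * profile[c] for c in conditions)
        for g, profile in expression.items()
    }
    return above_threshold(gene_scores, t_gene), conditions


def iterate_signature(expression, seed_genes, max_iter=20):
    """Repeat the step until the gene set no longer changes (ISA-like use)."""
    genes, conditions = set(seed_genes), set()
    for _ in range(max_iter):
        new_genes, conditions = signature_step(expression, genes)
        if new_genes == genes or not new_genes:
            return new_genes, conditions
        genes = new_genes
    return genes, conditions


# Toy data (hypothetical): g1-g3 respond to conditions 0 and 1, g4 to condition 5.
expression = {
    "g1": [3, 3, 0, 0, 0, 0], "g2": [3, 3, 0, 0, 0, 0], "g3": [3, 3, 0, 0, 0, 0],
    "g4": [0, 0, 0, 0, 0, 3], "g5": [0, 0, 0, 0, 0, 0], "g6": [0, 0, 0, 0, 0, 0],
    "g7": [0, 0, 0, 0, 0, 0], "g8": [0, 0, 0, 0, 0, 0],
}
# The seed set deliberately contains one unrelated gene (g4) and misses g3.
genes, conditions = iterate_signature(expression, {"g1", "g2", "g4"})
print(sorted(genes), sorted(conditions))
```

In this toy case the unrelated seed gene is dropped and the missing co-regulated gene is recovered, which is the kind of refinement the slides attribute to the algorithm.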
Better understanding of CA via SC data
• Expression levels of SC have been measured for over 1000 conditions
• Emerging quantities of CA microarray experiments
• Both genomes are fully sequenced
What can be done with all this?
1. Large-scale expression analysis of CA (Dr. Barkai’s group and Prof. Judith Berman)
2. Use the homology between SC and CA
  – focus on selected annotated SC transcription modules
  – use the information from SC TMs to study CA
Main goal of the project (1)
Annotating C. albicans ORFs with unknown functions
Measures:
1. computing pair-wise correlations between genes in TMs (Pearson correlation coefficient)
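As a minimal sketch of this first measure, the average pair-wise Pearson correlation of a module can be computed as below. The gene names and expression values are made up for illustration, and the function names are assumptions rather than the project's MATLAB code.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pairwise_correlation(expression, module):
    """Mean Pearson correlation over all gene pairs in `module`."""
    pairs = list(combinations(module, 2))
    return sum(pearson(expression[g], expression[h]) for g, h in pairs) / len(pairs)


# Toy expression profiles (hypothetical genes, four conditions each).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with geneA
}
avg = average_pairwise_correlation(expression, ["geneA", "geneB", "geneC"])
print(round(avg, 4))
```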
Main goal of the project (2)
Measures (cont.):
2. Search for cis-regulatory elements (CREs) in the upstream region of genes
  – find over-represented sequences in the upstream regions of genes in the SC modules, using computational DNA pattern recognition methods
  – search for previously identified cis-regulatory elements in the CA homologue modules
Tools and methods
• Programming software: MATLAB 6.5
• Cluster analysis tools: GeneHopping
• Sequence data: Stanford Genome Technology Center
• Expression data: C. albicans expression data was provided by Prof. Berman’s lab
• Software for CRE prediction: MEME, TESS, EPD, CONSENSUS
Generating modules
[Diagram: Yeast Module → (BLAST) → Candida Homologue Module → (signature algorithm) → Candida Refined Module]
And the modules are:
[Figure not preserved in this transcript]
Identifying co-regulation
Find all pair-wise correlations in the module genes using the Pearson correlation coefficient
Apply statistical significance tests: generate random modules to compute Z-scores
[Diagram: average correlation and Z-score for each module: Candida Refined Module > Candida Homologue Module < Yeast Module]
Statistical analysis
1. Generate random modules by reshuffling genes in the whole-genome database
2. Compute average correlations for the random and “real” modules
3. Calculate the mean and standard deviation from the set of random modules
4. Calculate Z-scores of the “real” modules
5. A high Z-score (>2) indicates a statistically significant correlated module
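The five steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's code: random modules are drawn with the same size as the real module, 200 random modules are used, and the toy "genome" is synthetic (five genes sharing one profile plus 95 unrelated genes).

```python
import random
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def average_correlation(expression, module):
    """Mean pair-wise Pearson correlation of the genes in `module`."""
    return mean(pearson(expression[g], expression[h]) for g, h in combinations(module, 2))


def module_z_score(expression, module, n_random=200, seed=0):
    """Z-score of a module's average correlation against random modules."""
    rng = random.Random(seed)
    genome = list(expression)
    real = average_correlation(expression, module)
    # Steps 1-2: random modules of the same size (size matching is an assumption).
    background = [
        average_correlation(expression, rng.sample(genome, len(module)))
        for _ in range(n_random)
    ]
    # Steps 3-4: mean and standard deviation of the random set, then the Z-score.
    return (real - mean(background)) / stdev(background)


# Synthetic toy genome: 5 co-regulated genes and 95 unrelated genes, 20 conditions.
rng = random.Random(1)
shared = [rng.gauss(0, 1) for _ in range(20)]
expression = {}
for i in range(5):
    expression[f"mod{i}"] = [v + rng.gauss(0, 0.1) for v in shared]
for i in range(95):
    expression[f"bg{i}"] = [rng.gauss(0, 1) for _ in range(20)]

module = [f"mod{i}" for i in range(5)]
z = module_z_score(expression, module)
print(f"Z-score = {z:.1f}")  # Step 5: compare against the >2 cutoff
```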
Two slides ago…
[Diagram: Yeast Module → (BLAST) → Candida Homologue Module → (signature algorithm) → Candida Refined Module]
Identification of cis-regulatory elements
[Diagram: find the common CRE in the Yeast Module; the Candida Homologue Module and the Candida Refined Module split the genes into three groups: Rejected, Overlapped, and Included]
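The slides do not spell out how the three groups are defined. One reading that matches the gene counts reported in the CRE chart later in the deck (for amino acid biosynthesis, 77 rejected + 21 overlapped = 98 homologue genes, and 13 included + 21 overlapped = 34 refined genes) is a plain set comparison of the two Candida modules. The sketch below assumes that reading; the ORF names are hypothetical.

```python
def classify_genes(homologue_module, refined_module):
    """Split genes by membership in the homologue and refined modules.

    Assumed definitions (inferred from the reported gene counts):
      rejected   - in the homologue module but dropped from the refined module
      overlapped - in both modules
      included   - added to the refined module, absent from the homologue module
    """
    homologue = set(homologue_module)
    refined = set(refined_module)
    return {
        "rejected": homologue - refined,
        "overlapped": homologue & refined,
        "included": refined - homologue,
    }


# Hypothetical ORF identifiers, for illustration only.
homologue_module = ["orf1", "orf2", "orf3", "orf4"]
refined_module = ["orf3", "orf4", "orf5"]

groups = classify_genes(homologue_module, refined_module)
for name, members in groups.items():
    print(name, sorted(members))
```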
Identification of cis-regulatory elements
[Diagram: the Yeast Module CRE is searched for in the Candida Homologue Module (“CRE ?”) and in the Rejected, Overlapped, and Included gene groups of the Candida Refined Module]
Our prediction for CRE % and Mean CRE in each module
Results – co-regulation of SC aa Module
Average Correlation = 0.34816
Z-Score = 106.9
Results – co-regulation of modules
Each cell: Mean Correlation ± Standard Deviation [Z-Score]

Module name               S. cerevisiae               C. albicans homologue module   C. albicans refined module
Amino acid Biosynthesis   0.34816 ± 0.0029 [106.9]    0.043325 ± 0.0038 [7.5693]     0.26942 ± 0.0082 [31.038]
Cell Cycle G1             0.2921 ± 0.0028 [90.0693]   0.0475 ± 0.0047 [7.0945]       0.18 ± 0.0079 [20.926]
rRNA Processing           0.674 ± 0.0045 [142.113]    0.3216 ± 0.0051 [60.2796]      0.3097 ± 0.0023 [127.507]
Proteasome Subunits       0.4211 ± 0.0054 [71.2679]   0.1611 ± 0.0078 [18.8772]      0.2342 ± 0.0045 [48.9743]

[Color legend: correlation bins from 0.0-0.1 up to 0.9-1.0]
Results – co-regulation between SC modules
Each entry: Mean Correlation ± Standard Deviation [Z-Score]. The matrix on the slide is symmetric, so each module pair is listed once.

Module pair                                                    Mean Correlation ± SD [Z-Score]
Amino acid Biosynthesis (13.7) / Cell Cycle G1 (12.9)          -0.0216 ± 0.0017 [-35.0476]
Amino acid Biosynthesis (13.7) / rRNA Processing (12.6)        0.0042 ± 0.0025 [-13.9315]
Amino acid Biosynthesis (13.7) / Proteasome subunits (11.31)   0.0337 ± 0.0031 [-1.6166]
Cell Cycle G1 (12.9) / rRNA Processing (12.6)                  0.0779 ± 0.0024 [16.1595]
Cell Cycle G1 (12.9) / Proteasome subunits (11.31)             0.0203 ± 0.0025 [-7.2475]
rRNA Processing (12.6) / Proteasome subunits (11.31)           -0.1241 ± 0.0033 [-48.9049]

[Color legend: modules are anti-regulated / modules are co-regulated]
Results – co-regulation between CA modules
Each entry: Mean Correlation ± Standard Deviation [Z-Score]. The matrix on the slide is symmetric, so each module pair is listed once.

Module pair                                                    Mean Correlation ± SD [Z-Score]
Amino acid Biosynthesis (13.7) / Cell Cycle G1 (12.9)          -0.0078 ± 0.0051 [-4.5555]
Amino acid Biosynthesis (13.7) / rRNA Processing (12.6)        0.0622 ± 0.0032 [14.8978]
Amino acid Biosynthesis (13.7) / Proteasome subunits (11.31)   -2.02E-04 ± 0.0041 [-3.5271]
Cell Cycle G1 (12.9) / rRNA Processing (12.6)                  0.0117 ± 0.0034 [-0.9320]
Cell Cycle G1 (12.9) / Proteasome subunits (11.31)             0.0341 ± 0.0041 [4.7324]
rRNA Processing (12.6) / Proteasome subunits (11.31)           -0.0028 ± 0.0026 [-6.6787]

[Color legend: modules are anti-regulated / modules are co-regulated]
Results – cis-regulatory elements in the aa modules
CRE: TGACTC
Each entry: CRE %, Mean CRE
• Yeast Module: 46%, 1.25
• Candida Homologue Module: 34%, 1.06
• Rejected: 29%, 1.00
• Overlapped: 52%, 1.18
• Included: 54%, 1.29
• Candida Refined Module: 53%, 1.22
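Results – cis-regulatory elements chart

Module name                   Module type                    # of Genes  CRE %  Mean CRE
Amino acid Biosynthesis       S. cerevisiae                  156         46%    1.25
Amino acid Biosynthesis       C. albicans homologue module   98          34%    1.06
Amino acid Biosynthesis       Rejected genes                 77          29%    1
Amino acid Biosynthesis       Included genes                 13          54%    1.285
Amino acid Biosynthesis       Overlapped genes               21          52%    1.181
Amino acid Biosynthesis       C. albicans refined module     34          53%    1.222
rRNA Processing (12.6)        S. cerevisiae                  61          67%    1.585
rRNA Processing (12.6)        C. albicans homologue module   55          42%    1.304
rRNA Processing (12.6)        Rejected genes                 9           44%    1.25
rRNA Processing (12.6)        Included genes                 219         32%    1.225
rRNA Processing (12.6)        Overlapped genes               46          41%    1.315
rRNA Processing (12.6)        C. albicans refined module     265         34%    1.24
Proteasome subunits (10.14)   S. cerevisiae                  41          37%    1
Proteasome subunits (10.14)   C. albicans homologue module   37          19%    1.428
Proteasome subunits (10.14)   Rejected genes                 11          18%    1
Proteasome subunits (10.14)   Included genes                 38          16%    1.166
Proteasome subunits (10.14)   Overlapped genes               26          19%    1.6
Proteasome subunits (10.14)   C. albicans refined module     64          17%    1.363
Proteasome subunits (11.31)   S. cerevisiae                  45          62%    1.071
Proteasome subunits (11.31)   C. albicans homologue module   39          23%    1
Proteasome subunits (11.31)   Rejected genes                 13          23%    1
Proteasome subunits (11.31)   Included genes                 38          13%    1
Proteasome subunits (11.31)   Overlapped genes               26          23%    1
Proteasome subunits (11.31)   C. albicans refined module     64          17%    1
Cell Cycle G1 (12.9)          S. cerevisiae                  124         59%    1.41
Cell Cycle G1 (12.9)          C. albicans homologue module   71          46%    1
Cell Cycle G1 (12.9)          Rejected genes                 52          42%    1
Cell Cycle G1 (12.9)          Included genes                 14          29%    1
Cell Cycle G1 (12.9)          Overlapped genes               19          58%    1
Cell Cycle G1 (12.9)          C. albicans refined module     33          45%    1
Cell Cycle G1 (16.4)          S. cerevisiae                  158         52%    1.378
Cell Cycle G1 (16.4)          C. albicans homologue module   88          45%    1.025
Cell Cycle G1 (16.4)          Rejected genes                 67          40%    1.037
Cell Cycle G1 (16.4)          Included genes                 13          23%    1
Cell Cycle G1 (16.4)          Overlapped genes               21          62%    1
Cell Cycle G1 (16.4)          C. albicans refined module     34          47%    1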
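The slides do not define CRE % and Mean CRE. One reading consistent with the reported values (every Mean CRE is at least 1) is that CRE % is the share of genes whose upstream region contains the element at least once, and Mean CRE is the average number of occurrences among those genes. The Python sketch below assumes that reading, and also assumes that both strands are scanned; the upstream sequences are invented for illustration, with TGACTC used as the example motif.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif(sequence, motif):
    """Count occurrences of `motif` on either strand (scanning both is an assumption)."""
    sequence = sequence.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(sequence[i:i + k] in targets for i in range(len(sequence) - k + 1))


def cre_statistics(upstream, motif):
    """Return (CRE %, Mean CRE) under the assumed definitions.

    CRE %    - percentage of genes with at least one motif occurrence
    Mean CRE - mean occurrence count among the genes that have the motif
    """
    counts = [count_motif(seq, motif) for seq in upstream.values()]
    hits = [c for c in counts if c > 0]
    cre_percent = 100.0 * len(hits) / len(counts)
    mean_cre = sum(hits) / len(hits) if hits else 0.0
    return cre_percent, mean_cre


# Invented upstream regions for four hypothetical ORFs.
upstream = {
    "orf1": "AATGACTCGGTGACTCAA",  # two forward-strand copies
    "orf2": "CCGAGTCATT",          # one copy on the reverse strand
    "orf3": "ACGTACGTACGT",        # no copy
    "orf4": "TTTTTTTTTT",          # no copy
}
cre_percent, mean_cre = cre_statistics(upstream, "TGACTC")
print(f"CRE % = {cre_percent:.0f}%, Mean CRE = {mean_cre:.2f}")
```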
Conclusions
• Co-regulation:
  – Different co-regulation schemes can point out alternative gene function between SC and CA
  – Investigate the relations between “real” CA modules and refined CA modules with a similar annotation
• cis-regulatory elements:
  – CRE as a function of homology
  – CRE as a function of co-regulation
  – Low expression of SC CRE as an indicator for biological importance
  – Not all CREs are conserved between the organisms: GCN4 vs. GAL4
Future research tasks
• Experimental validation of functional assignment:
  – verify if the cis-regulatory elements found in C. albicans are biologically active
  – test the conservation of function across homologue modules of S. cerevisiae and C. albicans
Acknowledgements
• Naama Barkai – Weizmann Institute
• Judith Berman – University of Minnesota
• Sven Bergmann – Barkai’s group
• Jan Ihmels – Barkai’s group