An investigation of conserved coexpression in bacteria
Nels Thorsteinson
Research and Training Centre on Bioinformatics
Institute for Information Transmission Problems
Russian Academy of Sciences
Bioinformatics
Introduction
• Coexpression
– groups of genes with similar expression profiles
– measured by Pearson correlation
– involved in similar functions
• Conserved coexpression
– groups of genes which are coexpressed in multiple species
– involved in core biological processes
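As a minimal sketch of the coexpression measure named above, the following computes the Pearson correlation between expression profiles and keeps gene pairs above a cutoff. The gene names, expression values, and the 0.8 threshold are illustrative assumptions, not values from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(profiles, threshold=0.8):
    """Return {(gene_a, gene_b): r} for pairs with r >= threshold."""
    pairs = {}
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= threshold:
            pairs[(a, b)] = r
    return pairs


# Hypothetical expression profiles (one value per experiment).
example_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}
example_pairs = coexpressed_pairs(example_profiles)
```

Here only geneA and geneB rise and fall together, so they form the single coexpressed pair; geneC is negatively correlated with both and is left out.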
Methods
• Public data from GEO, ArrayExpress, Stanford
– Escherichia coli
– Bacillus subtilis
– Mycobacterium tuberculosis
– Vibrio cholerae
– Streptococcus pneumoniae
– Campylobacter jejuni
– Streptomyces coelicolor
• NCBI’s COG database
– Orthologue assignment
• STRING database
– Evaluation of coexpression networks
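One way the orthologue step could work is sketched below: coexpressed gene pairs from each species are translated into pairs of orthologue groups, and a group pair counts as conserved when it appears in enough species. This is an assumption about the procedure, not the method used in the study; the function names, the `min_species` cutoff, and the `COG_X`-style identifiers are hypothetical.

```python
from collections import Counter


def to_cog_pairs(gene_pairs, gene_to_cog):
    """Translate coexpressed gene pairs into unordered orthologue-group pairs."""
    cog_pairs = set()
    for gene_a, gene_b in gene_pairs:
        cog_a = gene_to_cog.get(gene_a)
        cog_b = gene_to_cog.get(gene_b)
        # Skip genes without an orthologue group and same-group pairs.
        if cog_a is None or cog_b is None or cog_a == cog_b:
            continue
        cog_pairs.add(tuple(sorted((cog_a, cog_b))))
    return cog_pairs


def conserved_pairs(per_species_cog_pairs, min_species=2):
    """Keep group pairs coexpressed in at least min_species species."""
    counts = Counter()
    for cog_pairs in per_species_cog_pairs.values():
        counts.update(cog_pairs)
    return {pair: n for pair, n in counts.items() if n >= min_species}


# Hypothetical gene identifiers and orthologue-group labels.
species_pairs = {
    "species_1": to_cog_pairs(
        [("a1", "a2"), ("a1", "a3")],
        {"a1": "COG_X", "a2": "COG_Y", "a3": "COG_Z"},
    ),
    "species_2": to_cog_pairs(
        [("b1", "b2")],
        {"b1": "COG_Y", "b2": "COG_X"},
    ),
}
conserved = conserved_pairs(species_pairs)
```

Sorting each pair makes the comparison independent of gene order, so the COG_X/COG_Y pair is recognised in both species even though it was listed in opposite order.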
Figure 1: Similarity of single genome coexpression sets (panels a and b; x-axis: evolutionary distance; y-axis: Pearson correlation between coexpression sets)
Figure 2: Correlation of coexpression sets to STRING's neighbourhood score (x-axis: number of genomes, 1 to 7, labelled E, V, C, M, B, S, S; y-axis: correlation; series: conserved coexpression, averaged single genomes)
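The comparison in Figure 2 can be illustrated with a rough sketch that correlates coexpression scores with a reference score over the pairs both sources cover. How the study actually computed this correlation is not stated in the slides; the dictionaries below are hypothetical stand-ins, and the reference values are invented placeholders rather than real STRING neighbourhood scores.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant scores carry no correlation signal
    return cov / sqrt(var_x * var_y)


def agreement_with_reference(coexpression, reference):
    """Correlate two score dictionaries over the pairs present in both."""
    shared = sorted(set(coexpression) & set(reference))
    if len(shared) < 2:
        raise ValueError("need at least two shared pairs")
    return pearson(
        [coexpression[p] for p in shared],
        [reference[p] for p in shared],
    )


# Hypothetical scores keyed by orthologue-group pair.
coexpression_scores = {
    ("x", "y"): 0.9,
    ("x", "z"): 0.2,
    ("y", "z"): 0.5,
    ("w", "x"): 0.7,  # no reference score, so it is ignored
}
reference_scores = {
    ("x", "y"): 0.8,
    ("x", "z"): 0.1,
    ("y", "z"): 0.4,
}
r = agreement_with_reference(coexpression_scores, reference_scores)
```

Restricting the calculation to shared pairs keeps pairs that only one source scores from distorting the result.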
Figure 3: Correlation of conserved coexpression sets to STRING's neighbourhood score (panels a and b; x-axis: evolutionary distance; y-axis: fold difference of correlations)
Methods
• Functional classification of the genes in the conserved coexpression network

Functional classification         Number of genes
Translation, ribosomal            47
Energy production                 9
Transcription                     7
Carbohydrate transport            6
Intracellular trafficking         4
Cell motility                     3
Posttranslational modification    2
Amino acid                        2
Replication, recombination        1

• Only one third of gene pairs consist of genes belonging to the same operon
Conclusion
• The more genomes used when calculating a conserved coexpression network, the higher the correlation to functional interactions
• The further the distance between the species for which a conserved coexpression network is calculated, the higher the correlation of the resulting network to functional interactions
• A conserved coexpression network was presented
Acknowledgements
Mikhail Gelfand
Anya Gerasimova
Alexey Kazakov
Artem Cherkasov
Research and Training Centre on Bioinformatics
Institute for Information Transmission Problems
Russian Academy of Sciences
Bioinformatics