An investigation of conserved coexpression in bacteria

Nels Thorsteinson
Research and Training Centre on Bioinformatics
Institute for Information Transmission Problems
Russian Academy of Sciences
Bioinformatics

Introduction
• Coexpression
  – groups of genes with similar expression profiles
  – measured by Pearson correlation
  – involved in similar functions
• Conserved coexpression
  – groups of genes which are coexpressed in multiple species
  – involved in core biological processes

Methods
• Public data from GEO, ArrayExpress, Stanford
  – Escherichia coli
  – Bacillus subtilis
  – Mycobacterium tuberculosis
  – Vibrio cholerae
  – Streptococcus pneumoniae
  – Campylobacter jejuni
  – Streptomyces coelicolor
• NCBI's COG database
  – Orthologue assignment
• STRING database
  – Evaluation of coexpression networks

Figure 1: Similarity of single genome coexpression sets
[Panels a and b: Pearson correlation between coexpression sets (y-axis) against evolutionary distance (x-axis).]

Figure 2: Correlation of coexpression sets to STRING's neighbourhood score
[Correlation (y-axis) against number of genomes, 1 to 7 (x-axis; genomes labelled E, V, C, M, B, S, S). Series: conserved coexpression; averaged single genomes.]

Figure 3: Correlation of conserved coexpression sets to STRING's neighbourhood score
[Panels a and b: fold difference of correlations (y-axis) against evolutionary distance (x-axis).]

Methods
• Functional classification of the genes in the conserved coexpression network

  Functional classification        Number of genes
  Translation, ribosomal           47
  Energy production                9
  Transcription                    7
  Carbohydrate transport           6
  Intracellular trafficking        4
  Cell motility                    3
  Posttranslational modification   2
  Amino acid                       2
  Replication, recombination       1

• Only one third of gene pairs consist of genes belonging to the same operon

Conclusion
• The more genomes used when calculating a conserved coexpression network, the higher its correlation to functional interactions
• The greater the evolutionary distance between the species for which a conserved coexpression network is calculated, the higher the correlation of the resulting network to functional interactions
• A conserved coexpression network is presented

Acknowledgements
Mikhail Gelfand
Anya Gerasimova
Alexey Kazakov
Artem Cherkasov
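
Illustrative sketch
The Introduction defines coexpression through the Pearson correlation of expression profiles, and conserved coexpression as coexpression observed in multiple species after orthologue assignment. The Python sketch below illustrates that idea on toy data. It is not the procedure used for the poster: the function names, the 0.8 correlation threshold, the `min_species` cutoff and the expression values are all assumptions made for illustration, and genes are assumed to be already labelled with a COG identifier.

```python
from itertools import combinations

import numpy as np


def coexpressed_pairs(expr, cogs, threshold=0.8):
    """Return COG pairs whose genes are coexpressed in one species.

    expr: array of shape (genes, conditions), one expression profile per row.
    cogs: COG identifier of each row (hypothetical orthologue labels).
    threshold: assumed Pearson cutoff; the poster does not state one.
    """
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    pairs = set()
    for i, j in combinations(range(len(cogs)), 2):
        if cogs[i] != cogs[j] and corr[i, j] >= threshold:
            pairs.add(frozenset((cogs[i], cogs[j])))
    return pairs


def conserved_pairs(per_species_pairs, min_species=2):
    """Keep COG pairs that are coexpressed in at least min_species species."""
    counts = {}
    for pairs in per_species_pairs:
        for pair in pairs:
            counts[pair] = counts.get(pair, 0) + 1
    return {pair for pair, n in counts.items() if n >= min_species}


# Toy expression data for two hypothetical species (rows follow `cogs`).
cogs = ["COG0001", "COG0002", "COG0003"]
species_a = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2]], dtype=float)
species_b = np.array([[5, 3, 1, 0], [10, 6, 2, 0], [1, 0, 1, 0]], dtype=float)

pairs_a = coexpressed_pairs(species_a, cogs)
pairs_b = coexpressed_pairs(species_b, cogs)
conserved = conserved_pairs([pairs_a, pairs_b], min_species=2)
print(sorted(sorted(pair) for pair in conserved))
```

Mapping genes to COG identifiers before comparing species is what makes the pair sets comparable across genomes; raising `min_species` corresponds to requiring conservation in more genomes.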