59º Congresso Brasileiro de Genética
Abstracts of the 59º Congresso Brasileiro de Genética • September 16 to 19, 2013
Hotel Monte Real Resort • Águas de Lindóia • SP • Brazil
www.sbg.org.br - ISBN 978-85-89109-06-2
Phylogenetic inference of bacterial
evolutionary relationship from the analysis
of genomic signature using Singular Value
Decomposition (SVD)
Castro-Oliveira, L1; Amorim, LG1; Mariano, DCB1; Santos, MA1; Soares, SC2; Miyoshi, A2; Azevedo, V1,2
1Programa de Pós-Graduação em Bioinformática - ICB, UFMG, Belo Horizonte, MG; 2Programa de Pós-Graduação em
Genética - ICB, UFMG, Belo Horizonte, MG
[email protected]
Keywords: Phylogeny, genomic signature, SVD, CMNR group, MATLAB
Evolutionary reconstructions of the tree of life were long performed mainly by identifying the points of divergence
between species solely on the basis of shared homologous features. However, this approach can be misleading because of
convergent and divergent evolution. With the advent of molecular techniques, phylogenetics was greatly improved
by the use of nucleotide differences in universal reference markers, creating the field of phylogenomics. In the post-genomic era, a second wave of changes brought new approaches to phylogenomics, which now infers evolutionary
divergence by taking advantage of whole-genome data such as gene content and gene order, orthology, and DNA string
or DNA signature. Phylogenomic inferences based on DNA signature, or genomic signature, take into account the
codon usage of the coding sequences, the G+C content and nucleotide patterns such as di-, tri- and tetranucleotide
frequencies. Codon usage is mainly affected by the strength of the codon/anticodon interaction and the availability of a
given tRNA, so that the adoption of AT- or GC-rich codons generates a homogeneous nucleotide pattern throughout the
whole genome, which differs between unrelated organisms. In this work, we applied latent semantic indexing based on
the singular value decomposition (LSI-SVD) of a matrix containing the codon usage fractions of the
coding sequences (CDS). The resulting values were used as coordinates to plot the genomes in a three-dimensional chart, and
a distance matrix was generated in MATLAB® from the absolute distances between all pairs of genomes. Finally, a
phylogenetic tree was built from the distance matrix to visualize the evolutionary relationships. The dataset
was composed of 65 genomes of Gram-positive and Gram-negative bacteria, and the resulting phylogenetic tree was
validated against the previously established evolutionary relationships of the bacteria from the CMNR group (Corynebacterium,
Mycobacterium, Nocardia and Rhodococcus). The phylogenetic tree generated by this method shows a clear relationship
between the bacteria of those genera, setting them apart from the other organisms; however, a small number of species are
placed in disagreement with these relationships. Given the high G+C content of the bacteria from the CMNR group, the dataset is being updated
to include nucleotide frequencies (G+C, di-, tri- and tetranucleotides), which is expected to separate the CMNR
group from other bacteria and raise the accuracy of the method. Finally, we intend to develop public software
implementing the methodology used here and to extend the phylogenetic analysis to other bacterial genomes from NCBI.
Financial Support: CAPES, CNPq and FAPEMIG.
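
The pipeline described in the abstract (codon usage matrix, SVD, three-dimensional coordinates, distance matrix) can be illustrated with a minimal sketch. It is written in Python with NumPy rather than MATLAB, and it is not the authors' implementation. Several points are assumptions made only for illustration: "codon usage fraction" is read as each codon's share of all codons in a genome's CDSs, "absolute distance" is read as Euclidean distance, the genome coordinates are taken as the left singular vectors scaled by the singular values, and the three toy genomes and all function names are hypothetical.

```python
from itertools import product

import numpy as np

BASES = "ACGT"
CODONS = ["".join(p) for p in product(BASES, repeat=3)]  # 64 codons, fixed order
CODON_INDEX = {codon: i for i, codon in enumerate(CODONS)}


def codon_usage(cds_list):
    """Return a 64-element vector of codon fractions over all CDSs of a genome.

    Assumption: the fraction is a codon's share of all counted codons.
    """
    counts = np.zeros(len(CODONS))
    for cds in cds_list:
        seq = cds.upper()
        for i in range(0, len(seq) - 2, 3):
            idx = CODON_INDEX.get(seq[i:i + 3])
            if idx is not None:  # skip codons containing ambiguous bases
                counts[idx] += 1
    total = counts.sum()
    return counts / total if total else counts


def embed_genomes(usage_matrix, k=3):
    """Project genomes (rows) onto the first k singular directions."""
    u, s, _ = np.linalg.svd(usage_matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def distance_matrix(coords):
    """Pairwise Euclidean distances between genome coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Hypothetical toy data: two GC-rich "genomes" and one AT-rich "genome".
genomes = {
    "gc_rich_a": ["ATGGCCGGCCGCGCGTAA", "ATGCCGGGCGCCTGA"],
    "gc_rich_b": ["ATGGCGGGCCGCGCCTAA"],
    "at_rich": ["ATGAAATTTATAAATTAA", "ATGTTAAAAATATAA"],
}
names = list(genomes)
usage = np.array([codon_usage(genomes[name]) for name in names])
coords = embed_genomes(usage, k=3)  # one 3-D point per genome
dist = distance_matrix(coords)

for name, row in zip(names, dist):
    print(name, np.round(row, 3))
```

In this toy example the two GC-rich genomes lie closer to each other than either does to the AT-rich one. The resulting distance matrix could then be passed to any distance-based tree-building method; the abstract does not state which one was used, so that step is left out here.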