Download K. lactis E. gossypii D. hansenii C. glabrata C

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Fatty acid synthesis wikipedia , lookup

Butyric acid wikipedia , lookup

Molecular ecology wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Epitranscriptome wikipedia , lookup

Metabolism wikipedia , lookup

Peptide synthesis wikipedia , lookup

Protein structure prediction wikipedia , lookup

Hepoxilin wikipedia , lookup

Point mutation wikipedia , lookup

Amino acid wikipedia , lookup

Biochemistry wikipedia , lookup

Amino acid synthesis wikipedia , lookup

Transfer RNA wikipedia , lookup

Biosynthesis wikipedia , lookup

Genetic code wikipedia , lookup

Transcript
10%
A. fumigatus
0%
20%
20%
C. albicans
10%
10%
0%
0%
20%
20%
10%
C. glabrata
0%
20%
10%
0%
20%
10%
0%
S. mikatae
10%
0%
20%
S. paradoxus
10%
E. gossypii
0%
20%
S. cerevisiae
10%
20%
D. hansenii
S. bayanus
0%
20%
K. lactis
10%
10%
0%
[0..5[
[5..10[
[10..15[
[15..20[
[20..25[
[25..30[
[30..35[
[35..40[
[40..45[
[45..50[
[50..55[
[55..60[
[60..65[
[65..70[
[70..75[
[75..80[
[80..85[
[85..90[
[90..95[
[95..100[
[100..105[
[105..110[
[110..115[
[115..120[
[120..+0[
0%
S. pombe
[0..5[
[5..10[
[10..15[
[15..20[
[20..25[
[25..30[
[30..35[
[35..40[
[40..45[
[45..50[
[50..55[
[55..60[
[60..65[
[65..70[
[70..75[
[75..80[
[80..85[
[85..90[
[90..95[
[95..100[
[100..105[
[105..110[
[110..115[
[115..120[
[120..+0[
20%
Figure S1
Three-codon contexts are species-specific. Codon-triplets were divided into 25
groups according to their usage in the ORFeomes. Observed and expected values were
plotted as bars and lines, respectively. The relative size of each bar is a measure of the
strength codon-triplets bias. The data shows that codon-triplets were not evenly
distributed in ORFeomes since the majority was represented between 5 and 30 times
(middle bars of all histograms) while few appeared more then 50 times in the same
species (right half of all histograms). The C. albicans ORFeome originated the most
divergent histogram, with very high number of frequent triplets (red bars: triplets that
appeared more than 120 times in one ORFeome).
tRNA adaptation to codon usage
1,4
DAQ
1,2
1,0
S.pombe
S.paradoxus
S.mikatae
S.cerevisiae
S.bayanus
K.lactis
E.gossypii
D.hansenii
C.glabrata
C.albicans
0,6
A.fumigatus
0,8
Figure S2
C. albicans tRNAs and codon usage are unbalanced. In order to test whether
tRNAs and codon usage were unbalanced in C. albicans, Relative Synonymous
Codon Usage (RSCU) values for all codons [24] and Relative Isoacceptor Usage
(RIU) values, which measures tRNA availability, were calculated. The later was
calculated as for RSCU but using gene copy number values, thus determining the
fraction of isoacceptors that are utilized, i.e. the gene copy number of each
tRNA, divided by the expected number assuming equilibrium between all
isoacceptors for an amino acid (see Methods). The correlation between tRNA
availability and codon usage is expressed in the figure, representing the
Decoding Adaptation Quotient (DAQ=RSCU/RIU). DAQ values closer to 1
indicate high adaptation between tRNAs and codon usage. The C. albicans
ORFeome was the least adapted of the 11 analyzed since it originated the lowest
DAQ value. This indicates that this species prefers codons that are decoded by
near-cognate tRNAs (see main text for a more detailed explanation).
GLN
ASN
SER
HIS
THR
LEU LEU
LEU
CAA CAG AAC AAT AGC AGT TCA TCC TCG TCT CAC CAT ACA ACC ACG ACT CTA CTC CTG CTT TTA TTG
A.fum
C.alb
C.gla
D.han
E.gos
K.lac
S.bay
S.cer
S.mik
S.par
S.pom
Figure S3A
Codon repeat biases are species and amino acid specific. When individual codon
repeat bias was studied using the amino acid repeats methodology (Figure 6), C.
albicans and the other species behaved differently. Both codons for one amino acid
(Asn and His) or specific codons of a group of synonymous (Gln, Ser or Thr) were
highly represented in long strings in the C. albicans ORFeome (codons highlighted in
grey at the top of the diagram). Interestingly, the serine-CTG codon in C. albicans
was also preferred in long strings, as was the case of other serine codons, but leucine
codons did not form long repetitions (amino acids highlighted in red). The preference
of codon repeats in C. albicans was mainly related with a subset of codons, namely:
Gln-CAA; Asn-AAC; Asn-AAT; Ser-AGT; Ser-TCA; Ser-TCT; His-CAC; His-CAT;
Thr-ACA; Thr-ACT; and Ser/Leu-CTG.
LYS
PRO
GLY
PHE
ARG
ILE
AAA AAG CCA CCC CCG CCT GGA GGC GGG GGT TTC TTT AGA AGG CGA CGC CGG CGT ATA ATC ATT
A.fum
C.alb
C.gla
D.han
E.gos
K.lac
S.bay
S.cer
S.mik
S.par
S.pom
Figure S3B
Codon repeat biases are species and amino acid specific. When individual codon
repeat bias was studied using the amino acid repeats methodology (Figure 6), C.
albicans and the other species behaved differently. Amino acids whose codon
repeats were repressed are shown here. These codons (highlighted in yellow at the
top of the diagram) belong to families of synonymous codons that code for amino
acids that are repressed also in amino acid strings (Phe, Ile, Leu – Figure 6) or are
codons of homopolymeric nucleotide runs (AAA, CCC, GGG, TTT), suggesting that
they are related to increased decoding error, namely frameshifting and ribosome
drop-off.
A
CTN in absent triplets
0,035
0,03
0,025
0,02
0,015
0,01
0,005
0
B
CTN in unique absent triplets
0,07
0,06
0,05
0,04
0,03
0,02
CTG
CTC
CTA
S.pombe
S.paradoxus
S.mikatae
S.cerevisiae
S.bayanus
K.lactis
E.gosssypii
D.hansenii
C.glabrata
C.albicans
0
A.fumigatus
0,01
CTT
Figure S4
CTG codons in absent triplets of C. albicans and D. hansenii. Since CTG codons are
decoded in two different ways among the group of 11 fungal species studied, we
investigated whether its unusual decoding as serine altered its presence in triplets. For
that, CTG usage in codon-triplets that vanished from ORFeomes (panel A) was
calculated and compared with the result obtained for other CTN codons. Remarkably,
the non-standard decoders C. albicans and D. hansenii were the only species that
showed strong rejection of CTGs among the other CTN codons. The importance of
CTG was maximal in triplets that vanished from D. hansenii ORFeome only (panel B).