Research Report and Discussion
Name: Yi-Ming Chou (周逸民)
Student ID: 891604
Research location: Wen-Hsiung Li Laboratory (李文雄實驗室), University of Chicago
Codon Usage Bias Differences and Expression
Amount Differences in Yeast Duplicated Genes
Yi-Ming Chou*†, I-Fan Tsai*†, Zhenglong Gu§ and Wen-Hsiung Li§
*Department of Life Science, National Tsing-Hua University, Hsinchu
§Department of Ecology and Evolution, University of Chicago
†These authors contributed equally to this work.
Abstract
We examined the relationship between codon usage bias, expression amount and Ks in
the yeast 2-gene families, multiple-gene families and big gene families. We found
that in gene pairs of the 2-gene and multiple-gene families, if one gene of the pair
has a markedly large codon usage bias, the codon usage bias difference between the
two genes tends to be large. This suggests that when one copy of a duplicated
important gene is mutated, it is relaxed from codon usage constraints. We also found
that in the 2-gene and multiple-gene families, when the gene pairs are grouped by
their Ks value (Ks≤0.75, 0.75<Ks≤1.5, Ks>1.5), the average and median codon usage
bias difference of each group tends to increase with Ks. In the big gene families,
2-gene families and all-gene families, codon usage bias generally increased with
expression amount, as expected. Finally, for the 2-gene families we looked at the
differences in codon usage bias and expression amount instead of the raw values, and
found that expression amount differences generally increase with codon usage bias
differences. This relationship is not strict, however, since a large expression
difference does not always come with a change in codon usage bias.
Introduction
Yeast contains many duplicated genes, and the copies that derive from one ancestral
gene form a gene family. It is known that a higher expression rate generally leads to
a higher codon usage bias, and it has been reported that highly expressed genes
evolve slowly. We therefore wanted to see whether other relationships exist among
yeast genomic data, namely codon usage bias difference, expression amount difference
and sequence difference within yeast gene families. We hypothesize that when a highly
constrained gene is duplicated, the new copy may not need to maintain such a strong
codon usage bias, since the other copy is still functioning. Its constraints might
therefore be relaxed by mutation, and the expression amount of the relaxed copy
might drop as well.
Materials & Methods
Data sets
All data sets come from the yeast S. cerevisiae and were downloaded from Internet
databases. The gene expression data were measured with an Affymetrix microarray,
and the sequence data come from the NCBI GenBank database. The sequence data were
further processed into Ks and Ka values. The codon usage bias indices used here are
CAI (codon adaptation index) and ENC (effective number of codons). These values were
also calculated from statistics of the yeast genome available on the Internet. The
gene families studied were the yeast 2-gene families, which have only two genes in a
family; the all-gene families, with 50% or greater similarity between the gene pairs;
and the big gene families, which have more than 5 genes in a family.
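For readers unfamiliar with CAI, the index is conventionally computed as the geometric mean of the relative adaptiveness values of the codons in a gene. The sketch below is only an illustration of that definition, not the procedure used to obtain the values in this report (those were taken from yeast genome statistics on the Internet). The function names and the toy reference counts are hypothetical, and codons with zero weight are simply skipped, which is a simplification.

```python
import math

def relative_adaptiveness(ref_counts, synonymous_groups):
    """Return w[codon] = count / (largest count among its synonymous codons)."""
    weights = {}
    for codons in synonymous_groups:
        top = max(ref_counts.get(c, 0) for c in codons)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in codons:
            weights[c] = ref_counts.get(c, 0) / top
    return weights

def cai(sequence, weights):
    """Geometric mean of the weights of the codons in `sequence`.

    Simplification (assumption): codons with no weight or a zero weight
    are skipped rather than given a small pseudo-value.
    """
    seq = sequence.upper()
    logs = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        w = weights.get(seq[i:i + 3], 0.0)
        if w > 0:
            logs.append(math.log(w))
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))

# Toy reference counts for two amino acids only (hypothetical numbers).
groups = [("AAA", "AAG"), ("GAA", "GAG")]  # Lys, Glu
ref = {"AAA": 20, "AAG": 80, "GAA": 90, "GAG": 10}
w = relative_adaptiveness(ref, groups)
print(cai("AAGGAA", w))  # both codons are the preferred ones
print(cai("AAAGAA", w))  # one non-preferred Lys codon lowers the index
```

A real calculation would use all synonymous codon groups and a reference set of highly expressed yeast genes.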
Data normalization
The expression difference values range from very low (~10^0) to very high (~10^4).
A moderate difference may therefore be a significant change if the original
expression rate is very low or zero; on the other hand, it may be meaningless,
since for highly expressed genes a moderate expression difference could be only
measurement error or natural fluctuation. A data normalization step is therefore
needed. Here we use a very simple one: we divide the raw difference by the average
expression of the gene pair to generate a “normalized” difference, which has a
maximum of 2 and a minimum of 0 (if all values are taken as absolute values).
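The normalization described above can be sketched in a few lines. The function name is hypothetical, and returning 0 for a pair in which both genes have zero expression is an assumption, since the text does not say how that case was handled.

```python
def normalized_difference(expr_a, expr_b):
    """Absolute expression difference divided by the mean of the pair.

    For non-negative inputs the result lies between 0 (equal expression)
    and 2 (one gene of the pair not expressed at all).
    """
    if expr_a < 0 or expr_b < 0:
        raise ValueError("expression values must be non-negative")
    pair_mean = (expr_a + expr_b) / 2.0
    if pair_mean == 0:
        return 0.0  # assumption: two silent genes count as "no difference"
    return abs(expr_a - expr_b) / pair_mean

# The same raw difference of 100 is large for weakly expressed genes
# and negligible for highly expressed ones.
print(normalized_difference(100, 0))
print(normalized_difference(10000, 9900))
```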
Ks values
Ks values are often analyzed in three groups: Ks of at most 0.75, Ks between 0.75
and 1.5, and Ks larger than 1.5. In all the tests here that involve Ks values, the
gene pairs are categorized into these three groups, and each group is represented
by both its average and its median.
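A minimal sketch of this grouping, using the boundaries given in the abstract (Ks≤0.75, 0.75<Ks≤1.5, Ks>1.5). The function names, group labels and sample values are hypothetical and only illustrate the bookkeeping.

```python
from statistics import mean, median

def ks_group(ks):
    """Assign a Ks value to one of the three groups used in this report."""
    if ks <= 0.75:
        return "Ks<=0.75"
    if ks <= 1.5:
        return "0.75<Ks<=1.5"
    return "Ks>1.5"

def summarize_by_ks(pairs):
    """pairs: iterable of (ks, bias_difference) tuples, one per gene pair.

    Returns {group label: (mean difference, median difference)}.
    """
    groups = {}
    for ks, diff in pairs:
        groups.setdefault(ks_group(ks), []).append(diff)
    return {label: (mean(v), median(v)) for label, v in groups.items()}

# Hypothetical (Ks, codon usage bias difference) values.
sample = [(0.2, 0.1), (0.7, 0.3), (1.0, 0.4), (1.8, 0.2), (2.0, 0.6)]
for label, (avg, med) in summarize_by_ks(sample).items():
    print(label, avg, med)
```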
Graph
We simply plot the data as x-y scatter plots and histograms to see whether any
relationship exists among these values.
Significance tests and other statistical data
The p values (significance tests) and all other statistics were obtained with
functions of the Matlab built-in statistics toolbox.
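The report does not state which Matlab functions were used. As a stand-in, the sketch below shows one generic way to attach a p value to a correlation: a Pearson coefficient with a two-sided permutation test, written in plain Python. This is a substituted technique for illustration, not the authors' procedure; the function names, the number of permutations and the sample data are all assumptions.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p value for the Pearson correlation.

    y is shuffled n_perm times; the p value is the share of shuffles whose
    |r| is at least as large as the observed |r| (with a +1 correction).
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: expression differences vs. bias differences.
expr_diff = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.6, 1.9]
bias_diff = [0.02, 0.05, 0.04, 0.10, 0.12, 0.11, 0.18, 0.20]
print(pearson_r(expr_diff, bias_diff))
print(permutation_p_value(expr_diff, bias_diff))
```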
Results
Frequency distribution
The frequency distributions of codon usage bias difference and expression
difference for the 2-gene family and all-gene family show that the frequency
generally decreases as the difference increases. This is unsurprising, since small
deviations are more likely to occur than large ones.
Fig.1 The frequency distribution of differences in codon usage bias and expression data of the yeast 2
gene family (left) and all gene family (right). Generally the occurrence frequency decreases as
the difference becomes bigger.
Relationship between codon usage bias and codon usage bias difference
The graph shows that the codon usage bias difference is larger when one gene of
the pair has a larger codon usage bias. This can be observed in both the yeast
2-gene family and the all-gene family. We think the reason is that once a highly
biased gene (usually an important, highly expressed gene) is duplicated, and the
duplication is neither advantageous nor deleterious, one copy of the pair is very
likely to accumulate further mutations, since mutations in one copy would not bring
a disadvantage to the organism. We therefore suppose that if one gene of a pair has
a rather high expression amount, the pair tends to have a larger difference in
codon usage bias.
Fig.2 The positive correlation between codon usage bias and codon usage bias difference in the yeast
2-gene family (left) and all-gene family (right). The codon usage bias value plotted is the larger
one of each gene pair.
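The two quantities plotted in Fig.2 can be derived per gene pair as follows. This is a small sketch with a hypothetical function name and made-up bias values, shown only to make the axes explicit.

```python
def pair_point(bias_a, bias_b):
    """Return (larger bias of the pair, absolute bias difference)."""
    return max(bias_a, bias_b), abs(bias_a - bias_b)

# Hypothetical codon usage bias values for three gene pairs.
pairs = [(0.80, 0.20), (0.30, 0.25), (0.55, 0.50)]
points = [pair_point(a, b) for a, b in pairs]
for larger, diff in points:
    print(larger, diff)
```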
Relationship between codon usage bias difference and gene distance
In both the 2-gene family and the all-gene family, the codon usage bias difference
generally increases with the Ks value (the number of synonymous substitutions per
synonymous site), which measures the distance between the two genes of a pair. This
is also unsurprising: the more two genes differ in sequence, the more their codon
usage bias may have changed.
Fig.3 The positive relationship of grouped Ks to grouped codon usage bias difference for yeast 2 gene
family (left) and all gene family (right).
Relationship between expression amount and codon usage bias
We examined the relationship between expression data and codon usage bias data in
the yeast big gene families and 2-gene families. In agreement with other
observations, the bias increased with expression amount. We also tested expression
amount differences against codon usage bias differences, which likewise show a
positive correlation: within a gene pair, codon usage bias usually varies in the
same direction as expression amount. In the normalized graph of codon usage bias
difference against expression difference, highly expressed genes tend to show
larger codon usage bias changes for a given change in expression amount than genes
with lower expression.
Fig.4 The positive correlation between codon usage bias and expression amount of yeast big gene
family (upper), 2 gene family (lower left) and all gene family (lower right)
Fig.5 The positive correlation between codon usage bias differences and expression differences in the
yeast 2-gene family. Non-normalized data are on the left and normalized data on the right.
Fig.6 The positive correlation between codon usage bias differences and expression differences in the
yeast all-gene family. Non-normalized data are on the left and normalized data on the right.
Discussion
Our results are generally consistent with our assumptions. The higher the codon
usage bias of one gene in a pair, the larger the bias difference of the pair tends
to be. A positive correlation also exists between expression difference and codon
usage bias difference. The prediction that “highly expressed genes have more
constraints” is supported once more. We also found that the codon usage bias
difference is positively correlated with Ks, which means that closely related genes
have similar codon usage biases.
Acknowledgement
We especially thank Zhenglong Gu, who contributed most to these tests. We would
also like to thank Wen-Hsiung Li, who was very caring and supportive of our work
during our visit.