Toward the Identification of a Gene Expression Framework in Different Types of Tissues and Organisms

Saulo Augusto de Paula Pinto 1,2 {[email protected]}
J. Miguel Ortega 1 {[email protected]}

1 Laboratório de Biodados, Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas – UFMG
2 Instituto de Informática, PUC MINAS BARREIRO

1 Introduction

To identify a possible common framework of gene expression across samples of gene expression data, we analyzed 418 samples from 13 NCBI GEO series generated on the Affymetrix GeneChip platform, together with 31 SAGE Genie libraries. Results are shown here for two data series: one of 36 normal human tissue samples and one of 11 A. thaliana tissues (GEO accessions: GSE2361, GSE607). Expression sorting is conserved to such a degree that the weak framework rate between a pair of samples can even be used to cluster a set of gene expression samples.

2 Methodology

An algorithm to find a weak framework, that is, a set of gene pairs in which the first element of each pair is more expressed than the second in every analyzed sample:

INPUT: a set of samples, each sorted by gene expression so that the most expressed gene (sequence) is in position 0 and the least expressed is in position N-1, where N is the number of genes (sequences) in each sample.
OUTPUT: a list of pairs of genes (sequences) in which the first member keeps its expression order relative to the second member in every sample.

1) Choose a reference sample, used only to build pairs;
2) For each pair of genes [GC, GL] in the reference sample where GC is more expressed than GL do
   2.1) If GC is more expressed than GL in every sample then include the pair [GC, GL] in the weak framework; else discard the pair.
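The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names (`rank_maps`, `weak_framework`, `conservation_rate`) and the toy gene identifiers are hypothetical, and the sketch assumes every sample contains the same set of genes. The pairwise rate is computed as the share of all possible pairs that keep their order, which is an assumption about how the percentages on the poster were obtained.

```python
from itertools import combinations


def rank_maps(samples):
    """Map each gene to its position (0 = most expressed) in every sample."""
    return [{gene: pos for pos, gene in enumerate(sample)} for sample in samples]


def weak_framework(samples, reference=0):
    """Return the pairs (GC, GL) whose order holds in every sample.

    Pairs are built from the reference sample only (step 1); a pair is kept
    when GC is ranked above GL in all samples (step 2.1).
    """
    ranks = rank_maps(samples)
    kept = []
    # combinations() preserves list order, so gc is always the more
    # expressed gene of the pair in the reference sample.
    for gc, gl in combinations(samples[reference], 2):
        if all(r[gc] < r[gl] for r in ranks):
            kept.append((gc, gl))
    return kept


def conservation_rate(sample_a, sample_b):
    """Percentage of all possible pairs that keep their order in both samples."""
    n = len(sample_a)
    total = n * (n - 1) // 2
    if total == 0:
        return 0.0
    return 100.0 * len(weak_framework([sample_a, sample_b])) / total


# Toy data (made up for illustration): three samples, four genes each,
# listed from most to least expressed.
samples = [
    ["g1", "g2", "g3", "g4"],
    ["g1", "g3", "g2", "g4"],
    ["g2", "g1", "g3", "g4"],
]

framework = weak_framework(samples)
rate = conservation_rate(samples[0], samples[1])
```

The sketch checks every pair of the reference sample against every sample, so its cost grows with the square of the number of genes; for the sample sizes reported here a vectorized or pruned approach would likely be needed.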

3 Results

Samples from all organisms analyzed show an exponential-like decay of expression values, regardless of the technology, the number of distinct sequences in the samples, or the kind of organism or tissue. This finding suggests that the ordering of gene expression, and not only the set of genes expressed, plays a determinant role in the character of a tissue or organism. Physiologically closely related tissue pairs such as [amygdala, hippocampus] and [prostate, bladder], and sample replicates such as [leaf_gh1, leaf_gh2], conserve as much as 94.7%, 89.7%, and 94.12% of their sequence pairs, respectively.

[Table: H. sapiens tissue pairs and the percentage of sequence pairs conserved. Highest values listed: 94.7, 91.2, 91.1, 91.0, 90.7, 90.4, 89.8, 89.7, 89.6, 89.5, 89.1, 88.5, 88.2, 88.0, 87.9, 87.7, 87.7, 87.5, 87.3, 87.2. Lowest values listed: 21.5, 21.3, 20.8, 20.4, 20.1, 19.6, 19.3, 19.1, 18.9, 18.8, 18.6, 18.5, 18.1, 17.7, 17.4, 16.4, 15.0, 14.1, 12.8, 12.1. Tissues appearing in the table: Amygdala, Hippocampus, Caudate Nucleus, Thalamus, Cerebellum, Corpus, Spinal Cord, Brain, Fetal Brain, S Muscle, Heart, Small Intestine, Colon, Liver, Fetal Liver, Pancreas, Salivary Gland, Bone Marrow, Prostate, Bladder, Ovary, Breast, Lung, Trachea.]

On the other hand, H. sapiens pairs composed of dissimilar tissues, such as those involving bone marrow, liver and the central nervous system tissues, keep expression sorting poorly (< 22%).

4 Discussion

Considering all 36 H. sapiens tissues together, 28.5% of the 3,064,841 possible pairs were conserved. For A. thaliana, stem and flower conserved the least (< 47%), and the 11 samples together conserved 55.45% of pairs (22,892,007 of 41,286,376), as expected for a less complex organism with less tissue diversity.

5 Conclusions

The results point to the existence of a framework of genes that keep their expression sorting across a wide variety of tissues.

[Figure: Part of a weak framework found for 36 normal human tissue samples, considering only the 20 most expressed sequences (MESs) from each sample. A directed edge points from the more expressed gene (source) to the less expressed gene (target).]

Support: FAPEMIG