Genome-wide Regulatory Complexity in Yeast Promoters
Zhu YANG
15th Mar, 2006

Reference
• C. S. Chin, J. H. Chuang, & H. Li. 2005. Genome-wide regulatory complexity in yeast promoters: Separation of functionally conserved and neutral sequence. Genome Research. 15(2):205–13.

Outline
• Purposes
• Methods
• Results
• Discussion

Purposes
• To separate functionally conserved sequence from neutral sequence.
• To estimate how much promoter sequence is functional.

Methods
• Determine the local neutral mutation rates by measuring the degree of sequence conservation across the genome.
• Determine which parts of yeast promoters evolve neutrally.
• Estimate the total amount of promoter sequence under selection.
• Gauge roughly how much regulation acts on each gene by analyzing the length of sequence in high conservation regions of each promoter.

Algorithms
• Calculation of substitution rates from fourfold sites
• Mutational uniformity
• Separation of high and low conserved regions with a hidden Markov model
• Genome-wide percentage of promoter sites under selection
• z-score in Gene Ontology analysis

Neutral mutation rates are uniform genome-wide
• Mutation rates are uncorrelated along the yeast genome.
• In contrast, mouse-human conservation rates are significantly correlated along the human genome at separations up to several megabases.

Figure: Autocorrelation in conservation rates

Neutral mutation rates are uniform genome-wide (Cont’d)
• A subset of genes was biased toward high conservation by some secondary effect.
• 92% of the genes mutate neutrally at fourfold degenerate sites. The high conservation values for the remaining 8% of the genes are explainable by codon usage selection.
• The correlation of the normalized substitution rate with the codon adaptation index (CAI) was 0.67.
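The two measurements behind the uniformity claim, a substitution rate at fourfold degenerate sites and the correlation of rates between neighbouring genes, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pairwise (two-species) alignment input, and the rule that a site counts only when both codons share the same fourfold-degenerate prefix are all assumptions made for illustration.

```python
# Codon prefixes whose third position is fourfold degenerate in the
# standard genetic code (Leu CT, Val GT, Ser TC, Pro CC, Thr AC, Ala GC,
# Arg CG, Gly GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_substitution_rate(seq_a, seq_b):
    """Fraction of fourfold degenerate sites that differ between two
    aligned, in-frame coding sequences (hypothetical helper)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    sites = 0
    diffs = 0
    for i in range(0, len(seq_a) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if "-" in codon_a or "-" in codon_b:
            continue  # skip codons containing alignment gaps
        # Assumption: count the site only if both codons share the same
        # fourfold-degenerate prefix, so the third base is silent in both.
        if codon_a[:2] == codon_b[:2] and codon_a[:2] in FOURFOLD_PREFIXES:
            sites += 1
            if codon_a[2] != codon_b[2]:
                diffs += 1
    return diffs / sites if sites else float("nan")


def lag_autocorrelation(rates, lag):
    """Pearson correlation between per-gene rates (in chromosomal order)
    and the same rates shifted by `lag` genes."""
    if lag < 1 or lag > len(rates) - 2:
        raise ValueError("lag must leave at least two pairs of values")
    x = rates[:-lag]
    y = rates[lag:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Toy alignment: codons CTA/CTG, GGT/GGT, ATG/ATG, GCC/GCA.
rate = fourfold_substitution_rate("CTAGGTATGGCC", "CTGGGTATGGCA")
```

In the toy alignment, three codon pairs qualify as fourfold sites and two of them differ, so `rate` is 2/3. A genome with uniform neutral rates would be expected to give `lag_autocorrelation` values near zero at every lag, whereas regional variation would show up as positive values at small lags.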
Figure: Distribution of normalized conservation rates

Neutral conservation rates in promoters
• Functional elements should be separated from the neutral background, since conservation can be due to shared ancestry.
• A hidden Markov model (HMM) breaks the promoters into high conservation regions (HCRs) and low conservation regions (LCRs).
• The HCRs and LCRs give a good approximation to functional and neutral regions.

Figure: Separation of conserved blocks from the background

Neutral conservation rates in promoters (Cont’d)
• The HCRs contained an excess of functional elements.
• While the HCRs covered only 34.3% of the promoter regions, they contained 71.6% of the motifs in the promoters.
• The neutral rates in the LCRs were consistent with the neutral rates obtained from the fourfold site analysis.

Figure: Distribution of the conservation rate for promoter sequences

Genome-wide amount of promoter sequence under selection
• The Frequency of Conserved Blocks (FCB) method was more robust than the HMM for inferring the amount of selectively conserved sequence.
• Count the numbers of blocks of n consecutive conserved bases in the promoter sequences, then compare them to neutral expectations.

Requirements
• The frequency distribution of conserved blocks in neutral sequence is known.
• This neutral component can be extracted from the real frequency distribution.

Figure: Distribution of the counts of blocks of n consecutive conserved bases

Figure: Estimate of the percentage of sites evolving neutrally among various species

Gene-specific selection in promoters
• The HCRs provide a rough characterization of the transcriptional regulation in each promoter.
• Most genes have 15%–25% of their promoter sequence in HCRs.
• Protein sequence conservation was correlated on a gene-by-gene basis with HCR length.

The Gene Ontology terms
• The terms with the largest HCR length biases were those involved in the energy generation and steroid synthesis pathways, suggesting that these types of genes have unusually complex regulation.
• The genes with the strongest protein sequence conservation were not always those with the longest HCR lengths; examples include catalysis, basic biosynthesis, and ribosomal genes.

Figure: Nonsynonymous conservation versus lengths of HCR

Discussion
• The neutral conservation rate is uniform across yeast genomes. One nonselective possibility is that yeast chromosomes are too short to have heterogeneity in their mutational environment.
• A significant fraction of promoter sequence was under purifying selection.
• A typical functional block may contain one or two protein-binding sites, giving an upper bound of ∼10 transcription-factor-binding sites in a promoter.
• Genes involved in energy generation and steroid synthesis may be subject to complex transcriptional regulation.
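
The comparison behind the Frequency of Conserved Blocks idea, counting blocks of n consecutive conserved bases and setting them against a neutral expectation, can be sketched as follows. This is an illustration under stated assumptions, not the method as published: a "block" is taken to be a maximal run of conserved sites, neutral sites are assumed to be conserved independently with a single probability `p`, edge effects are ignored, and both function names are hypothetical.

```python
from collections import Counter


def conserved_block_counts(conserved):
    """Count maximal runs of conserved sites by run length.

    `conserved` is an iterable of truthy/falsy values, one per aligned
    promoter site (truthy = identical across the compared species).
    """
    counts = Counter()
    run = 0
    for is_conserved in conserved:
        if is_conserved:
            run += 1
        else:
            if run:
                counts[run] += 1
            run = 0
    if run:  # close a run that reaches the end of the sequence
        counts[run] += 1
    return counts


def expected_neutral_blocks(n, length, p):
    """Approximate expected number of maximal conserved blocks of exactly
    n sites in `length` neutral sites, assuming each site is conserved
    independently with probability p (edge effects ignored)."""
    return length * (1 - p) ** 2 * p ** n


observed = conserved_block_counts([1, 1, 0, 1, 1, 1, 0])
expected = expected_neutral_blocks(2, 100, 0.5)
```

Under this simple model, an excess of observed long blocks over `expected_neutral_blocks` is what would point to sequence under selection; short blocks, which neutral sequence produces in abundance, are the natural place to estimate `p`.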