Genome evolution: a sequence-centric approach
Lecture 13: epistasis: RNA, enhancers, networks

[Course overview diagram. Labels in source order: (Probability, Calculus/Matrix theory, some graph theory, some statistics); Simple Tree Models; HMMs and variants; PhyloHMM, DBN; Context-aware MM; Factor Graphs; DP; Sampling; Variational apx.; LBP; EM; Generalized EM (optimize free energy); Probabilistic models; Genome structure; Inference; Mutations; Parameter estimation; Population; Tree of life; Genome Size; Elements of genome structure; Elements of genomic information; Models for populations; Drift; Selection and fixation; Draft; Protein coding genes; Inferring Selection; TFBSs; Epistasis. Today's refs: papers cited.]

Epistasis
Assume we have two loci, each bearing two alleles (A/a and B/b).
Assume that the basal state of the population is homogeneous, with alleles ab.
f(A): the relative fitness of A, defined using the growth rate of the genome Ab.
f(B): the relative fitness of B, defined using the growth rate of the genome aB.
What is the fitness of AB?
If the two loci are unrelated, we can expect it to be f(Ab)*f(aB), that is, f(A)*f(B).
When f(A) = 1+s, f(B) = 1+s', and s, s' are small, f(A)*f(B) ≈ 1+s+s'.
Epistasis is defined as the deviation from such linearity/independence:
f(AB) > f(Ab)*f(aB): synergistic loci
f(AB) < f(Ab)*f(aB): antagonistic loci
[Diagram labels: A, B, AB, with "+" and "-" cases.]

How widespread is epistasis? Is it positive or negative in general? And how does it affect evolution in general?

Testing epistasis in viruses: directed mutagenesis
47 genotypes of vesicular stomatitis virus carrying pairs of nucleotide substitution mutations (filled circles)
15 genotypes carrying pairs of beneficial mutations (empty circles)
Sanjuan, PNAS 2004

Testing epistasis in viruses: HIV-1
Isolated drug-resistant strains
Comparing growth in drug-free media (extracting the viral sequence and reintegrating it in a virus model)
Sequencing strains and comparing them to some standard
Plotting fitness relative to the number of mutations
For each pair of loci, compute the average fitness for ab, aB, Ab and AB, then estimate epistasis.
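The pairwise estimate described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the cited papers: genotypes are encoded as tuples of 0/1 (0 for the basal allele, 1 for the mutant), fitness is normalized by the mean fitness of the ab class, and epistasis is taken as f(AB) - f(Ab)*f(aB), following the definition on the slide. The function names `epistasis`, `pair_epistasis` and `classify` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def epistasis(f_Ab, f_aB, f_AB):
    """Deviation from independence; all values are fitness relative to ab."""
    return f_AB - f_Ab * f_aB


def pair_epistasis(samples, i, j):
    """Estimate epistasis between loci i and j.

    samples: list of (genotype, fitness) pairs, where genotype is a tuple
    of 0/1 alleles (an assumed encoding). Returns None if any of the four
    allele combinations is not observed.
    """
    classes = [(0, 0), (1, 0), (0, 1), (1, 1)]
    by_class = defaultdict(list)
    for genotype, fitness in samples:
        by_class[(genotype[i], genotype[j])].append(fitness)
    if any(not by_class[c] for c in classes):
        return None
    # Average fitness per class, then normalize by the basal class ab.
    w = {c: mean(by_class[c]) for c in classes}
    base = w[(0, 0)]
    return epistasis(w[(1, 0)] / base, w[(0, 1)] / base, w[(1, 1)] / base)


def classify(e, tol=1e-9):
    """Label the sign of epistasis using the slide's terminology."""
    if e > tol:
        return "synergistic"
    if e < -tol:
        return "antagonistic"
    return "independent"
```

For example, with relative fitness 1.1 for Ab, 1.2 for aB and 1.5 for AB, the independent expectation is 1.32, so the estimate is about 0.18 and the pair is labeled synergistic.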
To assess significance, recompute the same after shuffling the sequences.
The mean is significantly higher than the randomized means.
The effect is stronger when the analysis is restricted to 59 loci with a significant effect on fitness.
The results suggest that epistasis tends to be positive (at least in these viruses and in this condition).
Bonhoeffer et al., Science 2004

Functional sources for epistasis:
• Protein structure (interacting residues)
• Different positions in the same TFBS
• Two interacting TFBSs
• TF DNA binding domain and its target site
• Two competing enzymes
• Two competing TFBSs
• RNA paired bases
• Groups of TFBSs at co-regulated promoters

RNA folds and the function of RNA molecules
• RNA molecules perform a wide variety of functions in the cell.
• They differ in length and class, from very short miRNAs to much longer rRNAs or other structural RNAs.
• They are all strongly affected by base pairing, which makes their structure mostly planar (with many exceptions!) and relatively easy to model.

Simple RNA folding energy: the number of matching base pairs, or a sum over base-pairing weights.
More complex energy (following Zuker): each feature has empirically determined parameters:
• stem stacking energy (adding a pair to a stem)
• bulge loop length
• interior loop length
• hairpin loop length
• dangling nucleotides
and so on.
Pseudoknots (breaking of the base-pairing hierarchy) are typically forbidden.

Predicting fold structure
Due to the hierarchical nature of the structure (assuming no pseudoknots), the situation can be analyzed efficiently using dynamic programming.
We usually cannot be certain that there is a single, optimal fold, especially if we are not at all sure we are looking at a functional RNA.
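The simplest energy model above (count the matching base pairs, no pseudoknots) lends itself to a short dynamic-programming sketch. The version below is a Nussinov-style recursion rather than the Zuker energy model. The allowed pairs (Watson-Crick plus G-U wobble) and the minimum hairpin loop length `min_loop` are assumptions of this illustration, and the function name `max_base_pairs` is hypothetical.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    dp[i][j] holds the best pair count for the subsequence seq[i..j].
    A pair (k, j) is allowed only if at least min_loop unpaired bases
    separate k and j (assumed hairpin constraint).
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
             ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    # Fill by increasing subsequence span so sub-problems are ready.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            # case 2: j pairs with some k, splitting the interval in two
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

Because every pair (k, j) splits the interval into two independent sub-intervals, the table can be filled in O(n^3) time. Replacing the `+ 1` with a pair-specific weight gives the "sum over base-pairing weights" variant.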
It would be better to have posterior probabilities for base pairing given the data and an energy model. This can be achieved using a generalization of the HMM called a stochastic context-free grammar (SCFG).

EvoFold: considering base-pairing as part of the evolutionary model
Once base pairing is predicted, the evolutionary model works with pairs instead of single nucleotides.
By neglecting genomic context effects, this gives rise to a simple tree model that is easy to solve.
If we want to simultaneously consider many possible base pairings, things become more complicated.
An exact algorithm that finds the best alignment given the fold structure is very expensive (n^5), even when using base-pairing scores and two sequences.
Pedersen, PLoS CB 2006

EvoFold: considering base-pairing as part of the evolutionary model
Whenever we discover compensatory mutations, the prediction of a functional RNA becomes much stronger.

Evolution of a regulatory module: eve stripe 2 in D. melanogaster and D. pseudoobscura
[Figure labels: mel, pseudo]
While the two enhancers drive a conserved expression pattern, we cannot mix and match them between species!
Evolution therefore continuously compensates for changes in one part with changes in the other.
Ludwig, Kreitman 2000

Evolution of a regulatory module
Eve staining in 4 species
Orthologous stripe 2 enhancer reporters in a melanogaster embryo
[Figure panels: D. melanogaster, D. yakuba, D. erecta, D. pseudoobscura]
The D. erecta S2E forms a much weaker stripe in D. melanogaster.
Ludwig,..,Kreitman 2005

Sequence conservation and divergence in eve stripe 2 and around it
[Figure labels in source order: D. melanogaster; Enhancer functional in mel.; D. yakuba; Enhancer not functional in mel.; D. erecta; Enhancer functional in mel.; D. pseudoobscura]

Coregulation: epistasis of transcriptional modules
• Transcriptional modules are crucial for the organization and function of biological systems.
• Gene co-regulation gives rise to major epistatic relations among regulatory loci.
• Epistasis reduces evolvability.
[Diagram labels: Co-regulation is advantageous; Disruption of regulation is deleterious; Rugged evolutionary landscape; Regulation Scheme 1; Regulation Scheme 2]

[Figure: transcriptional modules compared between S. cerevisiae and S. pombe. Module labels: S phase; Ribosomal Proteins; Ribosome biogenesis; Amino acid met. Annotations in source order: 45 genes, P<10^-56; 7 genes, P<10^-9; 114 genes, P<10^-151; 32 genes, P<10^-29.]

Cis-elements underlying conserved TMs
Putative Orthologous Module (POM)
[Figure species: S. cerevisiae, S. bayanus, S. castellii, C. glabrata, S. kluyveri, K. waltii, K. lactis, A. gossypii, D. hansenii, C. albicans, Y. lipolytica, N. crassa, A. nidulans, S. pombe]
Phylogenetic cis-profiling with 17 yeast species
[Module labels: S phase; Respiration; Amino acid metabolism]

Conserved cis-elements
• Conserved FM are sometimes regulated by remarkably conserved cis elements.
• Conserved cis elements are bound by conserved TFs.
[Figure labels: MCB, HAP2345, GCN4. Species: S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. bayanus, S. castellii, C. glabrata, S. kluyveri, K. waltii, K. lactis, A. gossypii, D. hansenii, C. albicans, Y. lipolytica, N. crassa, A. nidulans, S. pombe]
Tanay et al., PNAS 2005

Ribosomal Protein Module: Evolutionary change via redundancy
[Figure: per-species counts for the RAP1, Homol-D and IFHL elements; the assignment of counts to columns did not survive extraction. Annotations: Redundant mechanism; Rap1 emergence; Homol-D loss; Homol-D based. Species, with the number given in parentheses in the source: S. cerevisiae (133), S. paradoxus (75), S. mikatae (88), S. kudriavzevii (94), S. bayanus (118), S. castellii (89), C. glabrata (69), S. kluyveri (61), K. waltii (54), K. lactis (75), A. gossypii (73), D. hansenii (73), Y. lipolytica (70), C. albicans (41), N. crassa (67), A. nidulans (72), S. pombe (74)]

Rap1 evolution in trans
A new TA domain co-emerged with the Rap1 role in RP regulation.
[Figure species: S. cerevisiae, S. castellii, K. waltii, A. gossypii, C. albicans, N. crassa, A. nidulans, S. pombe, H. sapiens. Domain labels: BRCT, Myb, Silencing, TA]

Redundant cis-elements are spatially clustered: RP genes in A. gossypii
[Figure labels: 5', 3', 6bp, Homol-D, RAP1]

Evolution of the IFHL element
[Figure labels in source order: Drift…; Reverse complement duplication; sacc. et al.; hansenii; albicans; lipolytica; crassa; Conservation; Tandem duplication; nidulans; pombe]

Evolution of the Ribosomal biogenesis module
[Figure: per-species counts for the RRPE, PAC and TC elements; the assignment of counts to columns did not survive extraction. Species, with the number given in parentheses in the source: S. cerevisiae (225), S. paradoxus (215), S. mikatae (187), S. kudriavzevii (196), S. bayanus (195), S. castellii (204), C. glabrata (214), S. kluyveri (178), K. waltii (230), K. lactis (225), A. gossypii (226), D. hansenii (219), C. albicans (214), Y. lipolytica (208), N. crassa (193), A. nidulans (187), S. pombe (196)]

S. cerevisiae and C. albicans transcribe their genes according to one of three programs, which produce the a-, α- and a/α-cells. The particular cell type produced is determined by the MAT locus, which encodes sequence-specific DNA-binding proteins. In S. cerevisiae, a-type mating is repressed in α-cells by α2. In C. albicans, a-type mating is activated in a-cells by a2. In both species, a-cells mate with α-cells to form a/α-cells, which cannot mate.
a2 is an activator of a-type mating over a broad phylogenetic range of yeasts. In S. cerevisiae and close relatives, a2 is missing and α2 has taken over regulation of the type.
[Figure labels: Mating genes; a2 (albicans); α2 (cerevisiae)]
Tsong et al. 2006

A transition of motifs is observed between S. cerevisiae and C. albicans.
Innovation in α2 is observed along with the emergence of a possible Mcm1 interaction.
A redundant intermediate may have enabled the switch.

Phenotypic innovation through regulatory adaptation
After S. Carroll
Rockman, PLoS Biol 2005
Ihmels, Science 2005

481 segments longer than 200bp are absolutely conserved between human, mouse and rat (Bejerano et al. 2004).
What are these elements doing? Why are they completely conserved?
4 knockouts do not reveal significant phenotypes.
Ahituv et al., PLoS Biol 2007
Population genetics does suggest that ultraconserved elements are under selection.
Separating mutational effects from selective effects is still a challenge…
Katzman et al., Science 2007