Evolution & Design Principles in Biology: a consequence of evolution and natural selection
Rui Alves
University of Lleida
Course website: http://web.udl.es/usuaris/pg193845/Bioinformatics_2009/

Part I: Molecular Evolution

Theory of Evolution
• Evolution is the theory that allows us to understand how organisms came to be the way they are.
• In probabilistic terms, it is likely that all living beings today originated from a single type of cell.
• These cells divided and occupied ecological niches, where they adapted to the new environments through natural selection.

How did the first cell create different cells?
• Neutral mutation (e.g. by error in genome replication)
• Deleterious mutation (e.g. by error in genome replication)
• Advantageous mutation (e.g. by error in genome replication)

And then there was sex…

Why sex?
• Asexual reproduction is quicker and easier, and yields more offspring per individual.
• Sex may limit harmful mutations
  – Asexual: all offspring get all mutations
  – Sexual: random distribution of mutations. Those with the most harmful ones tend not to reproduce.
• Sex generates beneficial gene combinations
  – Adaptation to a changing environment
  – Adaptation to all aspects of a constant environment
  – Can separate beneficial mutations from harmful ones
  – Samples a larger space of gene combinations

What drives cells to adapt?
• New niche / new conditions in the old niche
• New (better adapted) mutation

How do new genes and proteins appear?
• Genes (proteins) are built by combining domains
• New proteins may appear either by intradomain mutation or by combining existing domains of other proteins
[Figure: successive cell divisions]

The Coalescent
• This model of cellular evolution has implications for molecular evolution
• Coalescent theory: a retrospective model of population genetics that traces all alleles of a gene in a sample from a population to a single ancestral copy shared by all members of the population, known as the most recent common ancestor

Why is the coalescent the de facto standard today? Alternatives?
• Current sequences have evolved from the same original sequence (coalescent)
• Current sequences have converged to a similar sequence from multiple origins of life

Back-of-the-envelope support for divergence
[Figure: two 20-residue sequences, ACDEFGHIKLMNPQRSTVWY and A EDYAHIKLMNPQRGTVWY, compared position by position (AAi versus AAk). The total probability p_tot is written as a product of the per-position probabilities p1 and p2 over the 20 positions, once for the convergence scenario and once for the divergence scenario.]
Which is more likely, convergence or divergence?

About the mutational process
Point mutations:
• Transitions (A↔G, C↔T) are more frequent than transversions (all other substitutions)
• In mammals, the CpG dinucleotide is frequently mutated to TG or CA (possibly related to the fact that most CpG dinucleotides are methylated at the C residues)
• Microsatellites frequently increase or decrease in size (possibly due to polymerase slippage during replication)
Gene and genome duplications (complete or partial) may lead to:
• pseudogenes: function-less copies of genes which rapidly accumulate (mostly deleterious) mutations, useful for estimating mutation rates!
• new genes after functional diversification
Chromosomal rearrangements (inversions and translocations) may lead to:
• meiotic incompatibilities, speciation
Estimated mutation rates:
• Human nuclear DNA: 3–5 × 10⁻⁹ per year
• Human mitochondrial DNA: 3–5 × 10⁻⁸ per year
• RNA and retroviruses: ~10⁻² per year

Consequences of the coalescent model?

So what if we accept the coalescent model?
A1-6     TSRISEIRR
A7       PSRISEIRR
A8-9     PKRISEVRR
A10-11   PQRISAIQR
A12-13   PQRISTIQR
A14      ASHLHNLQR
A15-17   TKHLQELQR
A18      SKHLHELQR
A19      PKNLHELQK
A20      SKRLHEVQS

A1    TSRISEIRR
A2    TSRISEIRR
A3    TSRISEIRR
A4    TSRISEIRR
A5    TSRISEIRR
A6    TSRISEIRR
A7    PSRISEIRR
A8    PKRISEVRR
A9    PKRISEVRR
A10   PQRISAIQR
A11   PQRISAIQR
A12   PQRISTIQR
A13   PQRISTIQR
A14   ASHLHNLQR
A15   TKHLQELQRE
A16   TKHLQELQRE
A17   TKHLQELQRE
A18   SKHLHELQRD
A19   PKNLHELQKD
A20   SKRLHEVQSE

So what if we accept the coalescent model?
[Figure: the grouped alignment above next to a tree with ancestral nodes A’1-7 and A’10-13 above the leaves A1-6, A7, A10-11 and A12-13]

So what if we accept the coalescent model?
A’1-7     (p-t) S R I S   E   I R R
A8-9      P     K R I S   E   V R R
A’10-13   P     Q R I S (a-t) I Q R
A14       A     S H L H   N   L Q R
A15-17    T     K H L Q   E   L Q R
A18       S     K H L H   E   L Q R
A19       P     K N L H   E   L Q K
A20       S     K R L H   E   V Q S
          4     3 3 2 4   5   3 2 3

The study of sequence alignments can give information about the evolution of the different organisms!

Phylogenetic tree reconstruction, overview
Computational challenge: there is an enormous number of different topologies, even for a relatively small number of sequences:
3 sequences: 1
4 sequences: 3
5 sequences: 15
10 sequences: 2,027,025
20 sequences: 221,643,095,476,699,771,875
Consequence: most tree construction algorithms are heuristic methods that are not guaranteed to find the optimal topology.
Input data for the two major classes of algorithms:
1. Input data: distance matrix. Examples: UPGMA, neighbor-joining
2. Input data: multiple alignment. Examples: parsimony, maximum likelihood
Distance matrix methods use distances computed from pairwise or multiple alignments as input.

Building phylogenetic trees of proteins
[Figure: proteins A, B, C and D located in genomes 1, 2, 3, …]

Distance-based phylogenetic trees
Alignment:
A1   ACTDEEGGGGSRGHI…
A2   A-TEEDGGAASRGHI…
A3   ACFDDEGGGGSRGHL…
…
Pairwise distances:
A1–A2: 5 substitutions
A1–A3: 3 substitutions
A2–A3: 8 substitutions
[Figure: resulting tree joining A1, A3 and A2, with branch labels 3 and 5]

Maximum likelihood phylogenetic trees
Alignment:
A1   ACTDEEGGGGSRGHI…
A2   A-TEEDGGAASRGHI…
A3   ACFDDEGGGGSRGHL…
…
Probability of amino acid substitution:
      A      -       E       D       …
A     1      0.01    0.2     0.09    …
-     0.01   1       0.0001  0.0001  …
E     0.2    0.0001  1       0.5     …
D     0.09   0.0001  0.5     1       …
…

Maximum likelihood phylogenetic trees
[Figure: the same alignment with pairwise terms p(1,2), p(1,3) and p(2,3), and candidate tree topologies for A1, A2 and A3]
A1–A2: 5 substitutions; A1–A3: 3 substitutions; A2–A3: 8 substitutions
p(2,3) > p(1,2) > p(1,3)

Statistical evaluation of trees: bootstrapping
[Figure: example tree with nodes labeled 1 to 8]
Motivation: some branching patterns in a tree may be uncertain for statistical reasons (short sequences, small number of mutational events).
Goal of bootstrapping: to assess the statistical robustness of each edge of the tree. Note that each edge divides the leaf nodes into two subsets. For instance, edge 7–8 divides the leaves into subsets {1,2,3} and {4,5}. However, is this short edge statistically robust?
Method: generate trees from resampled versions of the input data as follows:
• Randomly modify the input MSA by eliminating some columns and replacing them with existing ones. This results in duplication of columns.
• Compute a tree for each modified input MSA.
• For each edge of the tree derived from the real MSA, determine the fraction of trees derived from modified MSAs which contain an edge that divides the leaves into the same subsets. This fraction is called the bootstrap value.
Edges with low bootstrap values (e.g.
<0.9) are considered unreliable.

Other Trees
• Use genomes
• Use enzymomes
• Use whatever group of molecules is important for a given function

Part II: Design principles

Outline
• What are design principles?
• How to study design principles
• Examples

What are design principles?
• Recurrent qualitative or quantitative rules that are observed in similar types of systems as a solution to a given functional problem
• They exist at different levels
[Figure: examples at different levels. Nuclear targeting sequences; an operon containing Gene 1, Gene 2 and Gene 3]

How can design principles emerge in molecular biology?
• Intelligent design? Not a scientific hypothesis; off the table
• Evolution? Makes sense, but how could such regularities emerge?

Climbing down Mount Improbable
• Over time, edged stones would accumulate on the slope.
• Smooth, round stones accumulate at the bottom.
Design principles:
- Smooth, roundish rocks roll down the mountain.
- Edged, flat rocks don't.

Design principles in molecular biology
• Similarly, if a topology or set of parameters has appeared through mutation and can be shown to create a molecular network that functionally outperforms all other possible alternatives in a given set of conditions, one can talk about a design principle for the system under those conditions [sensu engineering].

Index of talk
• How to identify design principles
• Design principles in:
  – Gene expression
  – Metabolic networks
  – Signal transduction
  – Development
• Design principles, what are they good for?
• Summary

First step, define the alternatives
[Figure: a gene under a negative (−) or a positive (+) regulator; the pathway X0 → X1 → X2 → X3 drawn in two variants]

First step, define the alternatives
[Figure: plot with axes X3 and t for the pathway X0 → X1 → X2 → X3]
How strong should the feedback be?

Then, create models for each alternative
[Figure: a gene under a negative (−) or a positive (+) regulator]

Finally:
• Compare the dynamic behavior of the models for the two or more alternatives with respect to physiologically relevant criteria.
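As a rough illustration of this three-step recipe (define alternatives, model each one, compare them on a relevant criterion), the sketch below models the pathway X0 → X1 → X2 → X3 with and without negative overall feedback of X3 on the first step. Everything specific in it is an assumption made for illustration and is not taken from the slides: the Hill-type inhibition term, all parameter values, the use of forward Euler integration, and the choice of criterion (how much X3 drops when the demand for X3 doubles).

```python
def simulate(feedback, demand, t_end=100.0, dt=0.01):
    """Integrate X0 -> X1 -> X2 -> X3 with forward Euler; return the final X3.

    All rate laws and parameter values are hypothetical, chosen so that
    both alternatives sit at X1 = X2 = X3 = 1 when demand = 1.
    """
    v0 = 2.0 if feedback else 1.0   # supply from X0 (assumed values)
    k1 = k2 = 1.0                   # first-order conversion rates (assumed)
    K, n = 1.0, 4                   # Hill constant and exponent (assumed)
    x1 = x2 = x3 = 1.0              # start at the demand = 1 steady state
    for _ in range(round(t_end / dt)):
        # Negative overall feedback: X3 inhibits the first step.
        inhibition = 1.0 / (1.0 + (x3 / K) ** n) if feedback else 1.0
        dx1 = v0 * inhibition - k1 * x1
        dx2 = k1 * x1 - k2 * x2
        dx3 = k2 * x2 - demand * x3  # demand = consumption rate of X3
        x1 += dt * dx1
        x2 += dt * dx2
        x3 += dt * dx3
    return x3


def relative_drop(feedback):
    """Fractional loss of X3 when demand doubles from 1 to 2."""
    before = simulate(feedback, demand=1.0)
    after = simulate(feedback, demand=2.0)
    return (before - after) / before


if __name__ == "__main__":
    for feedback in (False, True):
        label = "with feedback" if feedback else "without feedback"
        print(f"{label}: X3 drops by {relative_drop(feedback):.1%}")
```

Under these assumed parameters the alternative without feedback loses half of its X3 pool when demand doubles, while the alternative with feedback loses roughly a quarter. A real comparison would also control for the other properties of the two models so that any difference can be attributed to the feedback alone.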
Then, create models for each alternative
[Figure: the pathway X0 → X1 → X2 → X3 drawn in two variants]

The demand theory for gene expression
[Figure: a gene under a negative (−) or a positive (+) regulator]
• Are there situations where positive regulation of gene expression outperforms negative regulation of gene expression, and vice versa?

Regulating gene expression has principles
[Figure: a gene under a negative (−) or a positive (+) regulator]
• Positive regulator:
  – More effective when the gene product is in demand for a large fraction of the life cycle.
  – Less sensitive to noise if the signal is low.
• Negative regulator:
  – More effective when the gene product is in demand for a small fraction of the life cycle.
  – Less sensitive to noise if the signal is high.
Genetics 149:1665; PNAS 103:3999; PNAS 104:7151; Nature 405:590

Negative overall feedback is a design principle in metabolic biosynthesis
[Figure: the pathway X0 → X1 → X2 → X3]
• Negative overall feedback:
  – More effective in coupling production to demand.
  – More robust to fluctuations.
Bioinformatics 16:786; Biophysical J. 79:2290

Bifunctional sensors can be a design principle in signal transduction
[Figure: Signal → Sensor → Effector → Effect, in two variants: sensor alone, and sensor plus a separate Deactivator]
• Bifunctional sensor:
  – Performs best against cross talk
• Independent deactivator:
  – Better integrator of signals
Mol. Microbiol. 48:25; Mol. Microbiol.
68:1196

Design principles in development
[Figure: Signal → Regulator → Gene cascades with positive (+) or negative (−) interactions, shown for four cases: high demand, low signal; low demand, low signal; high demand, high signal; low demand, high signal]
Genetics 149:1665; PNAS 103:3999; PNAS 104:7151; Nature 405:590

Biological design principles are good to understand why biology works as it does
[Figure: two time courses after heat shock, one of growth rate and one of expression of important genes]
• Biological design principles may connect molecular determinants to functional effectiveness.
BMC Bioinformatics 7:184

Underlying assumption
• The evolution of molecular networks can be treated in a modular fashion.
• Work in the group of Uri Alon suggests that
  – networks evolving to meet simultaneous goals evolve in a modular fashion
  – networks evolving to meet a single goal evolve globally
• Modularity seems like a reasonable first assumption
PNAS 102:13773; PLoS Comput Biol 4:e1000206; BMC Evol Biol 7:169

The good news about function
• Sometimes, you get stuff for free!
• For example:
  – Networks that are responsive to signals may, just because they are responsive, have inbuilt buffering of noise.
  – Functions that are associated with marginally stable proteins are favored because, given the large dimensionality of sequence space, most randomly selected sequences have a structure that is marginally stable.
PNAS 100:14463; PNAS 103:6435; Proteins 46:105

How can biological design principles be applied?
• Design of molecular circuits with specific behaviors!
[Figure panels: bistable systems, stable systems, oscillations]
Cell 113:597; PLoS Comput Biol.
5:e1000319; PNAS 106:6435
[Figure panel: unstable systems]

Summary
• Design principles can be found in molecular networks.
• Such principles can sometimes be connected to selection for functional effectiveness.
• Even in the absence of such a connection, if they are valid they can be used to build biological circuits with specific behaviors.