
Biology in Computation and Computation in Biology

- Molecular Computation of Solutions to Combinatorial Problems, Leonard M. Adleman, Science 1994
- Imposing specificity by localization: mechanism and evolvability, Mark Ptashne and Alexander Gann, Current Biology 1998

What kind of computations can be done with DNA?

Molecular Computation of Solutions to Combinatorial Problems

DNA complementarity

The directed Hamiltonian path problem

A directed graph G with vertices v_in and v_out is said to have a Hamiltonian path if and only if there exists a sequence of "one-way" edges e_1, e_2, ..., e_n (that is, a path) that begins at v_in, ends at v_out and enters every other vertex exactly once.

[Figure: two example directed graphs. In the seven-vertex graph (vertices 0 to 6, v_in = 0, v_out = 6) the Hamiltonian path is 0->1, 1->2, 2->3, 3->4, 4->5, 5->6. A second graph (vertices labeled a to g) has no Hamiltonian path.]

- There is no known efficient algorithm for finding a Hamiltonian path/circuit.
- The fastest known algorithms take exponential time.
- In general, this is an NP-complete problem.
- A closely related problem is the Traveling Salesman problem, where a salesman wants to visit n cities via the shortest route.

Solving the Hamiltonian path problem

An algorithm solving the Hamiltonian path problem:

Step 1: Generate random paths through the graph.
Step 2: Keep only those paths that begin with v_in and end with v_out.
Step 3: If the graph has n vertices, then keep only those paths that enter exactly n vertices.
Step 4: Keep only paths that enter all of the vertices of the graph at least once.
Step 5: If any paths remain, say "yes"; otherwise, say "no".
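The five steps above can be mimicked in software. The following is only a minimal sketch: the edge list `EDGES` is a hypothetical seven-vertex graph (not the edge list from the slides), and the names `random_path`, `hamiltonian_path_exists`, `samples` and `seed` are illustrative assumptions. As in the molecular version, step 1 is a random sample, so a "no" answer only means that no Hamiltonian path was found among the sampled paths.

```python
import random

# Hypothetical 7-vertex directed graph (NOT the edge list from the slides).
EDGES = [(0, 1), (0, 3), (0, 6), (1, 2), (1, 3), (2, 1), (2, 3),
         (3, 2), (3, 4), (4, 1), (4, 5), (4, 6), (5, 2), (5, 6)]


def random_path(edges, max_vertices, rng):
    """Step 1: start from a random edge and keep appending random successors."""
    path = list(rng.choice(edges))
    while len(path) < max_vertices:
        successors = [v for (u, v) in edges if u == path[-1]]
        if not successors:  # dead end: the path cannot be extended
            break
        path.append(rng.choice(successors))
    return tuple(path)


def hamiltonian_path_exists(edges, n, v_in, v_out, samples=20000, seed=0):
    """Generate-and-filter search; returns (answer, surviving paths)."""
    rng = random.Random(seed)
    # Step 1: generate random paths through the graph.
    paths = {random_path(edges, 2 * n, rng) for _ in range(samples)}
    # Step 2: keep only paths that begin with v_in and end with v_out.
    paths = {p for p in paths if p[0] == v_in and p[-1] == v_out}
    # Step 3: keep only paths that enter exactly n vertices.
    paths = {p for p in paths if len(p) == n}
    # Step 4: keep only paths that enter every vertex at least once.
    paths = {p for p in paths if set(p) == set(range(n))}
    # Step 5: if any paths remain, say "yes"; otherwise, say "no".
    return bool(paths), paths


found, survivors = hamiltonian_path_exists(EDGES, n=7, v_in=0, v_out=6)
print("yes" if found else "no", sorted(survivors))
```

Each filter is a plain set comprehension, which mirrors how each laboratory step acts on the whole mixture of molecules at once rather than on one path at a time.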
Solving the Hamiltonian path problem
Implementing the algorithm at the molecular level:

Drawbacks:

- This computation requires ~7 days of lab work.
- Possible errors, for example:
  - "Pseudopaths" caused by incompatible ligation. These are unlikely to survive all the separation steps. Remedy: confirm that the Hamiltonian path obtained actually occurs in the graph.
  - Inexact reactions, such as loss of Hamiltonian path molecules that failed to bind and retention of non-Hamiltonian path molecules that succeeded in binding. Remedy: more stringent or repeated separation procedures.

Advantages:

- With the described algorithm, the number of procedures grows linearly with the number of vertices in the graph: O(n).
- The number of oligonucleotides grows linearly with the number of edges.

Supercomputers vs DNA computation:

- 10^12 op/sec vs 10^14 op/sec
- 10^9 op/J vs 10^19 op/J (in the ligation step)
- 1 bit per 10^12 nm^3 vs 1 bit per 1 nm^3 (video tape vs. DNA molecules)

"For certain intrinsically complex problems, such as the directed Hamiltonian path problem where existing electronic computers are very inefficient and where massively parallel searches can be organized to take advantage of the operations that molecular biology currently provides, it is conceivable that molecular computation might compete with electronic computation in the near term"

DNA self-assembled nanostructures, DNA computers

DNA Nanotechnology and its Biological Applications, Chapter 13 of the book Bio-inspired and Nano-scale Integrated Computing, Wiley, USA (2007).
Going into the biological system…

- Biology in Computation: molecular computation of solutions to combinatorial problems
- Computation in Biology: combinatorial computation within molecular biological systems

Imposing specificity by localization: mechanism and evolvability, Mark Ptashne and Alexander Gann, Current Biology 1998

Specificity by localization

[Figure: a machine receives a signal (A or C) and acts on one of Inputs 1 to 4, producing the corresponding output (Output A or Output C).]

How is specificity encoded?

Specificity by localization: transcription regulation

We have a powerful machine that can bring the "instructions" to life.

[Figure: the same scheme with the machine labeled as the enzyme RNA polymerase, the inputs as genes, and the output as the mRNA / gene product of gene 3 in response to signal C.]

How does it know which instructions should be performed at any given time? How is specificity encoded?

Transcription regulation

Input signal -> allosteric change of a target protein:
• Activation of transcription factors
• Inactivation of transcription factors

These transcription factors then serve as "locators".

Transcription regulation

A typical activator has 2 domains:
1. An "activating domain" that interacts with RNA polymerase
2. A DNA binding domain

[Figure: RNA polymerase, activator, DNA.]

The specificity is thus determined by the binding of the activator to a site (a DNA binding address) on one or several promoters.
Similarly, a typical repressor binds to specific sites on a promoter and blocks the polymerase from accessing these regions.

Transcription regulation

[Figure: RNA polymerase with an activator: ON. RNA polymerase with a repressor: OFF.]

Once the RNA polymerase is brought to a specific promoter, transcription proceeds spontaneously.

Binding sites combinatorics

Modulating the binding function (and thus the expression function):
- Weak sites versus strong sites

Cooperativity (synergism) in DNA binding:
- Between an activator and the polymerase: enhanced recruitment
- Between 2 activators: fine tuning the function
- Cooperativity via nucleosomes
- Phage lambda's sensitive switch

Combining signals, creating an AND gate:
- Sugar metabolism genes in E. coli
- Human interferon-β gene: combinatorics in eukaryotes

Cooperativity (synergism) in polymerase activation:
- Via multiple sites
- Via multiple components in the initiation machinery

Specificity by localization

Why is the strategy of imposing specificity by localization found so widely in nature? The same enzyme can be used in many different pathways. This requires that the enzyme work in combination with many different regulators.

Let's consider an alternative method: determining specificity purely by allosteric control. This would require a separate RNA polymerase for each promoter. The integration of the relevant signals would induce an allosteric transition in the appropriate polymerase, triggering transcription.
However, designing such a variety of polymerases seems quite difficult… It is hard to imagine how a purely allosteric "implementation" could possess combinatorial control as flexible and sensitive as the one achieved by the strategy of localization.

The End…

Solving the Hamiltonian path problem
Implementing the algorithm at the molecular level:

Step 1: Generate random paths through the graph

- Vertex i: a random 20-mer sequence of DNA, denoted O_i.
- Edge i->j: an oligonucleotide O_ij consisting of the 3' 10-mer of O_i followed by the 5' 10-mer of O_j (if i = 0, all of O_0 is used in place of its 3' 10-mer; if j = 6, all of O_6 is used in place of its 5' 10-mer).

Example:

O_i = 5' ACATGAGCTGGGTACGAATT 3'
Watson-Crick complement of O_i = 3' TGTACTCGACCCATGCTTAA 5'
O_j = 5' ATCCCGATTATGTCAGACGG 3'
O_ij = 5' GGTACGAATTATCCCGATTA 3'

For each vertex i (except i = 0, 6) and for each edge in the graph, 50 pmol of the complement of O_i and 50 pmol of O_ij were mixed together in a single ligation reaction.

[Figure: O_ij (5' GGTACGAATTATCCCGATTA 3') shown paired with the complements of O_i and O_j (3' TGTACTCGACCCATGCTTAA TAGGGCTAATACAGTCTGCC 5').]

The scale of this ligation far exceeds what is necessary for this graph. For each edge in the graph, ~3 x 10^13 copies of the associated molecule were added to the ligation reaction, so many DNA molecules encoding the Hamiltonian path were created. It seems a much larger graph could have been processed with the quantities used here.
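The encoding described for step 1 can be checked with a few lines of code. This is a sketch under stated assumptions: `O_I` and `O_J` are the two sequences shown on the slide, while `O_K` is a made-up third vertex sequence, and the helper names `wc_complement`, `edge_oligo` and `splint_joins` are hypothetical. The last helper illustrates why ligation chains edges into paths: the complement of a vertex sequence can hold the end of one edge oligonucleotide next to the start of the following one.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wc_complement(seq):
    """Base-by-base complement, written in the same left-to-right order
    as the input (so a 5'->3' input yields a strand read 3'->5')."""
    return seq.translate(COMPLEMENT)


def edge_oligo(o_i, o_j, i_is_start=False, j_is_end=False):
    """Edge i->j: 3' 10-mer of O_i followed by the 5' 10-mer of O_j.
    For the start/end vertex the whole 20-mer is used instead."""
    left = o_i if i_is_start else o_i[10:]
    right = o_j if j_is_end else o_j[:10]
    return left + right


def splint_joins(edge_ij, edge_jk, o_j):
    """True if the complement of O_j can bridge the junction between the
    two edge oligos, i.e. their adjoining 10-mers together spell O_j."""
    return edge_ij[-10:] + edge_jk[:10] == o_j


O_I = "ACATGAGCTGGGTACGAATT"   # from the slide
O_J = "ATCCCGATTATGTCAGACGG"   # from the slide
O_K = "GCTATTCGAGCTTAAAGCTA"   # hypothetical third vertex, for illustration

O_IJ = edge_oligo(O_I, O_J)
O_JK = edge_oligo(O_J, O_K)

print(O_IJ)                          # edge oligo for i->j
print(wc_complement(O_I))            # complement of O_i
print(splint_joins(O_IJ, O_JK, O_J))
```

Because every edge oligonucleotide shares a 10-mer with each of its two vertices, only edges that are consecutive in the graph can be brought together by a vertex complement.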
Oij Ojk The ligation reaction results in the formation of DNA molecules encoding random paths through the graph Solving the Hamiltonian path problem 4 Implementing the algorithm at the molecular level: Step 2: Keep only those paths that begin with vin and end with vout The product of step 1 Selective amplification by PCR with primers o0 and o6 Only those molecules encoding paths that begin with vertex 0 and end with vertex 6 were amplified 1 3 vin vout 0 6 2 5 Solving the Hamiltonian path problem 4 Implementing the algorithm at the molecular level: Step 3: Keep only those paths that enter exactly n vertices The product of step 2 Run on agarose gel and extract 140bp bands Only those molecules encoding paths that enter exactly 7 vertices were extracted and amplified 1 3 vin vout 0 6 2 5 Solving the Hamiltonian path problem 4 Implementing the algorithm at the molecular level: Step 4: keep only those paths that enter all of the vertices at least once The product of step 3 Generating single stranded DNA incubating the DNA with o1 Repeat with conjugated to magnetic beads o2, o3, o4, o5 Only molecules that containing o1 annealed to the bound o1 and were retained Only molecules that entered vertices 1, 2, 3, 4 and 5 were retained 1 3 vin vout 0 6 2 5 Solving the Hamiltonian path problem 4 Implementing the algorithm at the molecular level: Step 5: If any paths remain, say “yes”; otherwise, say “no” The product of step 4 1 3 vin vout 0 6 2 5 For the molecules encoding the Hamiltonian path: Graduated PCR – 01, 12, 23, 34, 45, 56 A method for “printing” results by running different PCR reactions each with O0 as the right primer and Oi as the left primer Identifying the Hamiltonian path this method will produce bands of 40, 60, 80, 100, 120 and 140bp in successive lanes Weak site Vs Strong site A protein recognizes different sequences with different affinities – A likely situation 1 Depending on the 2 factor concentration TF binding prob. 
Cooperativity in DNA binding

Two DNA binding proteins: the sites are filled as a highly sigmoidal function of the protein concentration.

• Confers a buffer against minor fluctuations in the protein concentration
• Confers the ability for a dramatic change when a significant proportion of the protein is activated / inactivated at once

[Figure: TF binding probability (Pbound, 0 to 1) versus TF concentration (10^-3 to 10^2, log scale).]

Cooperativity in DNA binding

Another possible form of cooperativity between transcription factors: cooperativity via nucleosomes (and not protein-protein interaction).

If activation merely involves locating the transcription machinery at the gene, any factors that inhibit or facilitate that relocation process can have an effect on gene expression. Nucleosomes are one such factor…

[Figure: activators displacing a nucleosome so that RNA polymerase switches from OFF to ON; TF binding probability (Pbound, 0 to 1) versus TF concentration (10^-4 to 10^1, log scale).]

See works from the Jon Widom lab.

Phage lambda's sensitive switch

An induction signal moves the phage from the lysogenic state to the lytic state.

- Lysogenic state: the phage lytic genes, within a host E. coli, are in a silent state.
- Lytic state: the phage lytic genes, within a host E. coli, are active.

PRM = promoter controlling the repressor gene
PR = promoter controlling the lytic genes

An "all-or-none" switch implemented by two adjacent promoters: when one is "on" the other is "off"!

Phage lambda's sensitive switch

- Lysogenic state (lambda repressor): two repressor dimers at OR1 and OR2; the repressor dimer at OR2 recruits the polymerase. PRM is ON, PR is OFF.
- Lytic state (Cro repressor), after the induction signal: PRM is OFF, PR is ON.

Phage lambda's sensitive switch

Switch properties: protein-protein interactions

1. Repressor dimerization
2. Cooperative interaction of repressor dimers
3. Cooperative binding of RNA polymerase and the activator (lambda repressor at the PRM promoter)

The surfaces involved in these interactions are interchangeable: an example of an "activator bypass" experiment.

Phage lambda's sensitive switch

Switch properties:

- Both the protein-protein and the binding interactions are relatively weak interactions.
- The cooperative nature of the interaction is necessary for the performance of the switch.
- The components are maintained in a relatively narrow range of concentrations.

Sugar metabolism genes in E. coli

The genes are transcribed if and only if:
1. Glucose is absent
2. The relevant sugar is present

This is an AND gate controlling the expression of alternative sugar genes.

Sugar metabolism genes in E. coli

Let's take a closer look at the Lac genes:

- Low glucose -> high cAMP -> CAP-cAMP complex (allosteric change) -> the CAP-cAMP complex binds the DNA (localization).
- High lactose signal -> a metabolic derivative of lactose binds the Lac repressor -> inactive Lac repressor (allosteric change) -> the repressor cannot bind the DNA (localization).

Interpretation of the signal at the DNA binding level: information processing.

Synergism in polymerase activation

[Figure: the level of transcription elicited by contact 1 and by contact 2 separately, compared with the level of transcription elicited by the two contacts together.]

One such example: measuring expression from an artificial PRM promoter construct (PRM driving lacZ). The construct contains:
* a CAP site
* a lambda repressor site

The sites are positioned so that each of the factors can make its natural contact with polymerase. It seems the factors contact the polymerase simultaneously (at different subunits), resulting in a synergistic response.

Joung JK, Koepp DM, Hochschild A: Synergistic activation of transcription by bacteriophage λ cI protein and E. coli cAMP receptor protein. Science 1994.

The main players… Prokaryotes vs. Eukaryotes

The prokaryotes are a group of organisms, mostly unicellular, that lack a cell nucleus or any other membrane-bound organelles. Animals, plants, fungi, and protists are eukaryotes: organisms whose cells are organized into complex structures enclosed within membranes. The distinction between prokaryotes and eukaryotes is that eukaryotes have "true" nuclei containing their DNA, whereas the genetic material in prokaryotes is not membrane-bound.

[Figure: Escherichia coli (prokaryotes, bacteria) and phage λ.]

The main players…

A bacteriophage is any one of a number of viruses that infect bacteria.

Enterobacteria phage λ (lambda phage) is a temperate bacteriophage that infects Escherichia coli. Lambda phage is a virus particle consisting of a head, containing double-stranded linear DNA as its genetic material, and a tail through which it injects its DNA into its host.

The lytic pathway: the phage replicates its DNA, degrades the host DNA and hijacks the cell's replication, transcription and translation mechanisms to produce as many phage particles as cell resources allow. When cell resources are depleted, the phage will lyse (break open) the host cell, releasing the new phage particles.

The lysogenic pathway: the DNA integrates itself into the host cell chromosome. In this state, the λ DNA is called a prophage and stays resident within the host's genome without apparent harm to the host. The prophage is duplicated with every subsequent cell division of the host. The phage genes expressed in this dormant state code for proteins that repress expression of other phage genes. When the host cell is under stress, these proteins are broken down, resulting in the expression of the repressed phage genes. The activated prophage then enters its lytic pathway.
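The binding curves on the weak-site/strong-site and cooperativity slides, and the Lac AND gate, can be illustrated with a small model. This is a sketch, not the model behind the plotted curves: the Hill-type formula in `p_bound`, the parameter names `k_d` and `n`, and the boolean simplification in `lac_transcribed` are all assumptions introduced here for illustration.

```python
def p_bound(conc, k_d, n=1.0):
    """Assumed Hill-type occupancy: conc^n / (k_d^n + conc^n).

    k_d is the concentration at half occupancy (small k_d = strong site);
    n > 1 stands in for cooperative binding and makes the curve sigmoidal.
    """
    return conc ** n / (k_d ** n + conc ** n)


def lac_transcribed(glucose_present, lactose_present):
    """Boolean caricature of the Lac AND gate described in the slides."""
    # Low glucose -> high cAMP -> the CAP-cAMP complex binds the DNA.
    cap_bound = not glucose_present
    # A lactose derivative inactivates the repressor, which then leaves the DNA.
    repressor_bound = not lactose_present
    return cap_bound and not repressor_bound


# Strong vs. weak site at the same factor concentration.
print(p_bound(1.0, k_d=0.1), p_bound(1.0, k_d=10.0))

# Cooperative binding is lower below k_d and higher above it (steeper switch).
for conc in (0.5, 1.0, 2.0):
    print(conc, p_bound(conc, k_d=1.0, n=1), p_bound(conc, k_d=1.0, n=2))

# Truth table of the AND gate.
for glucose in (False, True):
    for lactose in (False, True):
        print(glucose, lactose, lac_transcribed(glucose, lactose))
```

Raising `n` leaves the half-occupancy point unchanged but narrows the concentration range over which the site goes from mostly empty to mostly filled, which is the buffering and switching behavior the cooperativity slide describes.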
NP-complete

An important aspect of computational complexity theory is to categorize computational problems and algorithms into complexity classes.

Complexity classes:

• P: the set of decision problems that can be solved by a deterministic machine in polynomial time.
• NP: the set of decision problems that can be solved by a non-deterministic machine in polynomial time. The solution for all the problems in this class can be verified in polynomial time.
• NP-complete is a subset of NP. A decision problem X is NP-complete if:
  - X is in NP
  - Every problem in NP is reducible to X (every other problem in NP can be quickly transformed into X)

The most important open question of complexity theory is whether P = NP.

Although any given solution to such a problem can be verified quickly, there is no known efficient way to locate a solution in the first place; indeed, the most notable characteristic of NP-complete problems is that no fast solution to them is known. That is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows. As a result, the time required to solve even moderately large versions of many of these problems easily reaches into the billions or trillions of years, using any amount of computing power available today.
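The gap between verifying and finding a solution can be made concrete for the directed Hamiltonian path problem. The sketch below is illustrative only: `EDGES` is a hypothetical seven-vertex graph, and `verify_path` and `brute_force_search` are names introduced here. Checking a proposed path takes a number of steps proportional to its length, while the naive search may have to try up to (n - 2)! orderings of the intermediate vertices.

```python
from itertools import permutations

# Hypothetical 7-vertex directed graph (NOT the edge list from the slides).
EDGES = [(0, 1), (0, 3), (0, 6), (1, 2), (1, 3), (2, 1), (2, 3),
         (3, 2), (3, 4), (4, 1), (4, 5), (4, 6), (5, 2), (5, 6)]


def verify_path(edges, n, v_in, v_out, path):
    """Polynomial-time check that `path` is a Hamiltonian path."""
    edge_set = set(edges)
    return (
        len(path) == n
        and set(path) == set(range(n))          # every vertex exactly once
        and path[0] == v_in
        and path[-1] == v_out
        and all((a, b) in edge_set for a, b in zip(path, path[1:]))
    )


def brute_force_search(edges, n, v_in, v_out):
    """Exhaustive search: tries every ordering of the intermediate vertices."""
    inner = [v for v in range(n) if v not in (v_in, v_out)]
    for middle in permutations(inner):
        candidate = (v_in, *middle, v_out)
        if verify_path(edges, n, v_in, v_out, candidate):
            return candidate
    return None


print(verify_path(EDGES, 7, 0, 6, (0, 1, 2, 3, 4, 5, 6)))
print(brute_force_search(EDGES, 7, 0, 6))
```

For 7 vertices the search space is only 5! = 120 orderings, but it grows factorially with the number of vertices, which is the kind of growth the paragraph above refers to.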
