Design Automation for DNA Self-Assembled Nanostructures
Constantin Pistol, Alvin R. Lebeck, Christopher Dwyer
Duke University
Design Automation Conference - July 27th, 2006

DNA Nano-Assembly Today
- Sung Ha Park, Constantin Pistol, Sang Jung Ahn, John H. Reif, Alvin R. Lebeck, Chris Dwyer, Thomas H. LaBean, "Finite-size, Fully-Addressable DNA Tile Lattices Formed by Hierarchical Assembly Procedures", Angewandte Chemie No. 5/2006
- J. Sharma, R. Chhabra, Y. Liu, Y. Ke and H. Yan, "DNA-Templated Self-Assembly of Two-Dimensional and Periodical Gold Nanoparticle Arrays", Angewandte Chemie International Edition, 45 (2006), pp. 730-735.
- Y. He, Y. Tian, Y. Chen, Z. Deng, A. E. Ribbe and C. Mao, "Sequence Symmetry as a Tool for Designing DNA Nanostructures", Angewandte Chemie International Edition, 44 (2005), pp. 6694-6696.
How to drive this forward ?

- Constantin Pistol, Alvin R. Lebeck, Dan Sorin, Chris Dwyer, "Nanoscale Device Integration on DNA Self-Assembled Nanostructures", Duke University, 2006
- M. Mertig, W. Pompe, "Biomimetic fabrication of DNA-based metallic nanowires and networks", in: Nanobiotechnology Concepts, Applications and Perspectives, Ed. by C.M. Niemeyer, C.A. Mirkin, WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim, 2004, pp. 256-277
- K. Skinner, R. L. Carroll, S. Washburn, and C. Dwyer, "Nanowire Transistors, Gate Electrodes, and Their Directed Self-Assembly", The 72nd Southeastern Section of the American Physical Society (SESAPS), November 2005
- P. W. K. Rothemund, "Folding DNA to create nanoscale shapes and patterns", Nature, 440 (2006), pp. 297-302.

Goal
• Automate the design of DNA-based self-assembled structures
- Find optimized sequences for given target structures
- Define metrics for quantitative design comparison
- Apply self-assembly specific design rules
- Expose useful trade-offs to designers
INPUT: structure, motif, sequence space
CONSTRAINTS: set optimization, design rules, trade-offs
OUTPUT: sequence map files, manufacture ready

Outline
1. DNA Basics
2. DNA Self-Assembly and DNA Motifs
3. Metrics and Design Rules
4. Implementation
5. Evaluation
6. Conclusion

DNA Basics
• A DNA strand is:
- A linear array of bases (A, T, G, and C)
- Directional (one end is distinct from the other)
- In nature, the source of genetic information
• DNA will form a double helix:
- When the bases on each strand (aligned head-to-toe) are complementary: A with T, and G with C
- But only under "natural" environmental conditions, such as (low) temperatures (sequence dependent) and in an ionic solution
How does it form?

DNA Basics
• DNA hybridization is the process that forms the double helix
• Random diffusion: power of self-assembly
• Sequence and temperature control the hybridization event
• Reverse process called melting
- Melting temperature (Tm) is sequence dependent
- G-C pairs ~2x as strong as A-T pairs

DNA Basics
• A common form of the double helix has some well-known geometric properties:
- 3.4 Å per base pitch along the helix
- One complete turn between every 10th and 11th base
• Flexibility: the bonds along the sugar-phosphodiester backbone of each strand can rotate
- Single stranded DNA has ~5 - 10 nm persistence length
- Double stranded DNA has ~50 nm persistence length
Persistence length is "straight length"

Self-Assembly
• Self-assembly is ubiquitous in nature
• Generally defined as spontaneously generated order
• Thermodynamics drive the self-assembly process
- We can guide the process by the choice of materials and environmental conditions
[Diagram: A and B binding to form A·B]
Advantages ?
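The pairing rule and helix geometry from the DNA Basics slides can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and do not come from the slides, and the check covers perfect, full-length complements only (no mismatches, no partial overlaps).

```python
# Illustrative helpers; all names here are hypothetical, not from the slides.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RISE_PER_BASE_NM = 0.34  # 3.4 Å per base along the helix, as stated on the slide


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs with `strand` when aligned head-to-toe."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def is_fully_complementary(a: str, b: str) -> bool:
    """True if every base of `a` pairs with `b` in antiparallel alignment."""
    return b.upper() == reverse_complement(a)


def helix_length_nm(n_bases: int) -> float:
    """Approximate length of a double helix of n_bases base pairs."""
    return n_bases * RISE_PER_BASE_NM


print(reverse_complement("GCC"))             # GGC
print(is_fully_complementary("ATA", "TAT"))  # True
```

Both strands are written in the same reading direction here, so the partner of GCC is GGC.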
Nanoscale self-assembly advantages
• Size – feature sizes in the 1 - 20 nm range
• Count – simple parallel assembly (10^14 – 10^16)
• Infrastructure – $460 billion chemical manufacturing industry
• New material properties – leverage quantum effects
• Potential for dense nano-scale active devices
Why use DNA ?
* 2002 US Census

Why DNA Based Self Assembly
• DNA can provide substrate for fabricating nano-devices
- Precise binding rules
- Nanometer pitch
• Nano-scale components placed / interconnected on substrate
- Crossed carbon nanotube FETs
- Ring-gated FETs
- Nanowires
- Quantum dots
• Challenge: specify DNA sequences for
- Intended geometry
- Thermodynamic stability
How to address this challenge?

Motifs and Hierarchical Assembly
• Complex designs built from small set of building blocks (motifs)
• Many possible motifs
- Junctions (to form triangles, corners, etc.)
- Sticky-ends: single strand of DNA protruding out of a helix
• Two motifs with complementary sticky-ends bind to form a composite motif
• Composite motifs can bind with other composite motifs to form larger composite motifs
Sticky ends = "smart" glue
Structure design = hierarchical structuring of motifs

Cruciform Motif
• Motif* has 9 DNA single strands
- Arms end in 5bp sticky ends
- Two sticky ends per arm
• Not exactly flat*
- Some structural curvature
• Core strand can be functionalized
- Molecular addressability
- Attach nanoscale components
[Diagram labels: Core; Shell 1 to Shell 4; Arm 1 to Arm 4]
* H. Yan, S. H. Park, L. Feng, G. Finkelstein, J. H. Reif, and T. H. LaBean, "4x4 DNA Tile and Lattices: Characterization, Self-Assembly, and Metallization of a Novel DNA Nanostructure Motif," in Proceedings of the Ninth International Meeting on DNA Based Computers (DNA9), 2003.

Target System – 4x4 Grid
• 16 active access points on flat 60nm x 60nm grid
• "Nano-board" scaffold – "pin" device elements to it
• Hierarchical 2-level assembly
[Diagram: tiles T1, T2, ... T16 on a 60 nm grid]
Core and shells reused - fixed sequences
How to design this system?
Sung Ha Park, Constantin Pistol, Sang Jung Ahn, John H. Reif, Alvin R. Lebeck, Chris Dwyer, Thomas H. LaBean, "Finite-size, Fully-Addressable DNA Tile Lattices Formed by Hierarchical Assembly Procedures", Angewandte Chemie No. 5/2006

Design Automation – 4x4 Grid
• Complex simultaneous interaction – 16 motifs with 48 arms
• Design targets:
- Target 1: design optimized 96 sticky end set
- Target 2: eliminate curvature (flat grid)
- Target 3: control & optimize self-assembly process
[Diagram: INPUT (structure, motif, sequence space) and Constraints (T1, T2, T3) feed the DNA Design Automation Software; OUTPUT is a sequence map]
How to evaluate these targets? What metrics?

Target 1: Sequence design
• Generate optimized DNA sequence set for a given target structure
• Find sequences that
- Minimize strength of unintentional interactions
- Maximize strength of intentional interactions
• Two metrics
- SEM: average single-interaction energy measure (stability)
- TLM: average target-interaction likelihood measure (accuracy)
• Designer can add tradeoff between accuracy and stability

SEM and TLM Metrics
• Specific Tm: melting temp. with the complement strand
• Non-Specific Tm: highest melting temp. with a non-complementary strand
• Melting temp. of strand pairs calculated using modified version of MELTING4* software
[Plot: Non-Specific Tm vs. Specific Tm, marking SEM = Spec[strand], NonSpec[strand], and TLM for a single sequence point]
* N. Le Novere, "MELTING, computing the melting temperature of nucleic acid duplex," Bioinformatics, vol. 17, pp. 1226-1227, 2001

SEM and TLM Metrics – Stability and Accuracy
EXISTING (E)        CANDIDATE 1 (C1)    CANDIDATE 2 (C2)
G-C-C               A-C-C               A-T-A
C-G-G               T-G-G               T-A-T
• Assume sequences E are present in system (sticky end)
• Need to add an additional sticky end
• Which is better: C1 or C2 ?
G-C pair stronger than A-T
- Set E + C1 has high SEM and low TLM
- High stability (melting temperature) but low accuracy (more defects)
- Set E + C2 has lower SEM and high TLM
- Lower stability but higher accuracy (fewer defects)

Target 2: Flat Grid
- Corrugation design rule – alternate motif normals
[Diagram: neighboring motifs alternate Face UP / Face DOWN]

Target 3: Optimize Assembly
- Thermal ordering design rules - hierarchy of sub-products
[Diagram: thermal groups ordered 1st (high) to 4th (low) along a temperature axis]

Thermodynamic Optimization Software
• Optimize target design against TLM and SEM metrics given:
- Target topology
- Basic motif design
• Exhaustive thermodynamic search
- Creates detailed sequence interaction map
• Evaluate each sticky-end sequence
- Against all other candidate and motif sequences
- Map their mutual interaction
• High computational cost
- 4x4 grid: 6 CPU-years to design and verify
- Parallel implementation
[Figure: interaction map of maximum Tm(A,B) in °C, Strand A vs. Strand B. Strand function: 1 – 4 Shell, 5 Core, 6 – 9 Arms]

Thermodynamic Interaction
• Calculate melting temperature for each interacting pair – used for SEM, TLM metrics
• Modified nearest-neighbor algorithm using MELTING4* tool
- Energy contribution of each base also depends on neighbors
- Handles internal and terminal mismatches
• The number of interactions to evaluate depends on:
- Length of sticky-ends
- Extent of fixed motif sequences
Sticky-end length    Sticky-end interactions
5bp                  > 500,000
10bp                 > 500,000,000,000
* N. Le Novere, "MELTING, computing the melting temperature of nucleic acid duplex," Bioinformatics, vol. 17, pp. 1226-1227, 2001

Design Automation Suite (DAS)
• Input: structure + motifs + sequence space
• Software runs thermodynamic analysis
- Generates metric-annotated DNA sequence sets
- Default: optimize for maximal TLM
- SF (Stability Factor): trade TLM for SEM
• Design rules are applied on candidate set
• Designer can customize set
- Corrugation and thermal ordering
• Output: sequence map files
- For order and manufacturing

Outline
1. DNA Basics
2. DNA Self-Assembly and DNA Motifs
3. Metrics and Design Rules
4. Implementation
5. Evaluation
• Target systems and methods
• Theoretical metric-based evaluation
• Experimental evaluation
6. Conclusion

Evaluation: target systems
• Large design - 96 sticky-ends set
- 4x4 grid with 16 motifs
• Small design – 20 sticky-ends set
- AB Polymer* with 4 motifs
[Diagram: A-Core, B-Core]
A, B: Two sets of fixed sequences
What is the impact of fixed sequences ?
* H. Yan, S. H. Park, L. Feng, G. Finkelstein, J. H. Reif, and T. H. LaBean, "4x4 DNA Tile and Lattices: Characterization, Self-Assembly, and Metallization of a Novel DNA Nanostructure Motif," in Proceedings of the Ninth International Meeting on DNA Based Computers (DNA9), 2003.
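The interaction counts quoted on the Thermodynamic Interaction slide (more than 500,000 for 5bp sticky ends, more than 500,000,000,000 for 10bp) are consistent with counting every unordered pair of distinct sequences over the full space of 4^n sequences of length n. That reading is an assumption, not something the slides state, and it ignores the extra interactions with fixed motif sequences that the slide also mentions. A minimal sketch, with hypothetical function names:

```python
def sequence_space_size(length: int) -> int:
    """Number of distinct sequences of `length` bases over {A, T, G, C}."""
    return 4 ** length


def pairwise_interactions(length: int) -> int:
    """Unordered pairs of distinct sequences in the full sequence space.

    Assumption: an "interaction" is one pair of candidate sticky-end
    sequences; fixed motif sequences are not counted here.
    """
    n = sequence_space_size(length)
    return n * (n - 1) // 2


for bp in (5, 10):
    print(f"{bp}bp: {pairwise_interactions(bp):,} candidate pairs")
```

Doubling the sticky-end length squares the sequence space, so the pair count grows by roughly a factor of a million. This is in line with the high computational cost and parallel implementation noted on the Thermodynamic Optimization Software slide.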
Impact of fixed sequences
- Accuracy in arm space with different fixed sequence sets
[Figure: "Accuracy of the 5 bp arm space". y-axis: Accuracy (TLM); x-axis: Arm Identifier; series: AB Cores, A-only, B-only, No Core]
A-only, B-only:
- same target structure as AB Cores
- use a single core/shells set (fewer fixed sequences)
More fixed sequences = fewer "good" arms to choose from

Evaluation: Sequence Design Methods
• DNA Design Automation Suite (DAS)
+ Thermodynamic analysis of all possible interactions
+ High flexibility, detailed sequence interaction map
- Computationally expensive
• Original 20-Arm design – based on existing sequence design tools*
- Text-distance primitive
- GC content as energy approximation
+ Fast but no result / local minimum for large seq. spaces
• Random sequence sets
- No guarantees of optimality
+ Fast
* N. C. Seeman, "De Novo Design of Sequences for Nucleic Acid Structural Engineering," Biomolecular Structure & Dynamics, vol. 8, pp. 573-581, 1990.

Sequence design evaluation – 20-Arm
• 20-Arm target system: upper bounds and original design
[Figure: "SEM and TLM (20 Arm)". Bars: SEM, TLM; designs: Upper Bound, DAS - B Core SF=4, DAS - AB Split SF=7, Original AB, Random]
DAS designs significantly improve both accuracy and stability

Sequence design evaluation – 96-Arm
• 96-Arm target system: B-only, AB Core and Random
[Figure: scatter of Non-Specific Tm vs. Specific Tm with the 1:1 diagonal; series: Random, AB Cores, B-Only]
[Figure: "SEM and TLM (96 Arm)". Bars: SEM, TLM; designs: B-Only, AB Cores, Random]
B-only = highest accuracy (TLM)

Experimental results – 96-Arm target system
• Atomic Force Microscopy – 4x4 grids
• Molecular height map on flat plane
[AFM image, 60 nm scale]
Sung Ha Park, Constantin Pistol, Sang Jung Ahn, John H. Reif, Alvin R. Lebeck, Chris Dwyer, Thomas H. LaBean, "Finite-size, Fully-Addressable DNA Tile Lattices Formed by Hierarchical Assembly Procedures", Angewandte Chemie No. 5/2006

Grid – no detectable curvature
• AFM height section shows flat grid
• Corrugation design rules successful

DNA scaffold – experimental validation
[AFM image, 60 nm scale]
Trivia: The collection of books and manuscripts in the Library of Congress contains ~10^14 letters. Manufacturing scale: ~10^14 letters/mL!

Conclusion
• Design automation for DNA-based self-assembled structures
• Step forward towards nano-scale device engineering
• Input: structure, motifs, sequence space
• Constraints:
- Optimized sequences for given target structures
- Self-assembly specific design rules
- Design trade-offs and metrics for quantitative design comparison
• Output: sequence map for manufacturing

Thank you!

Balanced designs
• Trade-off: accuracy for stability (TLM for SEM)
[Figure: "SF Impact on SEM and TLM (20 Arms)". Two panels, SEM vs. SF and TLM vs. SF, for SF = 0, 0.5, 1, 2, 3, 4, 5, 6, 7; series: AB, B_Only]
SF: Stability Factor – increases SEM, decreases TLM

SEM and TLM: Set Metrics
• Average of single sequence metrics
Tm[seq_i, seq_j] = melting temperature of seq_i and seq_j
comp[seq_i] = sequence complement of seq_i
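The set metrics can be sketched in Python under several loudly stated assumptions. The slides give SEM = Spec[strand] for a single sequence, but the TLM formula did not survive extraction; this sketch assumes TLM = Spec[strand] - NonSpec[strand]. The `toy_tm` scorer is a crude stand-in (4 per G-C pair, 2 per A-T pair, no shifted alignments), not the modified MELTING4 nearest-neighbor calculation the authors used, and every function name is hypothetical.

```python
from statistics import mean

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Toy pair scores: G-C counted twice as strong as A-T, echoing the slides.
PAIR_SCORE = {("G", "C"): 4, ("C", "G"): 4, ("A", "T"): 2, ("T", "A"): 2}


def comp(seq: str) -> str:
    """comp[seq]: the complement strand, written in its own reading direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def toy_tm(a: str, b: str) -> int:
    """Toy stand-in for Tm[a, b]: score Watson-Crick pairs in antiparallel alignment."""
    return sum(PAIR_SCORE.get((x, y), 0) for x, y in zip(a, reversed(b)))


def set_metrics(seqs, tm=toy_tm):
    """Return (SEM, TLM) averaged over the sticky ends in `seqs`.

    Assumed definitions: Spec = tm with the complement; NonSpec = highest tm
    with any other strand present (set members and their complements);
    single-sequence TLM = Spec - NonSpec.
    """
    pool = set(seqs) | {comp(s) for s in seqs}
    sems, tlms = [], []
    for s in seqs:
        spec = tm(s, comp(s))
        nonspec = max(tm(s, other) for other in pool if other != comp(s))
        sems.append(spec)
        tlms.append(spec - nonspec)
    return mean(sems), mean(tlms)


print(set_metrics(["GCC", "ACC"]))  # E + C1
print(set_metrics(["GCC", "ATA"]))  # E + C2
```

With this toy scorer, the E + C1 set from the earlier example comes out with the higher SEM and the lower TLM, and E + C2 the reverse, which matches the qualitative ranking on the Stability and Accuracy slide. The absolute numbers carry no physical meaning.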