1
Design Automation for
DNA Self-Assembled Nanostructures
Constantin Pistol, Alvin R. Lebeck, Christopher Dwyer
Duke University
Design Automation Conference - July 27th, 2006
DNA Nano-Assembly Today
Sung Ha Park, Constantin Pistol, Sang Jung Ahn, John H. Reif, Alvin R. Lebeck, Chris Dwyer, Thomas H. LaBean - "Finite-size, Fully-Addressable DNA Tile Lattices Formed by Hierarchical Assembly Procedures", Angewandte Chemie No. 5/2006
J. Sharma, R. Chhabra, Y. Liu, Y. Ke and H. Yan, DNA-Templated Self-Assembly of Two-Dimensional and Periodical Gold Nanoparticle Arrays, Angewandte Chemie International Edition, 45 (2006), pp. 730-735.
Y. He, Y. Tian, Y. Chen, Z. Deng, A. E. Ribbe and C. Mao, Sequence Symmetry as a Tool for Designing DNA Nanostructures, Angewandte Chemie International Edition, 44 (2005), pp. 6694-6696.
How to drive this forward?
Constantin Pistol, Alvin R. Lebeck, Dan Sorin, Chris Dwyer - “Nanoscale Device Integration on DNA Self-Assembled Nanostructures”, Duke University, 2006
M. Mertig, W. Pompe: Biomimetic fabrication of DNA-based metallic nanowires and networks, in: Nanobiotechnology: Concepts, Applications and Perspectives, Ed. by C.M. Niemeyer, C.A. Mirkin, WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim, 2004, p. 256-277
K. Skinner, R. L. Carroll, S. Washburn, and C. Dwyer, Nanowire Transistors, Gate Electrodes, and Their Directed Self-Assembly, The 72nd Southeastern Section of the American Physical Society (SESAPS), November 2005
P. W. K. Rothemund, Folding DNA to create nanoscale shapes and patterns, Nature, 440 (2006), pp. 297-302.
3
Goal
• Automate the design of DNA-based self-assembled structures
- Find optimized sequences for given target structures
- Define metrics for quantitative design comparison
- Apply self-assembly specific design rules
- Expose useful trade-offs to designers
INPUT: structure, motif, sequence space
CONSTRAINTS: set optimization, design rules, trade-offs
OUTPUT: sequence map files (manufacture ready)
4
Outline
1. DNA Basics
2. DNA Self-Assembly and DNA Motifs
3. Metrics and Design Rules
4. Implementation
5. Evaluation
6. Conclusion
5
DNA Basics
• A DNA strand is:
- A linear array of bases (A, T, G, and C)
- Directional (one end is distinct from the other)
- In nature, the source of genetic information
• DNA will form a double helix:
- When the bases on each strand (aligned head-to-toe) are complementary: A with T, and G with C
- But only under “natural” environmental conditions such as (low) temperatures (sequence dependent) and in an ionic solution.
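The pairing rule above can be sketched in a few lines of Python. This is only an illustration of complementarity on directional strands; the function names are made up for this example and do not come from the talk.

```python
# Watson-Crick pairing: A with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5'->3' direction."""
    # Strands pair head-to-toe, so the partner is read in reverse order.
    return "".join(PAIRS[base] for base in reversed(strand.upper()))

def can_form_duplex(a: str, b: str) -> bool:
    """True when b is the exact antiparallel complement of a."""
    return b.upper() == reverse_complement(a)

print(reverse_complement("ATGCC"))         # GGCAT
print(can_form_duplex("ATGCC", "GGCAT"))   # True
```

The check ignores temperature and ionic conditions, which the slide lists as further requirements for a helix to actually form.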
How does it form?
6
DNA Basics
• DNA hybridization is the process that forms the double helix
• Random diffusion: power of self-assembly
• Sequence and temperature control the hybridization event
• Reverse process called melting
- Melting temperature (Tm) is sequence dependent
- G-C pairs ~2x as strong as A-T pairs
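A rough way to see the sequence dependence of Tm is the Wallace rule of thumb for short strands, which counts 2 °C per A/T base and 4 °C per G/C base and so mirrors the "~2x" statement above. This is a sketch for intuition only: it is not the nearest-neighbor calculation the talk uses later, and the function name is an assumption of this example.

```python
def wallace_tm(strand: str) -> int:
    """Rule-of-thumb melting temperature in degrees C for a short strand."""
    s = strand.upper()
    at = s.count("A") + s.count("T")   # weaker pairs: 2 degrees each
    gc = s.count("G") + s.count("C")   # stronger pairs: 4 degrees each
    return 2 * at + 4 * gc

print(wallace_tm("ATATA"))  # 10
print(wallace_tm("GCGCG"))  # 20
```

Two strands of equal length can differ by a factor of two in estimated Tm, which is why sequence choice matters for stability.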
7
DNA Basics
• A common form of the double helix has some well-known
geometric properties:
- 3.4 Å per base pitch along the helix
- One complete turn between every 10th and 11th base
• Flexibility: the bonds along the sugar-phosphodiester
backbone of each strand can rotate
- Single stranded DNA has ~5 - 10 nm persistence length
- Double stranded DNA has ~50nm persistence length
Persistence length is “straight length”
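The geometric numbers above translate directly into helix dimensions. The sketch below assumes 10.5 bases per turn as the midpoint of "between every 10th and 11th base"; that value and the function names are assumptions of this example.

```python
RISE_NM_PER_BASE = 0.34   # 3.4 Å per base along the helix
BASES_PER_TURN = 10.5     # assumed midpoint between the 10th and 11th base

def helix_length_nm(n_bases: int) -> float:
    """Length of a double helix with n_bases paired bases."""
    return n_bases * RISE_NM_PER_BASE

def helix_turns(n_bases: int) -> float:
    """Number of complete turns made over n_bases paired bases."""
    return n_bases / BASES_PER_TURN

def bases_for_length(length_nm: float) -> int:
    """Approximate number of paired bases needed to span length_nm."""
    return round(length_nm / RISE_NM_PER_BASE)

print(round(helix_length_nm(21), 2), helix_turns(21))  # 7.14 2.0
print(bases_for_length(50))                            # 147
```

Under these numbers, one double-stranded persistence length (~50 nm) corresponds to roughly 147 bases, so helices of a few tens of bases behave as stiff rods.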
8
Outline
1. DNA Basics
2. DNA Self-Assembly and DNA Motifs
3. Metrics and Design Rules
4. Implementation
5. Evaluation
6. Conclusion
9
Self-Assembly
• Self-assembly is ubiquitous in nature
• Generally defined as spontaneously generated order
• Thermodynamics drive the self-assembly process
- We can guide the process by the choice of materials and
environmental conditions
Figure: components A and B assemble into A·B
Advantages?
10
Nanoscale self-assembly advantages
• Size – feature sizes in the 1 - 20nm range
• Count – simple parallel assembly (10^14 – 10^16)
• Infrastructure – $460 billion chemical manufacturing industry*
• New material properties – leverage quantum effects
• Potential for dense nano-scale active devices
Why use DNA?
* 2002 US Census
11
Why DNA Based Self Assembly
• DNA can provide substrate for fabricating nano-devices
- Precise binding rules
- Nanometer pitch
• Nano-scale components placed / interconnected on substrate
- Crossed carbon nanotube FETs
- Ring-gated FETs
- Nanowires
- Quantum dots
• Challenge: specify DNA sequences for
- Intended geometry
- Thermodynamic stability
How to address this challenge?
12
Motifs and Hierarchical Assembly
• Complex designs built from small set of building blocks (motifs)
• Many possible motifs
- Junctions (to form triangles, corners, etc.)
- Sticky-ends: single strand of DNA protruding out of a helix
• Two motifs with complementary sticky-ends bind to form a composite motif
• Composite motifs can bind with other composite motifs to form larger composite motifs
Sticky ends = “smart” glue
Structure design = hierarchical structuring of motifs
13
Cruciform Motif
• Motif* has 9 DNA single strands
- Arms end in 5bp sticky ends
- Two sticky ends per arm
• Not exactly flat*
- Some structural curvature
• Core strand can be functionalized
- Molecular addressability
- Attach nanoscale components
Figure labels: Core, Shell 1 – 4, Arm 1 – 4
* H. Yan, S. H. Park, L. Feng, G. Finkelstein, J. H. Reif, and T. H. LaBean, "4x4 DNA Tile and Lattices: Characterization, Self-Assembly, and
Metallization of a Novel DNA Nanostructure Motif," in Proceedings of the Ninth International Meeting on DNA Based Computers (DNA9), 2003.
14
Target System – 4x4 Grid
• 16 active access points on flat 60nm x 60nm grid
• “Nano-board” scaffold – “pin” device elements to it
• Hierarchical 2-level assembly
Core and shells reused - fixed sequences
Figure: 60nm grid of tiles T1, T2, ..., T16
How to design this system?
Sung Ha Park, Constantin Pistol, Sang Jung Ahn, John H. Reif, Alvin R. Lebeck, Chris Dwyer, Thomas H. LaBean - "Finite-size, Fully-Addressable DNA Tile Lattices Formed by Hierarchical Assembly Procedures", Angewandte Chemie No. 5/2006
15
Design Automation – 4x4 Grid
• Complex simultaneous interaction – 16 motifs with 48 arms
• Design targets:
Target 1 : design optimized 96 sticky end set
Target 2 : eliminate curvature (flat grid)
Target 3 : control & optimize self-assembly process
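The counts of 48 arms and 96 sticky ends can be reproduced with a short calculation. The sketch assumes that only arms joining two neighboring motifs carry sticky ends (arms on the outer edge of the finite grid are left out) and uses the two sticky ends per arm stated for the cruciform motif; the function name is hypothetical.

```python
def grid_counts(rows: int, cols: int, ends_per_arm: int = 2) -> dict:
    """Count motifs, bonded arms and sticky ends in a finite rows x cols grid."""
    # Every pair of adjacent motifs is joined by one arm from each motif.
    joints = rows * (cols - 1) + cols * (rows - 1)
    arms = 2 * joints
    return {
        "motifs": rows * cols,
        "arms": arms,
        "sticky_ends": arms * ends_per_arm,
    }

print(grid_counts(4, 4))
# {'motifs': 16, 'arms': 48, 'sticky_ends': 96}
```

A 4x4 grid has 24 neighbor joints, hence 48 interacting arms and a set of 96 sticky ends that must all be mutually distinguishable.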
INPUT (structure, motif, seq. space) + Constraints (T1, T2, T3) → DNA Design Automation Software → OUTPUT (sequence map)
How to evaluate these targets? What metrics?
16
Outline
1. DNA Basics
2. DNA Self-Assembly and DNA Motifs
3. Metrics and Design Rules
4. Implementation
5. Evaluation
6. Conclusion
17
Target 1: Sequence design
• Generate optimized DNA sequence set for a given target structure
• Find sequences that
- Minimize strength of unintentional interactions
- Maximize strength of intentional interactions
• Two metrics
- SEM: average single-interaction energy measure (stability)
- TLM: average target-interaction likelihood measure (accuracy)
• Designer can add tradeoff between accuracy and stability
18
SEM and TLM Metrics
Specific Tm: melting temp. with the complement strand
Non-Specific Tm: highest melting temp. with a non-complementary strand
SEM = Spec[strand]
Figure: SEM and TLM for a single sequence point, located by its Specific Tm (Spec[strand]) and Non-Specific Tm (NonSpec[strand])
Melting temp. of strand pairs calculated using a modified version of MELTING4* software
* N. Le Novere, "MELTING, computing the melting temperature of nucleic acid duplex," Bioinformatics, vol. 17, pp. 1226-1227, 2001
19
SEM and TLM Metrics – Stability and Accuracy
EXISTING (E): G-C-C / C-G-G
CANDIDATE 1 (C1): A-C-C / T-G-G
CANDIDATE 2 (C2): A-T-A / T-A-T
• Assume sequences E are present in system (sticky end)
• Need to add an additional sticky end
• Which is better: C1 or C2?
G-C pair stronger than A-T
C1 differs from E in only one base, so it can partly bind E's complement; C2 shares no base with E
- Set E + C1 has high SEM and low TLM
- High stability (melting temperature) but low accuracy (more defects)
- Set E + C2 has lower SEM and high TLM
- Lower stability but higher accuracy (fewer defects)
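The C1 versus C2 comparison can be made concrete with two toy proxies. The sketch uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C) as a stand-in for stability and the number of positions a candidate shares with E as a stand-in for cross-talk. Both proxies and all names are assumptions of this example, not the SEM and TLM calculations from the talk.

```python
def wallace_tm(strand: str) -> int:
    """Rule-of-thumb melting temperature: 2 per A/T, 4 per G/C."""
    s = strand.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))

def shared_positions(a: str, b: str) -> int:
    """Positions where a and b carry the same base.

    Each shared base would also pair with the complement of the other strand.
    """
    return sum(x == y for x, y in zip(a.upper(), b.upper()))

def compare(existing: str, candidates: dict) -> dict:
    """Map candidate name -> (stability proxy, cross-talk proxy)."""
    return {
        name: (wallace_tm(seq), shared_positions(seq, existing))
        for name, seq in candidates.items()
    }

print(compare("GCC", {"C1": "ACC", "C2": "ATA"}))
# {'C1': (10, 2), 'C2': (6, 0)}
```

C1 scores higher on the stability proxy but overlaps E in two of three positions, while C2 is weaker but does not overlap E at all, which is the trade-off the slide describes.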
20
Target 2: Flat Grid
- Corrugation design rule – alternate motif normals
Face UP
Face DOWN
21
Target 3: Optimize Assembly
- Thermal ordering design rules - hierarchy of sub-products
Figure: thermal groups ordered along the temperature axis, from 1st (high) through 2nd and 3rd to 4th (low)
22
Outline
1. DNA Basics
2. DNA Self-Assembly and DNA Motifs
3. Metrics and Design Rules
4. Implementation
5. Evaluation
6. Conclusion
23
Thermodynamic Optimization Software
• Optimize target design against TLM and SEM metrics given:
- Target topology
- Basic motif design
• Exhaustive thermodynamic search
- Creates detailed sequence interaction map
• Evaluate each sticky-end sequence
- Against all other candidate and motif sequences
- Map their mutual interaction
• High computational cost
- 4x4 grid: 6 CPU-years to design and verify
- Parallel implementation
Figure: interaction map of maximum Tm(A,B) in °C for each Strand A / Strand B pair. Strand function: 1 – 4 shell, 5 core, 6 – 9 arms.
24
Thermodynamic Interaction
• Calculate melting temperature for each interacting pair – used
for SEM, TLM metrics
• Modified nearest-neighbor algorithm using MELTING4* tool
- Energy contribution of each base also depends on neighbors
- Handles internal and terminal mismatches
• The number of interactions to evaluate depends on:
- Length of sticky-ends
- Extent of fixed motif sequences
Sticky-end length    Sticky-end interactions
5bp                  > 500,000
10bp                 > 500,000,000,000
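The orders of magnitude in the table are consistent with comparing every sequence of a given length against every other one. The sketch below assumes unordered pairs over the full 4-letter sequence space, including each sequence paired with itself; the talk does not state how its counts were derived, so this is one plausible reading.

```python
def sequence_space(length: int) -> int:
    """Number of distinct sequences of the given length over A, T, G, C."""
    return 4 ** length

def pairwise_interactions(length: int) -> int:
    """Unordered sequence pairs, including each sequence with itself."""
    n = sequence_space(length)
    return n * (n + 1) // 2

for length in (5, 10):
    print(f"{length}bp: {pairwise_interactions(length):,}")
# 5bp: 524,800
# 10bp: 549,756,338,176
```

Doubling the sticky-end length multiplies the work by about a million, which is why the extent of the search space dominates the computational cost.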
25
* N. Le Novere, "MELTING, computing the melting temperature of nucleic acid duplex," Bioinformatics, vol. 17, pp. 1226-1227, 2001
Design Automation Suite (DAS)
• Input: structure + motifs + sequence space
• Software runs thermodynamic analysis
- Generates metric-annotated DNA sequence sets
- Default: optimize for maximal TLM
- SF (Stability Factor): trade TLM for SEM
• Design rules are applied on candidate set
• Designer can customize set
- Corrugation and thermal ordering
• Output: sequence map files
- For order and manufacturing
26
Outline
1. DNA Basics
2. DNA Self-Assembly and DNA Motifs
3. Metrics and Design Rules
4. Implementation
5. Evaluation
• Target systems and methods
• Theoretical metric-based evaluation
• Experimental evaluation
6. Conclusion
27
Evaluation: target systems
• Large design - 96 sticky-ends set
- 4x4 grid with 16 motifs
• Small design – 20 sticky-ends set
- AB Polymer* with 4 motifs
A-Core, B-Core: two sets of fixed sequences
What is the impact of fixed sequences?
* H. Yan, S. H. Park, L. Feng, G. Finkelstein, J. H. Reif, and T. H. LaBean, "4x4 DNA Tile and Lattices: Characterization, Self-Assembly, and
Metallization of a Novel DNA Nanostructure Motif," in Proceedings of the Ninth International Meeting on DNA Based Computers (DNA9), 2003.
28
Impact of fixed sequences
- Accuracy in arm space with different fixed sequence sets
Figure: "Accuracy of the 5 bp arm space", accuracy (TLM) vs. arm identifier, for AB Cores, A-only, B-only, and No Core
A-only, B-only:
- same target structure as AB Cores
- use a single core/shells set (fewer fixed sequences)
More fixed sequences = fewer “good” arms to choose from
29
Evaluation: Sequence Design Methods
• DNA Design Automation Suite (DAS)
+ Thermodynamic analysis of all possible interactions
+ High flexibility, detailed sequence interaction map
- Computationally expensive
• Original 20-Arm design – based on existing sequence design tools*
- Text-distance primitive
- GC content as energy approximation
+ Fast but no result / local minimum for large seq. spaces
• Random sequence sets
- No guarantees of optimality
+ Fast
30
*N. C. Seeman, "De Novo Design of Sequences for Nucleic Acid Structural Engineering," Biomolecular Structure & Dynamics, vol. 8, pp. 573-581, 1990.
Sequence design evaluation – 20-Arm
• 20-Arm target system: upper bounds and original design
Figure: "SEM and TLM (20 Arm)", SEM and TLM for Upper Bound, DAS - B Core (SF=4), DAS - AB Split (SF=7), Original AB, and Random
DAS designs significantly improve both accuracy and stability
31
Sequence design evaluation – 96-Arm
• 96-Arm target system: B-only, AB Core and Random
B-only = highest accuracy (TLM)
Figure: Non-Specific Tm vs. Specific Tm with 1:1 diagonal, for Random, AB Cores, and B-Only
Figure: "SEM and TLM (96 Arm)", SEM and TLM for B-Only, AB Cores, and Random
32
Experimental results – 96-Arm target system
• Atomic Force Microscopy – 4x4 grids
• Molecular height map on flat plane
60nm
Sung Ha Park, Constantin Pistol, Sang Jung Ahn, John H. Reif, Alvin R. Lebeck, Chris Dwyer, Thomas H. LaBean - "Finite-size, Fully-Addressable DNA Tile Lattices Formed by Hierarchical Assembly Procedures", Angewandte Chemie No. 5/2006
33
Grid – no detectable curvature
• AFM height section shows flat grid
• Corrugation design rules successful
34
DNA scaffold – experimental validation
60nm
Trivia: The collection of books and manuscripts in the Library of Congress contains ~10^14 letters.
Manufacturing scale: ~10^14 letters/mL!
35
Conclusion
• Design automation for DNA-based self-assembled structures
• Step forward towards nano-scale device engineering
• Input: structure, motifs, sequence space
• Constraints:
- Optimized sequences for given target structures
- Self-assembly specific design rules
- Design trade-offs and metrics for quantitative design comparison
• Output: sequence map for manufacturing
36
Thank you!
37
Balanced designs
• Trade-off: accuracy for stability (TLM for SEM)
Figure: "SF Impact on SEM and TLM (20 Arms)", SEM and TLM plotted against SF (0 to 7) for the AB and B_Only designs
SF: Stability Factor – increases SEM, decreases TLM
38
SEM and TLM: Set Metrics
• Average of single sequence metrics
Tm[seq_i, seq_j] = melting temperature of seq_i and seq_j
comp[seq_i] = sequence complement of seq_i
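The set metrics can be sketched as averages over per-sequence values. SEM follows the definition given earlier (the melting temperature of each sequence with its complement). The transcript does not preserve the TLM formula, so the sketch assumes TLM is the gap between the specific Tm and the highest non-specific Tm. The `toy_tm` function is a hypothetical stand-in for the MELTING-based calculation, and all names are assumptions of this example.

```python
from statistics import mean

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
WEIGHT = {"A": 2, "T": 2, "G": 4, "C": 4}

def comp(seq: str) -> str:
    """Sequence complement, read in its own 5'->3' direction."""
    return "".join(PAIRS[b] for b in reversed(seq))

def toy_tm(a: str, b: str) -> int:
    """Toy Tm: weighted count of paired bases in full antiparallel alignment."""
    return sum(WEIGHT[x] for x, y in zip(a, reversed(b)) if PAIRS[x] == y)

def set_metrics(seqs, tm=toy_tm):
    """Return (SEM, TLM) averaged over all sequences in the set."""
    pool = set(seqs) | {comp(s) for s in seqs}
    sems, tlms = [], []
    for s in seqs:
        spec = tm(s, comp(s))
        # Highest Tm with any strand in the system other than the complement.
        nonspec = max(tm(s, other) for other in pool if other != comp(s))
        sems.append(spec)
        tlms.append(spec - nonspec)  # ASSUMPTION: TLM as specific minus non-specific
    return mean(sems), mean(tlms)

print(set_metrics(["GCC", "ATA"]))  # (9, 5)
```

Passing the Tm function as a parameter keeps the averaging logic separate from the thermodynamic model, so a nearest-neighbor calculation could replace the toy one without changing `set_metrics`.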
39