CSE 599 Lecture 4: DNA computing
 Cells process and store information
 DNA forms an instruction manual
for the chemical processes in a cell
 DNA stores hereditary information
passed from parents to offspring
 Information is encoded digitally as
nucleotide sequences in the DNA
Thanks to Chris Diorio and Doug Zonker
for some of the slides
Deoxyribonucleic acid (DNA) (a) inside,
and (b) outside, the cell nucleus
Used figures from “Understanding DNA”
by Calladine and Drew
R. Rao, Week 4: DNA computing
Digital Representation
 Rather than thinking “biology” or “molecule,” think in terms
of digital information storage using the alphabet A, T, G, and C
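A minimal sketch of the "digital storage" view: the Python below packs ordinary bytes into the four-letter alphabet at 2 bits per base and unpacks them again. The mapping (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for illustration; the slides do not prescribe a particular assignment.

```python
# Hypothetical mapping: any fixed assignment of 2-bit values to bases would work.
BASES = "ACGT"  # index 0..3 stands for the bit pairs 00, 01, 10, 11


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand whose length is a multiple of four back into bytes."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

With this mapping, one byte always becomes four bases, so a strand of n bases carries 2n bits.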
Biomolecular computing: Basic Idea
 A DNA strand encodes a quaternary (2-bits/base) string
 Can use molecular techniques to manipulate strings
 Synthesize, cut, splice, copy, replicate and read DNA molecules
 Separate and classify strings according to their size or content
 These processes are slow but massively parallel
 DNA for general-purpose digital computation
 Encode: Map problem onto DNA strands
 Exhaustive Search:
 Generate all possible solutions by subjecting strands
simultaneously to biochemical reactions
 Use molecular techniques to eliminate invalid solutions
 The result: Turing Universal DNA computing
DNA primer...
 DNA provides cells with long-term information storage
 Resides within cell nucleus
 Provides templates for protein manufacture
 mRNA is a temporary copy created from DNA
 Migrates out of nucleus into cytoplasm
 Ribosomes read mRNA to create proteins
 Assisted by tRNA
 Proteins perform cell functions
Components of DNA/RNA
 nitrogenous bases
 purines
 pyrimidines
 thymine in DNA
 uracil in RNA
 pentose sugar
 2-deoxyribose in DNA
 ribose in RNA
 phosphate group
Nucleotides
 base + sugar = nucleoside
 nucleoside + phosphate = nucleotide
 bases are linked into a chain by
alternating sugars and phosphates
 direction is significant
 read from 5’ end to 3’ end
DNA base pairing (Watson-Crick complementarity)
• Antiparallel strands form hydrogen bonds between bases
• Pairing of bases
• cytosine ↔ guanine (C-G)
• thymine ↔ adenine (T-A)
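As a small illustration of the pairing rule, the sketch below computes the Watson-Crick partner of a strand. Because the two strands are antiparallel, the partner read 5'→3' is the reversed complement. The function names are illustrative, not taken from the lecture.

```python
# Translation table implementing the pairing rules A-T and C-G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the Watson-Crick partner of `strand`, also written 5'->3'."""
    return strand.translate(COMPLEMENT)[::-1]


def can_pair(a: str, b: str) -> bool:
    """True if strands a and b (both written 5'->3') are perfectly complementary."""
    return reverse_complement(a) == b
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, and a palindromic site such as `GAATTC` is its own reverse complement.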
DNA double helix
 Cells are filled with water
 Sugar–phosphate group is hydrophilic
 Bases are hydrophobic
 DNA double-strand twists to shield bases from water
 10 – 12 bp/turn
 Forms a helix
 Human DNA strands can be 3cm long
 but only 20Å in diameter
DNA replication
“…it has not escaped our notice that the
specific pairing we have postulated
immediately suggests a possible copying
mechanism for the genetic material.”
- Watson & Crick
mRNA production from DNA
 Producing mRNA
1. Unwind section of DNA
2. Synthesize mRNA along the DNA template, catalyzed by RNA polymerase
 Base pairing
 40 bases/sec at 37° C
3. DNA reforms into double helix
4. mRNA leaves nucleus
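The transcription steps above can be sketched in a few lines of Python. The sketch assumes the template strand is supplied written 5'→3', so the mRNA is its reverse complement with uracil in place of thymine. The timing helper simply applies the 40 bases/sec figure quoted on the slide; both function names are hypothetical.

```python
# Base pairing used during transcription: template A pairs with U in the mRNA.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template: str) -> str:
    """Return the mRNA (5'->3') made from a DNA template strand written 5'->3'."""
    return template.translate(DNA_TO_RNA)[::-1]


def transcription_seconds(n_bases: int, bases_per_second: float = 40.0) -> float:
    """Idealized time to transcribe n_bases at the rate quoted on the slide."""
    return n_bases / bases_per_second
```

For instance, the template `CAT` yields the start codon `AUG`, and a 1200-base gene would take about 30 seconds at the quoted rate.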
Amino acids and Proteins
• 20 amino acids. Each is built around a central carbon atom bonded to:
• Amino (NH2) and carboxyl (COOH) groups
• A hydrogen atom and a variable side chain (in proline the side chain also bonds back to the amino nitrogen)
• Protein: A chain of amino acids
• Order of amino acids is primary structure
• Backbone folding gives secondary structure
mRNA allows protein synthesis
 Base triplets (codons) code for
amino acids
 Ribosomes serve as decoding
machines
 tRNA is the adapter molecule
 One end of tRNA carries an
amino acid
 Other end carries an anticodon
that matches codon on mRNA
strand
 When a ribosome finds a tRNA
with a matching anticodon,
amino acid is broken off and
attached to polypeptide chain
The genetic code
 Triplets code for amino
acids
 AUG signals start of
translation
 UAA, UAG, UGA signal
end of translation
 Redundancy: Many triplets
may code for 1 amino acid
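The decoding rules on this slide can be sketched as a table lookup. The code below builds the standard genetic code from a compact 64-letter string (one-letter amino acid codes, `*` for stop, codons enumerated in U, C, A, G order), then translates from the first AUG to the first stop codon. The function name and the "start at the first AUG" simplification are assumptions for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the strand."""
    start = mrna.find("AUG")
    if start < 0:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

The redundancy mentioned above is visible in the table: 64 codons map onto 20 amino acids plus stop, and leucine (`L`) alone is encoded by six codons.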
A List of Molecular Techniques and Tools
 Separating and fusing DNA strands: Denaturation (melting)
and Hybridization (also called annealing or renaturation)
 Amplifying DNA: PCR (polymerase chain reaction)
 Shortening and Cutting DNA (based on exonucleases and
endonucleases)
 Determining the length of DNA (gel electrophoresis)
 Reading the contents of DNA (DNA sequencing)
Denaturation and Hybridization
 Double helix can be denatured
by heating (85-95 degrees C)
 Denaturing is reversible by
cooling (renaturing)
 Called hybridization when DNA
is from different sources (e.g.
DNA and RNA)
 The ability of two nucleic acid
preparations to hybridize is a
precise test for Watson-Crick
complementarity of their base
sequences
DNA melting from heating
PCR amplifies DNA
 PCR: Polymerase chain reaction
 Polymerase: Enzyme that adds
nucleotides to an existing DNA
strand in the 5’-3’ direction
 Amplifies short segments of DNA
 Doubling in ~5 minutes
 Segment must be bracketed by
known primer sequences
 ~20 bases
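A rough sketch of the PCR slide in Python. It models two things under idealized assumptions: which segment gets amplified (the region bracketed by a forward primer and a reverse primer), and how many copies exist after a given time if every cycle doubles the count. Real reactions are less than 100% efficient; all names and the perfect-doubling model are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    return strand.translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the segment of `template` bracketed by the two primers, or None.

    `forward` matches the template itself; `reverse` binds the opposite strand,
    so its reverse complement is what appears in `template`.
    """
    start = template.find(forward)
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


def pcr_copies(initial: int, minutes: float, doubling_minutes: float = 5.0) -> int:
    """Copies after `minutes`, assuming perfect doubling every `doubling_minutes`."""
    cycles = int(minutes // doubling_minutes)
    return initial * 2 ** cycles
```

Under the perfect-doubling assumption, one hour at ~5 minutes per doubling gives 12 cycles, that is 4096 copies per starting molecule.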
Enzymes that shorten or cut DNA
 Exonucleases shorten DNA
 Remove nucleotides one at a
time from the ends of DNA
molecules. E.g. Exonuclease III
removes nucleotides from the
two 3’ ends
 Endonucleases cut DNA
 E.g. Restriction enzymes such as
EcoRI recognize a short
sequence of DNA and cut the
molecule at that site
 Recognition site typically 4-6
bases (e.g. GAATTC)
 Sticky ends – overhanging ends
of DNA available for bonding
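A minimal sketch of a restriction digest on one strand, using the EcoRI site from the slide. EcoRI cuts between the G and the first A of GAATTC, which is what leaves the AATT overhang (sticky end). The single-strand string model and the function name are simplifications introduced here.

```python
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # EcoRI cuts after the G: G^AATTC


def digest(strand: str, site: str = ECORI_SITE, offset: int = ECORI_CUT_OFFSET):
    """Cut `strand` (5'->3') at every occurrence of `site`; return the fragments."""
    fragments = []
    start = 0
    pos = strand.find(site)
    while pos != -1:
        fragments.append(strand[start:pos + offset])
        start = pos + offset
        pos = strand.find(site, pos + 1)
    fragments.append(strand[start:])  # remainder after the last cut
    return fragments
```

Every fragment downstream of a cut begins with `AATT`, the overhang that can later re-anneal with a matching sticky end.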
Gel electrophoresis determines length
 DNA molecules are negatively
charged
 Place DNA on gel in electric field
 DNA molecules drift through gel
toward positive electrode
 Small molecules move faster
through gel than large ones
 Deactivate field when first
molecules reach positive electrode
 Determine lengths by comparing
distance of a sample with distance
traveled by control fragments with
known lengths
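The comparison against control fragments can be sketched as an interpolation. The code assumes, as a common rule of thumb rather than anything stated on the slide, that the logarithm of fragment length falls roughly linearly with migration distance between neighboring control bands. The function name, units, and ladder format are hypothetical.

```python
import math


def estimate_length(distance: float, ladder) -> float:
    """Estimate fragment length (bases) from its migration distance.

    `ladder` is a list of (distance, known_length) pairs for control fragments.
    Interpolates log10(length) linearly between the two bracketing controls.
    """
    points = sorted(ladder)
    for (d0, len0), (d1, len1) in zip(points, points[1:]):
        if d0 <= distance <= d1:
            t = (distance - d0) / (d1 - d0)
            log_len = math.log10(len0) + t * (math.log10(len1) - math.log10(len0))
            return 10 ** log_len
    raise ValueError("distance lies outside the range covered by the ladder")
```

With controls at 10 mm (1000 bases) and 30 mm (100 bases), a band at 20 mm is estimated at roughly 316 bases, reflecting that smaller molecules travel farther.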
Sequencing DNA
• Break copies of the DNA at occurrences of a known site
• E.g. break DNA strands at G
• Determine the lengths of the broken strands
• Do this also for A, T, and C
• Electrophorese the results on one gel
• Read out the sequence from right to left
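An idealized sketch of the read-out logic: if we know, for each base, the lengths of the fragments produced by breaking at that base, sorting all fragments by length recovers the sequence. This ignores the chemistry (labeling, incomplete reactions, whether the broken base stays on the fragment); the function names and the "length = position of the break" convention are assumptions.

```python
def fragment_lengths(strand: str, base: str):
    """Lengths of fragments from the labeled end to each occurrence of `base`."""
    return [i + 1 for i, b in enumerate(strand) if b == base]


def read_gel(lanes) -> str:
    """Reconstruct the sequence from a dict mapping base -> fragment lengths.

    Each length identifies one position; ordering the bands by length
    (that is, by distance traveled on the gel) spells out the sequence.
    """
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)
```

For example, breaking `GATTACA` at A yields fragments of length 2, 5 and 7; combining the four lanes and sorting restores `GATTACA`.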
DNA computing
 Field started by Leonard M. Adleman (USC)
 Used DNA strands and molecular techniques to solve a simple
Hamiltonian path problem:
 Find a path that visits all vertices once and only once
 Nov. '94 issue of Science magazine
 Molecular Computations of Solutions to Combinatorial Problems
 Laboratory experiment
 Constructed DNA molecules representing the possible solutions
to a 7-vertex instance of the Hamiltonian path problem
 Details: see copy of paper that was handed out in class
The computational premise
 Construct a DNA molecule for each potential solution
 Generate candidate solutions in parallel
 Use molecular operations to eliminate invalid solutions
 Five basic operations
 Extract: Separates 1 DNA tube into:
 One tube with all molecules containing a particular substring
 Another with the remaining molecules
 Merge: Mixes two tubes
 Detect: Checks if there are any DNA strands in a tube
 Copy: Amplifies the strands in a tube
 Append: Attaches a string to the end of every molecule in a tube
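The five basic operations can be mimicked in software by treating a test tube as a multiset of strings. The sketch below uses Python's `collections.Counter` for the tube; this is an illustrative model with made-up function signatures, not the interface of the Strand class library used in the homework.

```python
from collections import Counter


def extract(tube: Counter, substring: str):
    """Split a tube into (strands containing substring, remaining strands)."""
    hits = Counter({s: n for s, n in tube.items() if substring in s})
    rest = Counter({s: n for s, n in tube.items() if substring not in s})
    return hits, rest


def merge(tube_a: Counter, tube_b: Counter) -> Counter:
    """Pour two tubes together."""
    return tube_a + tube_b


def detect(tube: Counter) -> bool:
    """Report whether the tube contains any strand at all."""
    return sum(tube.values()) > 0


def copy(tube: Counter, factor: int = 2) -> Counter:
    """Amplify every strand in the tube by `factor`."""
    return Counter({s: n * factor for s, n in tube.items()})


def append(tube: Counter, suffix: str) -> Counter:
    """Attach `suffix` to the end of every strand in the tube."""
    out = Counter()
    for s, n in tube.items():
        out[s + suffix] += n
    return out
```

A filtering step is then a chain of these calls, for example `hits, rest = extract(tube, "CG")` followed by `detect(hits)`.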
Adleman’s DNA-based encoding of graphs
[Figure: input graph and its DNA encoding]
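Since the figure is not reproduced here, the following sketch shows one way to read the encoding described in Adleman's paper: each vertex gets a random 20-base strand, each edge i→j is the last 10 bases of vertex i followed by the first 10 bases of vertex j, and the complement of each vertex strand acts as a splint that holds consecutive edge strands together. The special handling of the start and end vertices is omitted, and all function names are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    return strand.translate(COMPLEMENT)[::-1]


def random_oligo(rng: random.Random, length: int = 20) -> str:
    return "".join(rng.choice("ACGT") for _ in range(length))


def encode_graph(vertices, edges, seed: int = 0):
    """Return (vertex strands, edge strands, splint strands) for a directed graph."""
    rng = random.Random(seed)
    vertex = {v: random_oligo(rng) for v in vertices}
    # Edge i->j: second half of vertex i followed by first half of vertex j.
    edge = {(a, b): vertex[a][10:] + vertex[b][:10] for a, b in edges}
    # Splints are complementary to the vertex strands and bridge two edge strands.
    splint = {v: reverse_complement(s) for v, s in vertex.items()}
    return vertex, edge, splint
```

The design choice to split each vertex strand across two edge strands is what makes only edge sequences that form valid paths ligate: the end of edge 0→1 and the start of edge 1→2 together spell out vertex 1, which its splint can bind.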
Basic Steps in Adleman’s Algorithm
• Input: A directed graph G with n vertices, a start vertex v_in and a stop vertex v_out
• Step 1: Generate paths in G randomly in large quantities
• Step 2: Reject all paths that do not begin with v_in and end in v_out
• Step 3: Reject all paths that do not involve exactly n vertices
• Step 4: For each vertex v, reject all paths that do not involve v
• Output: “Yes” if any path remains, “No” otherwise
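The four steps can be mirrored by a software simulation in which random walks stand in for the random ligation of strands. This is a sketch under stated assumptions: vertices are numbered 0..n-1, the sample count and the 0.9 continuation probability are arbitrary choices, and, as in the lab, a "No" answer is only as reliable as the random generation step is thorough.

```python
import random


def adleman_search(n, edges, v_in, v_out, samples=5000, seed=0):
    """Return the surviving paths (as tuples); an empty list means "No"."""
    rng = random.Random(seed)
    successors = {v: [] for v in range(n)}
    for a, b in edges:
        successors[a].append(b)

    # Step 1: generate many random paths. Paths longer than n vertices are
    # not generated because Step 3 would discard them anyway.
    paths = set()
    for _ in range(samples):
        v = rng.randrange(n)
        path = [v]
        while successors[v] and len(path) < n and rng.random() < 0.9:
            v = rng.choice(successors[v])
            path.append(v)
        paths.add(tuple(path))

    # Step 2: keep only paths that begin with v_in and end in v_out.
    paths = {p for p in paths if p[0] == v_in and p[-1] == v_out}
    # Step 3: keep only paths with exactly n vertices.
    paths = {p for p in paths if len(p) == n}
    # Step 4: for each vertex, keep only paths that visit it.
    for v in range(n):
        paths = {p for p in paths if v in p}
    return sorted(paths)
```

Steps 3 and 4 together force every vertex to appear exactly once, which is why anything left in the final set is a Hamiltonian path.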
Adleman’s experiment
• Took 7 days to solve the 7-vertex problem, mainly due to laboratory-related set-up time; robotic manipulators could speed things up
 All steps are amenable to molecular implementation
 Related Problems:
 SAT: Solution proposed by Lipton
 Created a directed graph whose paths correspond to all possible
Boolean assignments of variables
 Search paths for a satisfiable assignment according to the
structure of input formula
 Cracking the DES (data encryption standard)
 Search for correct 56-bit key given (plaintext, cryptotext) pairs
 Not done yet: at 1 operation/hour, requires 9 months
Reasons to be optimistic...
 DNA computing is orders of magnitude more energy and
density efficient than digital computers
 Employs massive parallelism
 Field is only 7 years old, so many untried paths
 Example: Use the structure of DNA and proteins to compute?
 Living cells hold many secrets
 Copy their information-processing approaches
 Possibly use living cells in computing systems
 DNA can form self-assembling structures
 Analogous to cellular automata
Reasons to be pessimistic...
• Generate-and-test approach requires one strand of DNA for each candidate solution
• 2^70 DNA strands of length 1000 is 8 kilograms
 DNA processing is slow and error prone
 1 hour per reaction
 Approximate matches and mutations may give incorrect results
 Need to learn to build reliable computers from noisy components
 No communication between strands
 No easy way to determine if a tube contains two identical strands
 No killer app has been identified yet
Homework Assignment (due in two weeks)
• Solve the SUBSET SUM problem using DNA computing
• SUBSET SUM: Given a set of N positive integers S_1, S_2, …, S_N and a positive integer T (the "target"), is there some subset of these integers S_i (with possible repetitions) that sums exactly to T?
 Examples:
 Input: S = { 2, 4, 6, 8, 10 }, T = 12; Answer: Yes
 Input: S = { 2, 4, 6, 8, 10 }, T = 13; Answer: No
 Input: S = { 1, 5, 4, 2, 7, 2, 12, 19, 17}, T = 42; Answer: Yes
 (T = 1 + 5 + 5 + 2 + 12 + 17 or 1 + 1 + … + 1)
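For checking answers, a conventional (non-DNA) reference solver is handy. The sketch below is an ordinary dynamic-programming test for SUBSET SUM with repetitions allowed, as in the problem statement; it is not a DNA-computing solution and the function name is made up for this example.

```python
def subset_sum_with_repetition(values, target: int) -> bool:
    """True if some multiset drawn from `values` sums exactly to `target`."""
    reachable = [False] * (target + 1)
    reachable[0] = True  # the empty selection sums to 0
    for total in range(1, target + 1):
        # `total` is reachable if removing one allowed value leaves a reachable sum.
        reachable[total] = any(
            v <= total and reachable[total - v] for v in values
        )
    return reachable[target]
```

Running it on the three example inputs reproduces the answers listed above (Yes, No, Yes), so it can serve as a cross-check for a simulated DNA solution.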
Homework Assignment (cont.)
 Three parts:
 Encode a given problem instance using DNA strands
 List the steps that will allow you to extract an answer
 Implement your idea using the Strand software package for high
level simulation of DNA computing
 Strand C++ Class Library:
 Simulates the creation of DNA strands
 Basic representation: short strand of DNA = an “Element”
 Does not use individual bases or base sequences
 High level simulation of typical operations performed in DNA
computing: melt, anneal, cut, detect, extract, remove, pour, append,
read, and length.
 Documentation: http://www.lut.fi/~kyrki/dna/doku.html
Next Week: Fundamentals of Neurobiology
 No homework due next week.
 Read the on-line articles for additional information on DNA
computing
 Download and test the simulator using sample programs for
the Hamiltonian path and SAT problems
 Contact TA or instructor if you have any questions or
problems regarding the DNA computing assignment
 Have a great weekend!