Download No Slide Title - Department of Electrical Engineering and Computing

ECECS 819: lecture 1—Introduction
Computational aspects of biological
systems
1
Biology—Macro and Micro Elements
E. coli
DNA
protein
E. coli chromosome
An amino acid (alanine)
2
Biosystem: an “information processing
system”
•“sensor” / “processor” / “actuator”
•Self-repairing
•Stores information
•Can interact with other systems (e.g.,
use of nerve signals to activate
devices)
•May be a “community” (e.g., coral,
fungus)
3
Goal 1: Use “micro” elements as
information processing / storage
devices: “biomolecular computers”
4
Goal 2: Use computation to understand
biomolecular systems
5
Why Do We Need to Learn About
Biomolecular Computing?
Reason 1: “the disappearing transistor”
3 µm (lambda)
1.5 µm (lambda)
0.5 µm (lambda)
•By 2020, “gate” will be only one atom large
[Keyes, IBM]
• Candidate “new” technologies:
+quantum computing
+biomolecular computing
6
Relative sizes (in meters):
10^-18: electron
10^-15: proton, neutron
10^-14: atomic nucleus
10^-10: water molecule (angstrom)
10^-9: one DNA “twist” (nanometer, nm)
10^-8: wavelength of UV light
10^-7: thickness of cell membrane
0.18 or 0.13 µm: Pentium 4 wire width
10^-6: diameter of typical bacterium (micron, µm)
10^-5: diameter of typical cell
2-10 µm: typical MEMS feature size
10^-4: width of human hair
10^-3: diameter of sand grain (millimeter, mm)
10^-2: diameter of nickel (centimeter, cm)
35 mm: one side of Pentium 4 chip
10^0: 1 meter
(The slide also marks the ranges for “nanotechnology” and for molecules and atoms.)
7
Why Do We Need to Learn About
Biomolecular Computing?
Reason 2: a host of potential applications
•medical: diagnosis / treatment delivery / prosthetics
•lab diagnostics: health care / forensics / drug development
8
Why is biomolecular computing attractive?
•Size:
--typical bacterium has diameter on the order of 10^-6 m (1
micron);
--one twist of DNA double helix is on the order of 10^-9 m
(nanometer scale)
•Power requirements should be low
•Massive parallel computation is theoretically possible
•I/O can be two-dimensional
•Instabilities of quantum systems are much less of a problem
here
9
What are the disadvantages?
•Speed--typical reaction can take hours or days
•Error rates--may be unacceptably high; may be introduced by
mechanical steps in processing data
•I/O--we do not yet have efficient mechanisms for doing
input/output with these systems
•“Herd” property--we can affect a mixture of data items; we
cannot in general pick out one specific item; biomolecular
computing is inherently parallel
•Exponential growth in size of computation--it may be that the
speed barrier in traditional computing is replaced by a size
barrier in biomolecular computing--we may need too much
biological material to solve a reasonable sized problem for the
“computation” to be feasible
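The size barrier can be made concrete with rough arithmetic. The sketch below is only an illustration under stated assumptions: it supposes a brute-force search that needs at least one strand per ordering of n vertices (n! candidates), 20 nucleotides per vertex, and an average mass of about 330 g/mol per nucleotide of single-stranded DNA. The function names are hypothetical, and real experiments need many copies of each strand, so this is a lower bound.

```python
import math

AVOGADRO = 6.022e23           # molecules per mole
GRAMS_PER_MOL_PER_NT = 330.0  # assumed average mass of one ssDNA nucleotide

def strand_mass_grams(num_strands: int, nt_per_strand: int) -> float:
    """Mass of num_strands single strands, each nt_per_strand nucleotides long."""
    return num_strands * nt_per_strand * GRAMS_PER_MOL_PER_NT / AVOGADRO

def brute_force_mass(n_vertices: int, nt_per_vertex: int = 20) -> float:
    """Mass needed for one copy of every vertex ordering (assumption: n! candidates)."""
    candidates = math.factorial(n_vertices)
    return strand_mass_grams(candidates, n_vertices * nt_per_vertex)

if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "vertices:", brute_force_mass(n), "grams")
```

Under these assumptions the required mass is negligible for 10 vertices but grows factorially, which is the sense in which a speed barrier is traded for a size barrier.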
10
Major drawback: typical engineers “don’t
know much about biology….”
•Biology is traditionally descriptive, rather than
computational (HUGE vocabulary)
•Biomolecular processes are incredibly complex and
many are not well understood
•Field is changing rapidly
•There are multiple paradigms for computing
available
11
Also, there are many different subfields:
bioinformatics: the application of computer technology to
the management of biological information
biomolecular computing: the use of biological and
chemical processes to perform computations
bio-inspired computing: the use of biological paradigms
(e.g., neural nets, genetic algorithms) in the design of
computational algorithms. Algorithms may be
implemented in any appropriate technology
neurocomputing: direct I/O from biological system;
interfacing directly with nervous system; currently using
traditional analog computing
12
And many computing paradigms:
DNA computing--uses physical structure of DNA
in vivo computing--uses biological processes,
e.g., protein synthesis, to perform computations
in silico computing--“traditional” computing; often
used to refer to programs that attempt to simulate
living organisms; sometimes referred to as “bioSpice”
13
So how can we get started?
Some important basic terms
(good reference: Brown, Genomes, Wiley-Liss, 1999):
14
•genome: biological information in an organism
•DNA: deoxyribonucleic acid, carries genome of cellular
lifeforms
•RNA: ribonucleic acid, carries genome of some viruses,
carries messages within the cell
•bases: the four bases found in DNA are
adenine (A), cytosine (C), guanine (G),
and thymine (T); in a “double helix” of DNA,
bonds are always A--T or C--G; thus a single
strand of DNA carries the information about
the strand it would bond to
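The pairing rule can be written as a small lookup. This is a minimal sketch: the function names are illustrative, and the reversal in `reverse_complement` reflects the fact that the two strands of a double helix run in opposite directions.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Base-by-base partner of each position, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Partner strand written in its own 5'-to-3' direction (paired strands are antiparallel)."""
    return complement(strand)[::-1]

if __name__ == "__main__":
    example = "ATTGCCAAGAAT"  # arbitrary example sequence
    print(complement(example))
    print(reverse_complement(example))
```

Applying `reverse_complement` twice returns the original strand, which is one way to see that a single strand carries the information about its partner.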
15
DNA—the “double helix”
16
•polynucleotide: a single DNA strand
•oligonucleotide: short, single-stranded DNA molecule,
usually less than 50 nucleotides in length
In DNA computing, specific oligonucleotides are constructed
to represent data items.
•nucleotide: phosphate group + sugar + one of the 4 bases
(A,C,G,T): the end of a strand with the free phosphate group is labeled 5', the end with the free sugar hydroxyl group, 3'
Example: in Adleman's seminal 1994 paper,
oligonucleotides of length 20 were built to represent vertices
and edges in a given graph:
[Figure: example oligonucleotides for Vertex V1, Vertex V2, and Edge V1-V2]
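One way to picture the encoding is sketched below. It assumes the commonly described scheme in which each vertex gets a random 20-base sequence and the edge from Vi to Vj is built from the second half of Vi followed by the first half of Vj, so that the complement of a vertex sequence can hold two consecutive edges together. The sequences and helper names here are illustrative, not the ones used in the paper.

```python
import random

def random_oligo(length: int, rng: random.Random) -> str:
    """Random sequence over the four DNA bases (illustrative, not a designed sequence)."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def edge_oligo(v_from: str, v_to: str) -> str:
    """Edge = second half of the source vertex + first half of the target vertex."""
    half = len(v_from) // 2
    return v_from[half:] + v_to[:half]

rng = random.Random(0)  # fixed seed so the example is repeatable
vertices = {name: random_oligo(20, rng) for name in ("V1", "V2")}
edge_v1_v2 = edge_oligo(vertices["V1"], vertices["V2"])

if __name__ == "__main__":
    print("V1     ", vertices["V1"])
    print("V2     ", vertices["V2"])
    print("V1->V2 ", edge_v1_v2)
```

Because an edge shares ten bases with each of its endpoints, only edges that meet at a common vertex can be joined into a longer strand.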
17
What interesting projects can build on our
knowledge of traditional computer
engineering?
• “structural” designs—DNA computing
• “chemical” designs—using proteins as signals
18
DNA computing (“structural”, “digital”)
Possible operations on DNA:
•building up custom oligonucleotide sequences to
represent parts of your data
•splitting--can be done by heating, e.g.
•recombining--can be done by cooling
•cutting strand at a particular site
•“sticking” two fragments together (at their ends)
•sorting by some string property (including length)
19
So-----DNA computing:
•uses structure of the DNA
•relies on mechanical operations
•answers “self-assemble”
•basic steps:
•encode the problem
•make a “solution” of problem fragments
•cool the solution so fragments will form longer strands
•filter out the answers you want
20
Example: solving graph problems
[Figure: DNA fragments encoding the vertices and edges of a graph]
•Encode vertices and edges—use DNA properties to
encode graph “structure”
•Mix up a solution of your fragments
•Cool down, get resulting “paths”, “spanning trees”,
etc.
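The encode / mix / cool / filter recipe can be imitated in software. The sketch below is an in silico stand-in, not a model of the chemistry: random walks along edges play the role of self-assembly, and a filter keeps only walks that are Hamiltonian paths between two chosen vertices. The graph, the sample count, and the simplification that every walk starts at the start vertex are assumptions made for illustration.

```python
import random

def random_path(edges, start, max_len, rng):
    """Grow one path by repeatedly picking a random outgoing edge (stand-in for self-assembly)."""
    path = [start]
    while len(path) < max_len:
        successors = [b for (a, b) in edges if a == path[-1]]
        if not successors:
            break
        path.append(rng.choice(successors))
    return path

def filter_hamiltonian(paths, vertices, start, end):
    """Keep paths that start and end correctly and visit every vertex exactly once."""
    wanted = set(vertices)
    return {
        tuple(p) for p in paths
        if p[0] == start and p[-1] == end
        and len(p) == len(wanted) and set(p) == wanted
    }

def solve(vertices, edges, start, end, samples=10000, seed=0):
    """Generate many random candidate paths, then filter out the answers."""
    rng = random.Random(seed)
    paths = [random_path(edges, start, len(vertices), rng) for _ in range(samples)]
    return filter_hamiltonian(paths, vertices, start, end)

if __name__ == "__main__":
    vertices = [1, 2, 3, 4]
    edges = [(1, 2), (2, 3), (3, 4), (1, 3), (2, 4), (3, 2)]
    print(solve(vertices, edges, start=1, end=4))
```

The filtering step mirrors the lab procedure in spirit: candidates are produced in bulk without control over any single one, and the answers are whatever survives the selection.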
21
“Standard cell architectures, FPGAs”
Basic idea (after Prof. Tom Knight, MIT):
•“gates” are functional units
•Ends of gates are standard “join” DNA
sequences—reserved for this purpose
•So we can build computational chains easily
22
Other applications of DNA computing:
•general computing using “sticker” language
•study of relationship between traditional architectures
and DNA configurations:
---FSMs-linear DNA
---stack machines--branching DNA
---“Turing machines” (general purpose computers)-sheet DNA
23
Other applications of DNA computing
(continued):
•3-D self-assembled structures:
•“walking and rolling DNA”:
•structures for nanotube assembly: (recently reported in
Science)
24
in vivo computing (“chemical” / “analog”):
uses processes within the cell (e.g., E. coli) as signals
model is closer to traditional computing, with electrical
signals replaced by chemical signals
many processes we would like to use are not well
understood
requires in silico computing to generate simulations of
biomolecular processes, similar to SPICE simulations in
traditional electrical circuits
this is a new and rapidly growing field with many potential
practical applications
25
“central dogma”:
DNA ----> RNA-----> protein
we can use the presence or absence of the protein to
indicate “1” or “0”
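A toy version of the DNA to RNA to protein chain is sketched below. It is heavily simplified: transcription is modeled as replacing T with U on the coding strand, the codon table contains only the handful of standard codons needed for the example, and `logic_level` is a hypothetical helper for the "presence = 1, absence = 0" convention.

```python
# Tiny excerpt of the standard genetic code; None marks a stop codon.
CODONS = {"AUG": "Met", "GCU": "Ala", "UUU": "Phe", "GGC": "Gly", "UAA": None}

def transcribe(coding_dna: str) -> str:
    """DNA -> RNA, modeled as T -> U on the coding strand (a simplification)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> list:
    """RNA -> protein: read codons three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]  # KeyError for codons outside the toy table
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein

def logic_level(protein_present: bool) -> int:
    """Presence of the protein reads as 1, absence as 0."""
    return 1 if protein_present else 0

if __name__ == "__main__":
    protein = translate(transcribe("ATGGCTTTTGGCTAA"))
    print(protein, "->", logic_level(len(protein) > 0))
```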
26
•Protein: like DNA, a protein is a
linear polymer. It is made of
units which are amino acids.
Proteins are very complex and
not completely understood.
Proteins have four levels of
structure:
•primary: the amino acids
bonded together
•secondary: typically either an
“alpha-helix” or a “beta-sheet”
•tertiary: formed from folding of
the secondary structure into a
three-dimensional configuration
•quaternary: formed when several
folded (tertiary-structure) units
assemble into the complete protein
27
Some proteins:
http://www.biochem.szote.u-szeged.hu/astrojan/protein2.htm
28
•Central Dogma:
Before the discovery of
retroviruses and prions, this was
believed to be the basic
mechanism of inheritance in all
living things
29
•Plasmid: a “loop” of DNA used to introduce new
genetic material into a cell
•used for “genetic engineering”
•typically plasmid will also have
a section which ensures it will
have resistance to a particular
antibiotic; after insertion into
cell, this will provide a
marker to show that the new
DNA really has been
inserted
30
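One possible simple mechanism:
[Figure: input RNA is translated into protein B (input), which inhibits the promoter of a gene on the DNA; when the promoter is not inhibited, the gene is transcribed into output RNA, which is translated into protein A (output, detected by fluorescence)]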
Summary:
• 0 input --> output protein A (1);
• 1 input (RNA) ---> 0 output
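The inverting behavior in the summary can be sketched with a repression curve. The Hill-type function below is a standard textbook form chosen for illustration, not the model from the slides; `max_output`, `k_half`, `hill_n`, and the 0.5 threshold are all hypothetical values.

```python
def inverter_output(input_level: float, max_output: float = 1.0,
                    k_half: float = 0.5, hill_n: float = 2.0) -> float:
    """Steady-state output of a repressed gene: high with no input, low with strong input."""
    return max_output / (1.0 + (input_level / k_half) ** hill_n)

def to_bit(level: float, threshold: float = 0.5) -> int:
    """Interpret a concentration as a logic value (threshold is an assumption)."""
    return 1 if level >= threshold else 0

if __name__ == "__main__":
    for input_level in (0.0, 0.25, 0.5, 1.0):
        out = inverter_output(input_level)
        print(input_level, "->", round(out, 3), "bit", to_bit(out))
```

With these illustrative parameters, an input of 0 yields an output read as 1 and an input of 1 yields an output read as 0, matching the truth table of an inverter.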
31
Analogy to Electrical Inverter
32
Bio-Inverter Model [Weiss 1999]
33
Deterministic Vs Stochastic Model
• Deterministic Model
– Inverter modeled using a set of differential equations with deterministic variables.
– No random components.
– Fixed order for reactions.
• Stochastic Model
– Accounts for the random noise components.
– Simulations under different environmental conditions and other random noise variables.
– Random order for reactions.
34
Deterministic Simulation
35
Deterministic Simulation
Transient Characteristics (Matlab)
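A deterministic transient simulation of this kind can be sketched in a few lines. The two-equation model below (mRNA and protein, with a Hill-type repression term) and all of its rate constants are assumptions chosen for illustration; they are not the reaction set or the rates used for the plots on the slides. Integration uses the forward Euler method for simplicity.

```python
def simulate(input_level: float, t_end: float = 200.0, dt: float = 0.01,
             k_tx: float = 1.0, k_tl: float = 1.0,
             d_m: float = 0.5, d_p: float = 0.1,
             k_half: float = 0.5, hill_n: float = 2.0):
    """Forward-Euler integration of
         dm/dt = k_tx / (1 + (input/k_half)^n) - d_m * m   (mRNA)
         dp/dt = k_tl * m - d_p * p                        (output protein)
    Returns a list of (time, mRNA, protein) samples."""
    m = p = 0.0
    trace = []
    steps = int(round(t_end / dt))
    production = k_tx / (1.0 + (input_level / k_half) ** hill_n)  # constant for a fixed input
    for step in range(steps):
        dm = production - d_m * m
        dp = k_tl * m - d_p * p
        m += dm * dt
        p += dp * dt
        trace.append(((step + 1) * dt, m, p))
    return trace

if __name__ == "__main__":
    for level in (0.0, 5.0):
        t, m, p = simulate(level)[-1]
        print("input", level, "-> protein at t =", round(t, 1), "is", round(p, 3))
```

Because there are no random terms, repeated runs with the same parameters give the same trace, and the output settles at the level where production and decay balance.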
36
Deterministic Simulation
Transient Characteristics (VHDL-AMS)
37
Deterministic Simulation Modified
Transient Characteristics
• The transient characteristics of the inverter are computed using
the modified reaction rates.
• The steady state output value has doubled since the
transcription rate is doubled (k7*2).
• The rise time of the output has decreased to about 30 seconds and
the rise and fall times are equal.
• The reduction of repression rate and the dissociation rate
increase are the reasons for the decrease of the rise time.
38
Deterministic Simulation
Modified Transient Characteristics (Matlab)
39
Stochastic Simulation
• Stochastic simulation based on Gillespie algorithm
[Gillespie 1977].
• Two random variables (time and the type of reaction)
were introduced.
• In biology, the cell reaction occurs at random
intervals of time.
• The reactions do not occur in order and are random.
• Temperature fluctuations, decay rates and other
parameters also result in random noise.
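The two random variables mentioned above (when the next reaction fires, and which reaction it is) are the core of the Gillespie approach. The sketch below applies it to a deliberately small system, production and degradation of a single protein, rather than to the full inverter; the rate constants and function name are illustrative assumptions.

```python
import random

def gillespie_birth_death(k_prod: float = 10.0, k_deg: float = 0.1,
                          t_end: float = 100.0, seed: int = 0):
    """Stochastic simulation of  0 -> P (rate k_prod)  and  P -> 0 (rate k_deg * n)."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of production
        a_deg = k_deg * n        # propensity of degradation
        a_total = a_prod + a_deg
        # Random variable 1: waiting time to the next reaction (exponential).
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Random variable 2: which reaction fires, weighted by propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

if __name__ == "__main__":
    times, counts = gillespie_birth_death()
    print("events:", len(times) - 1, "final count:", counts[-1])
```

With these rates the molecule count fluctuates around k_prod / k_deg instead of settling at a single value, which is the kind of noise a deterministic model leaves out.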
40
Stochastic Simulation
41
Some areas to explore:
• Stochastic simulation—design space exploration
– Similar to CAD tool development for digital and analog circuits
– Currently trying simulated annealing, genetic algorithms
– Many other strategies can be explored
– Will also have applications in medical research
• Agent-based modeling and visualization
– 3D modeling and dynamic simulations using object-oriented
programming
• Engineering design process for biomolecular computing
applications
– Will modify traditional design flows for software, digital, and analog
circuits
– Will provide support to circuit designers and biomedical researchers
• Development of DNA “standard cells”
42