Discovering Structural Models
Lecture 19
Structural Models in Science
Structural models encode the spatial relationships among
the components of some physical object.
Examples of structural models include
• Bohr’s model of the atom;
• Watson and Crick’s double helix model of DNA;
• the composition and organization of molecules; and
• the geological strata of a particular region.
The discovery of structural models often serves as a first
step toward explanation, moving beyond descriptive knowledge.
The computational discovery of structural models has
explored both historical cases and current needs.
Structural Discovery Systems
There are several types of structural discovery systems,
but Valdes-Perez (1993) provided a unified view of one set.
• STAHL discovers chemical compounds from reactions.
• DALTON discovers atomic models from chemical
reactions.
• MECHEM discovers chemical reaction pathways.
• GELL-MANN discovers the structure of elementary
particles.
• BR-3 discovers the properties of elementary particles.
• MENDEL discovers genotype interactions from phenotypes.
One can view the search space of each system as a set of
matrices that may grow in size as the search expands.
DALTON
The DALTON system discovers the elemental structure of
molecules by reasoning about reaction equations.
Starting with (hydrogen oxygen ➞ water), DALTON can
ultimately determine the atomic components:
({{h h} {h h}} {{o o}} ➞ {{h h o} {h h o}})
This system’s result asserts that
• hydrogen and oxygen molecules are diatomic; and
• hydrogen and oxygen molecules combine in a 2:1 ratio
to produce 2 water molecules.
DALTON arrives at its discoveries through heuristic search
guided by knowledge available to 19th century chemists.
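One constraint that any such atomic model must satisfy is conservation of atoms across the reaction. The sketch below checks only that constraint for the model shown above; it is not DALTON's search procedure, and the nested-list encoding and the function names are assumptions made for illustration.

```python
from collections import Counter

def atom_counts(side):
    """Count atoms over every molecule of every substance on one side."""
    counts = Counter()
    for substance in side:          # e.g. the group of hydrogen molecules
        for molecule in substance:  # e.g. one hydrogen molecule [h, h]
            counts.update(molecule)
    return counts

def conserves_atoms(reactants, products):
    """True when both sides of the reaction contain the same atoms."""
    return atom_counts(reactants) == atom_counts(products)

# ({{h h} {h h}} {{o o}} -> {{h h o} {h h o}})
diatomic_reactants = [[["h", "h"], ["h", "h"]], [["o", "o"]]]
water_products = [[["h", "h", "o"], ["h", "h", "o"]]]

# A rival model in which hydrogen and oxygen are monatomic.
monatomic_reactants = [[["h"], ["h"]], [["o"]]]

print(conserves_atoms(diatomic_reactants, water_products))
print(conserves_atoms(monatomic_reactants, water_products))
```

The diatomic model balances, while the monatomic rival cannot produce two molecules of {h h o}. A search over candidate models could use a check of this kind to prune inconsistent states.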
DENDRAL
The DENDRAL system discovers the chemical bonds in a
molecule given its formula and mass spectrogram.
From the formula C6H5OH and other relevant information,
DENDRAL can produce structures such as
[Figure: structural diagram of C6H5OH, a ring of six carbon atoms carrying five hydrogen atoms and one OH group]
Like DALTON, DENDRAL relies on heuristic search to
discover structural models.
However, DENDRAL incorporates knowledge from 20th
century chemistry to guide its more extensive search.
GELL-MANN
GELL-MANN discovers hidden structures within the context
of particle physics.
As input, GELL-MANN takes a collection of observed
particles and their properties.
As output, the system produces a set of components and
combinations that map to the particles.
For example, the system could consider a list of elementary
particles, such as the baryon octet on the next slide.
From this, it would conjecture the properties of quarks and
map the baryons to various arrangements of the quarks.
Zytkow and Fischer’s (1996) computational model of
structure discovery forms the basis of GELL-MANN.
GELL-MANN Example
Input:
particle  charge  isospin  strangeness
p          1       1/2      0
n          0      -1/2      0
Σ+         1       1       -1
Σ0         0       0       -1
Σ-        -1      -1       -1
Ξ0         0       1/2     -2
Ξ-        -1      -1/2     -2
From information about the elementary particles, GELL-MANN can infer the
standard quark model.
Output:
quark  charge  isospin  strangeness
u       2/3     1/2      0
d      -1/3    -1/2      0
s      -1/3     0       -1

particle  charge  isospin  strangeness  quarks
(none)     2       3/2      0           uuu
p          1       1/2      0           uud
n          0      -1/2      0           udd
Σ+         1       1       -1           uus
Σ0         0       0       -1           uds
Σ-        -1      -1       -1           dds
(none)    -1      -3/2      0           ddd
Ξ0         0       1/2     -2           uss
Ξ-        -1      -1/2     -2           dss
(none)    -1       0       -3           sss
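The mapping half of this example can be sketched in a few lines: given the quark properties, sum them over every three-quark combination and match the totals against the observed particles. This is only an illustration under stated assumptions. GELL-MANN also has to conjecture the quark properties themselves, which the sketch takes as given, and the names QUARKS, PARTICLES, combo_properties, and map_particles are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations_with_replacement

# (charge, isospin, strangeness) for each quark, taken from the output table.
QUARKS = {
    "u": (F(2, 3), F(1, 2), 0),
    "d": (F(-1, 3), F(-1, 2), 0),
    "s": (F(-1, 3), F(0), -1),
}

# Observed particles from the input table, in the same property order.
PARTICLES = {
    "p": (1, F(1, 2), 0),
    "n": (0, F(-1, 2), 0),
    "Σ+": (1, 1, -1),
    "Σ0": (0, 0, -1),
    "Σ-": (-1, -1, -1),
    "Ξ0": (0, F(1, 2), -2),
    "Ξ-": (-1, F(-1, 2), -2),
}

def combo_properties(combo):
    """Sum each property over the quarks in one combination."""
    return tuple(sum(QUARKS[q][i] for q in combo) for i in range(3))

def map_particles(particles):
    """Map each particle to the three-quark combination with equal properties."""
    by_props = {
        combo_properties(combo): "".join(combo)
        for combo in combinations_with_replacement("uds", 3)
    }
    return {name: by_props.get(props) for name, props in particles.items()}

for name, quarks in map_particles(PARTICLES).items():
    print(name, quarks)
```

Exact fractions avoid rounding problems when comparing sums such as 2/3 + 2/3 - 1/3 with the integer charge 1.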
Sequence Assembly Systems
DNA sequencing technologies reconstruct a complete
genome by examining large quantities of DNA fragments.
Sequence assembly systems read the base sequence of
each fragment as text and return the assembled genome.
Informatics tools such as ARACHNE and Celera Assembler
address the sequencing problem by
• finding repeated fragments;
• searching for overlapping fragments;
• correcting errors; and
• joining overlapping fragments into contiguous regions.
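The overlap and joining steps can be illustrated with a simple greedy merge. This is a minimal sketch, not the algorithm used by ARACHNE or Celera Assembler: it ignores sequencing errors, repeats, reverse complements, and fragments contained inside other fragments. The function names and the min_len threshold are assumptions.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly join the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # no remaining overlaps: these are the contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]

print(greedy_assemble(["ATTAGACC", "GACCTGCC", "TGCCGGAA"]))
```

On these three made-up fragments the sketch returns a single contig. Fragments that share no sufficiently long overlap are left as separate contigs.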
Several checks are made to ensure that the resulting
structure is well supported by the data.
Pathway Tools
An operon is a set of adjacent genes that are transcribed
together to produce multiple proteins.
Modern tools for bioinformatics support operon prediction in
bacteria, which yields information about gene function.
Pathway Tools, which powers BioCyc and EcoCyc, uses
several factors to predict operons in bacterial genomes:
• the distance between genes,
• the direction of transcription, and
• the functional knowledge in the knowledge base.
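The first two factors can be combined in a simple rule: adjacent genes on the same strand that lie close together are grouped into one candidate operon. The sketch below illustrates only that rule. It is not the Pathway Tools predictor, it omits the functional knowledge entirely, and the Gene class, the gene names and coordinates, and the 50-base max_gap threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    name: str
    start: int   # first base of the gene
    end: int     # last base of the gene
    strand: str  # "+" or "-": direction of transcription

def predict_operons(genes, max_gap=50):
    """Group adjacent same-strand genes separated by at most max_gap bases."""
    operons = []
    for gene in sorted(genes, key=lambda g: g.start):
        if operons:
            prev = operons[-1][-1]
            same_direction = gene.strand == prev.strand
            close_enough = gene.start - prev.end <= max_gap
            if same_direction and close_enough:
                operons[-1].append(gene)
                continue
        operons.append([gene])  # start a new candidate operon
    return [[g.name for g in operon] for operon in operons]

# Made-up genes for illustration only.
genes = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 420, 900, "+"),
    Gene("geneC", 1500, 2000, "+"),
    Gene("geneD", 2010, 2500, "-"),
]
print(predict_operons(genes))
```

Here geneA and geneB are grouped, geneC is too far from geneB, and geneD is close to geneC but transcribed in the opposite direction.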
Researchers have applied operon prediction systems to a
variety of genomes, but validation has been problematic.
Experimental verification remains a necessary step.
Structural Modeling: Summary
As we have seen, structural modeling tasks appear in a
variety of scenarios and across scientific disciplines.
In addition to the few cases we have discussed, current
research on structural model creation includes
• identifying anatomical structure based on CT scans; and
• determining geological structure from seismic data.
Pathway Tools differs in that it uses classifiers to identify
secondary structure in DNA sequences.
SedSim, which infers models of geological structures, uses
physics-based, dynamic models to build them.
However, most of the systems that we discussed carry out
some form of heuristic search to build structural models.