Application of Unstructured Learning in Computational Biology
Tony C Smith
Department of Computer Science
University of Waikato
[email protected]
Computability
Before computers were built, mathematicians knew what they could do
• arithmetic (e.g. missile trajectories)
• search (e.g. keys for secret codes)
• sort (e.g. census information)
• … anything with a mathematical algorithm
Unstructured learning in computational biology
Tony C Smith
Artificial Intelligence
Computers do things only human brains can otherwise do
[Figure labels: expert, expert system, learning system]
Machine learning
What is machine learning?
• creating computer programs that get better with experience
• learning how to make expert judgments
• discovering previously hidden, potentially useful information (data mining)
How does it work?
• the user provides the learning system with examples of the concept to be learned
• an induction algorithm infers a characteristic model of the examples
• the model is used to predict whether or not future novel instances are also examples, and it does this very consistently, and very, very quickly!
Structured learning
Mushroom Data
Weight   Damage   Dirt    Firmness   Quality
heavy    high     mild    hard       poor
heavy    high     mild    soft       poor
normal   high     mild    hard       good
light    medium   mild    hard       good
light    clear    clean   hard       good
normal   clear    clean   soft       poor
heavy    medium   mild    hard       poor
...
[Figure: decision tree rooted at weight, with branches heavy, normal and light; inner nodes dirt (mild, clean) and firmness (hard, soft); leaves poor and good]
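The step from the mushroom table to a decision tree can be sketched in a few lines of Python. This is a minimal ID3-style learner using information gain, which is an assumption: the slides do not say which induction algorithm produced the tree. It is trained only on the seven rows visible in the table, so the tree it builds may be smaller than the one on the slide. All function and variable names are illustrative.

```python
import math
from collections import Counter

ATTRS = ["weight", "damage", "dirt", "firmness"]

# The seven rows visible in the mushroom table (the full data set is elided).
ROWS = [
    ("heavy",  "high",   "mild",  "hard", "poor"),
    ("heavy",  "high",   "mild",  "soft", "poor"),
    ("normal", "high",   "mild",  "hard", "good"),
    ("light",  "medium", "mild",  "hard", "good"),
    ("light",  "clear",  "clean", "hard", "good"),
    ("normal", "clear",  "clean", "soft", "poor"),
    ("heavy",  "medium", "mild",  "hard", "poor"),
]
DATA = [dict(zip(ATTRS + ["quality"], row)) for row in ROWS]


def entropy(rows):
    """Shannon entropy (in bits) of the quality labels in rows."""
    counts = Counter(r["quality"] for r in rows)
    total = len(rows)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def information_gain(rows, attr):
    """Reduction in label entropy obtained by splitting rows on attr."""
    remainder = 0.0
    for value in sorted({r[attr] for r in rows}):
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def build_tree(rows, attrs):
    """Return a leaf label, or (attribute, {value: subtree})."""
    labels = Counter(r["quality"] for r in rows)
    if len(labels) == 1 or not attrs:
        return labels.most_common(1)[0][0]
    # Ties are broken by attribute order in attrs.
    best = max(attrs, key=lambda a: information_gain(rows, a))
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, rest)
    return (best, branches)


def classify(tree, row):
    """Follow branches until a leaf; raises KeyError on unseen values."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


TREE = build_tree(DATA, ATTRS)
print(TREE)
```

On these rows the learner picks weight as the root attribute, matching the root of the tree in the figure. With so few examples, several attributes tie below the root, so the lower levels depend on the tie-breaking order and should not be read as the slide's tree.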
Unstructured learning
• data does not have fixed fields with specific values
• examples: images, continuous signals, expression data, text
• learning proceeds by correlating the presence or absence of any and all salient attributes
Document Classification
• given examples of documents covering some topic, learn a semantic model that can recognize whether or not other documents are relevant
• prioritize them: i.e. quantify “how relevant” documents are to the topic
• not limited to keywords (nor is it misled by them)
• adapt to the user’s needs (ephemeral or long-term)
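The idea of learning from example documents and then ranking new ones by relevance can be illustrated with a small bag-of-words scorer. This sketch swaps in a naive Bayes log-likelihood ratio with add-one smoothing; it is not the model used in the talk, which is described as semantic and not limited to keywords. The class name, the example sentences and the candidate documents are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class RelevanceModel:
    """Hypothetical two-class word-count model: relevant vs. other."""

    def __init__(self, relevant_docs, other_docs):
        self.rel = Counter(t for d in relevant_docs for t in tokenize(d))
        self.oth = Counter(t for d in other_docs for t in tokenize(d))
        self.vocab = set(self.rel) | set(self.oth)
        self.rel_total = sum(self.rel.values())
        self.oth_total = sum(self.oth.values())

    def score(self, text):
        """Log-likelihood ratio: above 0 leans relevant, below 0 leans not."""
        v = len(self.vocab)
        total = 0.0
        for tok in tokenize(text):
            if tok not in self.vocab:
                continue  # words never seen in training carry no evidence
            p_rel = (self.rel[tok] + 1) / (self.rel_total + v)
            p_oth = (self.oth[tok] + 1) / (self.oth_total + v)
            total += math.log(p_rel / p_oth)
        return total

    def rank(self, docs):
        """Order documents from most to least relevant."""
        return sorted(docs, key=self.score, reverse=True)


# Toy training examples, invented for this sketch.
relevant = [
    "protein kinase binds the receptor domain",
    "gene expression regulates protein folding",
]
other = [
    "the football match ended in a draw",
    "stock prices fell in the market today",
]

model = RelevanceModel(relevant, other)
candidates = [
    "market prices and the football season",
    "protein folding in the cell",
]
ranked = model.rank(candidates)
for doc in ranked:
    print(round(model.score(doc), 3), doc)
```

The score gives a graded "how relevant" value rather than a yes/no answer, which is what allows documents to be prioritized. Adapting to a user's changing needs would amount to retraining the counts on new examples.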
Document classification demo
Bioinformatics
Finding genes
Determining gene roles
Determining protein functions
• Empirical tests
• Sequence similarity comparison
• Literature
GO-KDS demo
Amino Acid
[Figure: amino acid structure, labelled R group, amino group and carboxyl group]
Amino Acid
tyrosine
glycine
DNA encodes amino acids
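A short sketch of what "encodes" means here: DNA is read three bases (one codon) at a time, and each codon maps to one amino acid or to a stop signal. The table below is a small excerpt of the standard genetic code, covering only the start codon, the stop codons, and the codons for tyrosine and glycine, the two amino acids named on the previous slide. The function name and the use of "X" for codons outside the excerpt are illustrative choices.

```python
# Excerpt of the standard genetic code (DNA codons, one-letter amino acids).
CODON_TABLE = {
    "ATG": "M",                                      # methionine (start)
    "TAT": "Y", "TAC": "Y",                          # tyrosine
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "TAA": "*", "TAG": "*", "TGA": "*",              # stop codons
}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at a stop codon.

    Codons missing from the excerpt above are reported as "X".
    Trailing bases that do not fill a whole codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGTATGGCTAA"))  # methionine, tyrosine, glycine, then stop
```

A complete translator would need all 64 codons; the excerpt is kept small so the mapping is easy to read.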
RasMol demo
Biotechnology
• Biologists know proteins, computer scientists know machine learning
• Together, they can find out a lot of hidden information about genes and proteins
• Biotechnology is a multi-billion dollar industry
• Biotechnology is one of the best-funded areas of scientific research