Knowledge Representation
and Machine Learning
Stephen J. Guy
Overview

- Recap of some Knowledge Representation
  - History
  - First-order logic
- Machine Learning
  - ANN
  - Bayesian Networks
  - Reinforcement Learning
- Summary
Knowledge Representation?

- An ambiguous term
- "The study of how to put knowledge into a form that a computer can reason with" (Russell and Norvig)
- Originally coupled with linguistics
  - Led to philosophical analysis of language
Knowledge Representation?

- Cool Robots
- Futuristic Robots
Early Work

- SAINT (1963)
  - Closed-form calculus problems
- STUDENT (1967)
  - "If the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?"
  - i.e., customers = 2 * (0.2 * x)^2, where x is the number of advertisements
- Blockworlds (1972)
  - SHRDLU: "Find a block which is taller than the one you are holding and put it in the box"
Early Work - Theme

- Limit the domain
  - "Microworlds"
  - Allows precise rules
- Remaining problems:
  1) Generality: making rules is hard
  2) Problem size: the state space is unbounded
Generality

- First-order logic
  - Is able to capture simple Boolean relations and facts
  - ∀x ∀y Brother(x,y) ⇒ Sibling(x,y)
  - ∃x ∀y Loves(x,y)
  - Can capture lots of commonsense knowledge
  - Not a cure-all
First-order Logic - Problems

- Faithfully captures facts, objects, and relations
- Problems:
  - Does not capture temporal relations
  - Does not handle probabilistic facts
  - Does not handle facts with degrees of truth
- Has been extended with:
  - Temporal logic
  - Probability theory
  - Fuzzy logic
First-order Logic - Bigger Problem

- Still requires lots of human effort
- "Knowledge Engineering"
  - Time consuming
  - Difficult to debug
  - Size is still a problem
- Automated acquisition of knowledge is important
Machine Learning

- Sidesteps all of the previous problems
- Represents knowledge in a way that is immediately useful for decision making
- 3 specific examples:
  - Artificial Neural Networks (ANN)
  - Bayesian Networks
  - Reinforcement Learning
Artificial Neural Networks (ANN)

- Among the first work in AI (McCulloch & Pitts, 1943)
- Attempt to mimic brain neurons
- Several binary inputs, one binary output
  - Inputs: I1, I2, ...
  - Responses (weights): R1, R2, ...
  - Output: O = 1 if Σ(n=0..N) Rn * In ≥ threshold, else 0
Artificial Neural Networks (ANN)

- Inputs I1, I2, ... with responses R1, R2, ...; output O = 1 if Σ(n=0..N) Rn * In ≥ threshold
- Units can be chained together to:
  - Represent logical connectives (and, or, not) - see the sketch below
  - Compute any computable function
- Hebb (1949) introduced a simple rule to modify connection strength (Hebbian Learning)
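As an illustration of the threshold unit and of how chaining units yields logical connectives, here is a minimal Python sketch (our own, not from the slides; the specific weights and thresholds for AND/OR/NOT are illustrative choices):

```python
# Minimal McCulloch-Pitts style threshold unit (illustrative sketch).
def threshold_unit(inputs, responses, threshold):
    """Fire (output 1) when the weighted sum of binary inputs reaches the threshold."""
    total = sum(r * i for r, i in zip(responses, inputs))
    return 1 if total >= threshold else 0

# Chaining units to represent logical connectives:
AND = lambda a, b: threshold_unit([a, b], [1, 1], threshold=2)
OR  = lambda a, b: threshold_unit([a, b], [1, 1], threshold=1)
NOT = lambda a:    threshold_unit([a],    [-1],   threshold=0)

assert AND(1, 1) == 1 and AND(1, 0) == 0
assert OR(0, 1) == 1 and OR(0, 0) == 0
assert NOT(0) == 1 and NOT(1) == 0
```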
Single-Layer Feed-forward ANNs (Perceptrons)

- Input layer connected directly to the output unit
- O = 1 if Σ(n=0..N) Rn * In ≥ threshold, else 0
- Can easily represent otherwise complex (linearly separable) functions
  - And, Or
  - Majority function
- Can learn based on gradient descent
- Cannot tell if 2 inputs are different!! (Minsky & Papert, 1969)
Learning in Perceptrons

- Replace the threshold function with a sigmoid g(x)
- Define an error metric (sum of squared differences)
- Calculate the gradient with respect to each weight Wj:
  - Err * g'(in) * Xj
  - Update: Wj = Wj + α * Err * g'(in) * Xj (see the sketch below)
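A minimal sketch of this update rule for a single sigmoid unit, assuming a made-up toy dataset and a fixed learning rate alpha (our illustrative choices, not from the slides):

```python
import math

def g(x):            # sigmoid activation
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(x):      # derivative of the sigmoid
    return g(x) * (1.0 - g(x))

def train_step(weights, inputs, target, alpha=0.1):
    """One gradient-descent step: Wj = Wj + alpha * Err * g'(in) * Xj."""
    in_ = sum(w * x for w, x in zip(weights, inputs))   # weighted input
    err = target - g(in_)                               # Err = target - output
    return [w + alpha * err * g_prime(in_) * x for w, x in zip(weights, inputs)]

# Toy usage: learn OR; the first input is a constant bias term.
data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
w = [0.0, 0.0, 0.0]
for _ in range(1000):
    for x, y in data:
        w = train_step(w, x, y)
```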
Multi-Layer Feed-forward ANNs

- Input layer, hidden layer, output unit
- Breaks free of the problems of perceptrons
- Simple gradient descent no longer works for learning
Learning in Multilayer ANNs (1/2)

- Backpropagation
  - Treat the top level just like a single-layer ANN
  - Diffuse the error down the network based on the input strength from each hidden node
Learning in Multilayer ANNs (2/2)

- δi = Erri * g'(ini)
- Wj,i = Wj,i + α * aj * δi
- δj = g'(inj) * Σi Wj,i * δi   (error diffused down to hidden node j)
- Wk,j = Wk,j + α * ak * δj
- A minimal sketch of one backpropagation step appears below
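A minimal sketch of one backpropagation step for a network with a single hidden layer of sigmoid units (the code and names such as backprop_step are ours; only the update equations above come from the slides):

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(W_kj, W_ji, x, target, alpha=0.1):
    """One update for input x -> hidden activations a_j -> single output a_i."""
    # Forward pass
    in_j = [sum(W_kj[j][k] * x[k] for k in range(len(x))) for j in range(len(W_kj))]
    a_j = [g(v) for v in in_j]
    in_i = sum(W_ji[j] * a_j[j] for j in range(len(a_j)))
    a_i = g(in_i)

    # Output error, then error diffused to each hidden node by connection strength
    delta_i = (target - a_i) * a_i * (1 - a_i)              # Err_i * g'(in_i)
    delta_j = [a_j[j] * (1 - a_j[j]) * W_ji[j] * delta_i    # g'(in_j) * W_j,i * delta_i
               for j in range(len(a_j))]

    # Weight updates
    W_ji = [W_ji[j] + alpha * a_j[j] * delta_i for j in range(len(W_ji))]
    W_kj = [[W_kj[j][k] + alpha * x[k] * delta_j[j] for k in range(len(x))]
            for j in range(len(W_kj))]
    return W_kj, W_ji

# Toy usage: 2 inputs, 2 hidden units, 1 output.
W_kj = [[0.1, -0.2], [0.3, 0.4]]
W_ji = [0.5, -0.5]
W_kj, W_ji = backprop_step(W_kj, W_ji, x=[1.0, 0.0], target=1.0)
```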
ANN - Summary

- Single-layer ANNs (perceptrons) can capture linearly separable functions
- Multi-layer ANNs can capture much more complex functions and can be effectively trained using backpropagation
- Not a silver bullet:
  - How to avoid over-fitting?
  - What shape should the network be?
  - Network values are meaningless to humans
ANN - In Robots (Simple)

- Can easily be set up as a robot brain
  - Input = sensors
  - Output = motor control
- A simple robot learns to avoid bumps
ANN - In Robots (Complex)

- Autonomous Land Vehicle In a Neural Network (ALVINN)
  - CMU project that learned to drive from humans
  - 32x30 "retina" input
  - 5 hidden units
  - 30 output nodes
  - Capable of driving itself after 2-3 minutes of training
Bayesian Networks

- Combines advantages of basic logic and ANNs
- Allows for "efficient representation of, and rigorous reasoning with, uncertain knowledge" (R&N)
- Allows for learning from experience
Bayes' Rule

- P(b|a) = P(a|b)*P(b)/P(a)
         = norm(<P(a|b)*P(b), P(a|~b)*P(~b)>)
Meningitis Example (From R&N)

- s = stiff neck, m = has meningitis
- P(s|m) = 0.5
- P(m) = 1/50000
- P(s) = 1/20
- P(m|s) = P(s|m)*P(m)/P(s)
         = 0.5 * (1/50000) / (1/20)
         = 0.0002
- Diagnostic knowledge is more fragile than causal knowledge
- A small numeric check of this calculation appears below
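A quick numeric check of the calculation above (a minimal sketch using only the probabilities given on the slide):

```python
# Bayes' rule check for the meningitis example.
p_s_given_m = 0.5        # P(stiff neck | meningitis)
p_m = 1 / 50000          # prior P(meningitis)
p_s = 1 / 20             # prior P(stiff neck)

p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)       # 0.0002
```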
Bayesian Networks

[Network: Meningitis -> Stiff Neck]

P(M) = 1/50000

  M | P(S|M)
  --+-------
  T | 0.5
  F | 1/20

- Allows us to chain together more complex relations
- Creating the network is not necessarily easy
  - Create a fully connected network
  - Cluster groups with high correlation together
  - Find probabilities using rejection sampling
Bayesian Networks (Temporal Models)

[Network: Rain(t-1) -> Rain(t) -> Rain(t+1), with each Rain(t) -> Umbrella(t)]

- More complex Bayesian networks are possible
- Time can be taken into account
- Imagine predicting whether it will rain tomorrow based only on whether your co-worker brings in an umbrella
Bayesian Networks (Temporal Models)

[Network: Rain(t-1) -> Rain(t) -> Rain(t+1), with each Rain(t) -> Umbrella(t)]

- 4 possible inference tasks based on this knowledge:
  - Filtering - computing belief as to the current state
  - Prediction - computing belief about a future state
  - Smoothing - improving knowledge of past states using hindsight (forward-backward algorithm)
  - Most likely explanation - finding the single most likely explanation for a set of observations (Viterbi)
Bayesian Networks (Temporal Models)

[Network: Rain(t-1) -> Rain(t) -> Rain(t+1), with each Rain(t) -> Umbrella(t)]

- Assume you see the umbrella 2 days in a row (U1 = 1, U2 = 1)
  - P(R0) = <0.5, 0.5>   (0.5 for R0 = T, 0.5 for R0 = F)
  - P(R1) = P(R1|R0)*P(R0) + P(R1|~R0)*P(~R0)
          = 0.7*0.5 + 0.3*0.5 = <0.5, 0.5>
  - P(R1|U1) = norm(P(U1|R1)*P(R1))
             = norm<0.9*0.5, 0.2*0.5>
             = norm<0.45, 0.1> = <0.818, 0.182>
Bayesian Networks (Temporal Models)

[Network: Rain(t-1) -> Rain(t) -> Rain(t+1), with each Rain(t) -> Umbrella(t)]

- Assume you see the umbrella 2 days in a row (U1 = 1, U2 = 1)
  - P(R2|U1) = P(R2|R1)*P(R1|U1) + P(R2|~R1)*P(~R1|U1)
             = 0.7*0.818 + 0.3*0.182 = 0.627, i.e. <0.627, 0.373>
  - P(R2|U2,U1) = norm(P(U2|R2)*P(R2|U1))
               = norm<0.9*0.627, 0.2*0.373>
               = norm<0.565, 0.075> = <0.883, 0.117>
- On the 2nd day of seeing the umbrella we are more confident that it is raining
- A short filtering sketch that reproduces these numbers appears below
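A minimal forward-filtering sketch that reproduces the two-day umbrella calculation, assuming the transition model P(Rt|Rt-1 = T) = 0.7, P(Rt|Rt-1 = F) = 0.3 and the sensor model P(U|R) = 0.9, P(U|~R) = 0.2 used on the slides:

```python
def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def filter_step(prior, umbrella_seen):
    """One step of forward filtering for the rain/umbrella model."""
    # Prediction: P(R_t) = sum over R_{t-1} of P(R_t | R_{t-1}) * P(R_{t-1})
    p_rain = 0.7 * prior[0] + 0.3 * prior[1]
    predicted = [p_rain, 1.0 - p_rain]
    # Update on the evidence: weight by P(U_t | R_t), then normalize
    likelihood = [0.9, 0.2] if umbrella_seen else [0.1, 0.8]
    return normalize([l * p for l, p in zip(likelihood, predicted)])

belief = [0.5, 0.5]                 # P(R0) = <0.5, 0.5>
belief = filter_step(belief, True)  # day 1: ~<0.818, 0.182>
belief = filter_step(belief, True)  # day 2: ~<0.883, 0.117>
print(belief)
```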
Bayesian Networks - Summary

- Bayesian networks are able to capture some important aspects of human knowledge representation and use:
  - Uncertainty
  - Adaptation
- Still difficulties in network design
- Overall a powerful tool:
  - Meaningful values in the network
  - Probabilistic logical reasoning
Bayesian Networks in Robotics

- Speech recognition
- Inference
  - Sensors
  - Computer vision
  - SLAM
  - Estimating human poses
- Robot going through a doorway using Bayesian networks (University of the Basque Country)
Reinforcement Learning

- How much can we take the human out of the loop?
- How do humans/animals do it?
  - Genes
  - Pain
  - Pleasure
- Simply define rewards/punishments and let the agent figure out all the rest
Reinforcement Learning - Example

[Figure: 4x3 grid world with a +1 goal, a -1 pitfall, and a start state; intended moves succeed with probability 0.8 and slip left or right with probability 0.1 each]

- R(s) = reward of state s
  - R(goal) = 1
  - R(pitfall) = -1
  - R(anything else) = ?
- Attempts to move forward may move left or right
- Many (~262,000) possible policies
  - Different policies are optimal depending on the value of R(anything else)
Reinforcement Learning - Policy

[Figure: optimal policy arrows on the 4x3 grid world with the +1, -1, and start states]

- Above is the optimal policy for R(s) = -0.04
- Given a policy, how can an agent evaluate U(s), the utility of a state? (Passive Reinforcement Learning)
  - Adaptive Dynamic Programming (ADP)
  - Temporal Difference Learning (TD)
- With only an environment, how can an agent develop a policy? (Active Reinforcement Learning)
  - Q-learning (see the sketch below)
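The slides only name Q-learning here; as a reference point, this is a minimal sketch of the standard Q-learning update with epsilon-greedy exploration (the learning rate alpha, discount gamma, and epsilon are our illustrative choices, not values from the slides):

```python
import random
from collections import defaultdict

Q = defaultdict(float)   # Q[(state, action)] -> estimated value of taking action in state

def q_update(s, a, reward, s_next, actions, alpha=0.1, gamma=0.95):
    """Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])

def choose_action(s, actions, epsilon=0.1):
    """Epsilon-greedy: usually pick the best known action, occasionally explore at random."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

# Example: after moving "right" from cell (1, 1) to (1, 2) and receiving reward -0.04:
q_update((1, 1), "right", -0.04, (1, 2), actions=["up", "down", "left", "right"])
```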
Reinforcement Learning - Utility
[Figure: the 4x3 grid world annotated with learned utilities, e.g. 0.812, 0.868, 0.918 along the row leading to the +1 state, with lower values such as 0.660, 0.655, and 0.338 elsewhere]
U(s) = R(s) + U(s’)P(s’)
ADP: Updating all U(s) based on each new
observation
TD: Update U(s) only for last state change
S’




Ideally: U(s) = R(s) + U(s’), but s’ is probabilistic
U(s) = U(s) + (R(s)+U(s’)-U(s))
 decays from 1 to 0 as a function of # times
state is visited
U(s) is guaranteed converge to correct value
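A minimal sketch of this passive TD update (illustrative Python, not from the slides; decaying alpha as the reciprocal of the visit count is one simple choice consistent with the "decays from 1 to 0" remark):

```python
from collections import defaultdict

U = defaultdict(float)      # utility estimates U(s)
visits = defaultdict(int)   # number of times each state has been observed

def td_update(s, reward, s_next):
    """Passive TD: U(s) += alpha * (R(s) + U(s') - U(s)), with alpha decaying per visit."""
    visits[s] += 1
    alpha = 1.0 / visits[s]          # decays from 1 toward 0 as s is revisited
    U[s] += alpha * (reward + U[s_next] - U[s])

# Example: observed transition from (1, 1) to (1, 2) with reward -0.04 under the current policy:
td_update((1, 1), -0.04, (1, 2))
```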
Reinforcement Learning - Policy

- Ideally agents can create their own policies
- Exploration: agents must be rewarded for exploring as well as for taking the best known path
- Adaptive Dynamic Programming (ADP)
  - Can be achieved by changing U(s) to U'(s)
  - U'(s) = (n < N) ? Max_Reward : U(s), where n counts visits to the state
  - The agent must also update its transition model
- Temporal Difference Learning (TD)
  - No changes to the utility calculation!
  - Can explore based on balancing utility and novelty (like ADP)
  - Can choose random directions with a decreasing rate over time
- Both converge on the optimal value
Reinforcement Learning in Robotics

- Robot control
  - Discretize the workspace
- Policy search
  - Pegasus system (Ng, Stanford)
  - Learned how to control robots
  - Better than human pilots with remote control
Summary

- 3 different general learning approaches
- Artificial Neural Networks
  - Good for learning correlations between inputs and outputs
  - Little human work
- Bayesian Networks
  - Good for handling uncertainty and noise
  - Human work optional
- Reinforcement Learning
  - Good for evaluating and generating policies/behaviors
  - Can handle complex tasks
  - Little human work
References

1. Russell, S. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence, Englewood Cliffs, New Jersey. (http://aima.cs.berkeley.edu/)
2. Mitchell, Thomas. Machine Learning. McGraw Hill, 1997. (http://www.cs.cmu.edu/~tom/mlbook.html)
3. Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning. Cambridge, MA: MIT Press, 1998. (http://www.cs.ualberta.ca/~sutton/book/the-book.html)
4. Hecht-Nielsen, R. "Theory of the backpropagation neural network." Neural Networks 1 (1989): 593-605. (http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=3401&arnumber=118638)
5. Batavia, P., Pomerleau, D., and Thorpe, C. Tech. report CMU-RI-TR-96-31, Robotics Institute, Carnegie Mellon University, October 1996. (http://www.ri.cmu.edu/projects/project_160.html)
6. Jung, D.J., Kwon, K.S., and Kim, H.J. "Bayesian Network based Human Pose Estimation" (Korea). (http://www.actapress.com/PaperInfo.aspx?PaperID=23199)
7. Lewis, Frank L. "Neural Network Control of Robot Manipulators." IEEE Expert: Intelligent Systems and Their Applications, vol. 11, no. 3, pp. 64-75, June 1996. (http://doi.ieeecomputersociety.org/10.1109/64.506755)