Download Knowledge Representation (and some more Machine Learning)

Knowledge Representation and Machine Learning
Stephen J. Guy

Overview
- Recap some Knowledge Representation
  - History
  - First-order logic
- Machine Learning
  - ANN
  - Bayesian Networks
  - Reinforcement Learning
- Summary

Knowledge Representation?
- Ambiguous term
  - “The study of how to put knowledge into a form that a computer can reason with” (Russell and Norvig)
- Originally coupled with linguistics
- Led to philosophical analysis of language

Knowledge Representation?
- Cool Robots
- Futuristic Robots

Early Work
- SAINT (1963)
  - Closed-form calculus problems
- STUDENT (1967)
  - “If the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?”
- Blocks world (1972)
  - SHRDLU
  - “Find a block which is taller than the one you are holding and put it in the box”

Early Work - Theme
- Limit domain
  - “Microworlds”
  - Allows precise rules
- Generality
- Problem size
  1) Making rules is hard
  2) State space is unbounded

Generality
- First-order logic
  - Is able to capture simple Boolean relations and facts
  - $\forall x\, \forall y\; \mathrm{Brother}(x,y) \Rightarrow \mathrm{Sibling}(x,y)$
  - $\forall x\, \exists y\; \mathrm{Loves}(x,y)$
  - Can capture lots of commonsense knowledge
- Not a cure-all

First-order Logic - Problems
- Faithfully captures facts, objects and relations
- Problems
  - Does not capture temporal relations
  - Does not handle probabilistic facts
  - Does not handle facts with degrees of truth
- Has been extended to:
  - Temporal logic
  - Probability theory
  - Fuzzy logic

First-order Logic - Bigger Problem
- Still lots of human effort: “Knowledge Engineering”
  - Time consuming
  - Difficult to debug
  - Size still a problem
- Automated acquisition of knowledge is important

Machine Learning
- Sidesteps all of the previous problems
- Represents knowledge in a way that is immediately useful for decision making
- 3 specific examples
  - Artificial Neural Networks (ANN)
  - Bayesian Networks
  - Reinforcement Learning

Artificial Neural Networks (ANN)
- First work in AI (McCulloch & Pitts, 1943)
- Attempt to mimic brain neurons
- Several binary inputs, one binary output
- Inputs: I1, I2, …; Responses: R1, R2, …; Output: O
- $O = 1$ if $\sum_{n=0}^{N} R_n I_n > \text{threshold}$, otherwise $O = 0$

Artificial Neural Networks (ANN)
- Can be chained together to
  - Represent logical connectives (and, or, not)
  - Compute any computable function
- Hebb (1949) introduced a simple rule to modify connection strength (Hebbian learning)

Single-Layer Feed-forward ANNs (Perceptrons)
[Figure: an input layer connected directly to a single output unit, which computes the thresholded sum above]
- Can easily represent otherwise complex (linearly separable) functions
  - And, Or
  - Majority function
- Can learn based on gradient descent
- Cannot tell if 2 inputs are different (XOR is not linearly separable) (Minsky, 1969)

Learning in Perceptrons
- Replace the threshold function with a sigmoid g(x)
- Define an error metric: the sum of squared differences, $E = \tfrac{1}{2} Err^2$ with Err = target - output
- Calculate the gradient with respect to each weight: $\partial E / \partial W_j = -Err \cdot g'(in) \cdot x_j$
- Step against the gradient: $W_j \leftarrow W_j + \alpha \cdot Err \cdot g'(in) \cdot x_j$, where $\alpha$ is the learning rate

Multi-Layer Feed-forward ANNs
[Figure: input layer, hidden layer, output unit]
- Breaks free of the problems of perceptrons
- Simple gradient descent no longer works for learning

Learning in Multilayer ANNs (1/2)
- Backpropagation
  - Treat the top level just like a single-layer ANN
  - Diffuse error down the network based on input strength from each hidden node

Learning in Multilayer ANNs (2/2)
- Output layer: $\Delta_i = Err_i \cdot g'(in_i)$
- $W_{j,i} \leftarrow W_{j,i} + \alpha \cdot a_j \cdot \Delta_i$
- Hidden layer: $\Delta_j = g'(in_j) \sum_i W_{j,i} \Delta_i$
- $W_{k,j} \leftarrow W_{k,j} + \alpha \cdot a_k \cdot \Delta_j$

ANN - Summary
- Single-layer ANNs (perceptrons) can capture linearly separable functions
- Multi-layer ANNs can capture much more complex functions and can be effectively trained using backpropagation
- Not a silver bullet
  - How to avoid over-fitting?
  - What shape should the network be?
  - Network values are meaningless to humans

ANN - In Robots (Simple)
- Can be easily set up as a robot brain
  - Input = sensors
  - Output = motor control
- Simple robot learns to avoid bumps

ANN - In Robots (Complex)
- Autonomous Land Vehicle In a Neural Network (ALVINN)
  - CMU project that learned to drive from humans
  - 32x30 “retina”
  - 5 hidden units (a single hidden layer)
  - 30 output nodes
  - Capable of driving itself after 2-3 minutes of training

Bayesian Networks
- Combines advantages of basic logic and ANNs
- Allows for “efficient representation of, and rigorous reasoning with, uncertain knowledge” (R&N)
- Allows for learning from experience

Bayes’ Rule
- $P(b \mid a) = \frac{P(a \mid b)\, P(b)}{P(a)}$
- Equivalently, $P(b \mid a)$ is the first component of $\mathrm{nrm}(\langle P(a \mid b) P(b),\; P(a \mid \lnot b) P(\lnot b) \rangle)$, where nrm rescales the vector to sum to 1
- Meningitis example (from R&N)
  - s = stiff neck, m = has meningitis
  - P(s|m) = 0.5
  - P(m) = 1/50000
  - P(s) = 1/20
  - P(m|s) = P(s|m) P(m) / P(s) = 0.5 * (1/50000) / (1/20) = 0.0002
- Diagnostic knowledge is more fragile than causal knowledge

Bayesian Networks
[Figure: two-node network, Meningitis -> Stiff Neck]
- P(M) = 1/50000
- Conditional probability table for Stiff Neck:

    M    P(S | M)
    T    0.5
    F    1/20

- Allows us to chain together more complex relations
- Creating a network is not necessarily easy
  - Create a fully connected network
  - Cluster groups with high correlation together
  - Find probabilities using rejection sampling

Bayesian Networks (Temporal Models)
[Figure: chain Rain(t-1) -> Rain(t) -> Rain(t+1), with each Rain(t) pointing to Umbrella(t)]
- More complex Bayesian networks are possible
- Time can be taken into account
- Imagine predicting if it will rain tomorrow, based only on whether your co-worker brings in an umbrella

Bayesian Networks (Temporal Models)
- 4 possible inference tasks based on this knowledge
  - Filtering: computing the belief about the current state
  - Prediction: computing the belief about a future state
  - Smoothing: improving knowledge of past states using hindsight (forward-backward algorithm)
  - Most likely explanation: finding the single most likely explanation for a set of observations (Viterbi)
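The filtering task can be sketched as a predict-then-update loop. This is a minimal illustration, not the deck's own code: it assumes the umbrella-world numbers from R&N (rain persists with probability 0.7, the umbrella appears with probability 0.9 on rainy days and 0.2 on dry days, and a uniform prior), and the function names are made up for this example.

```python
def normalize(vec):
    """Rescale a vector of non-negative numbers so it sums to 1."""
    total = sum(vec)
    return [v / total for v in vec]


def predict(belief, p_rain_given_rain=0.7, p_rain_given_dry=0.3):
    """Push the belief <P(rain), P(dry)> one day forward through the transition model."""
    p_rain = p_rain_given_rain * belief[0] + p_rain_given_dry * belief[1]
    return [p_rain, 1.0 - p_rain]


def update(prior, umbrella_seen, p_umb_given_rain=0.9, p_umb_given_dry=0.2):
    """Weight the predicted belief by the sensor model, then normalize."""
    like_rain = p_umb_given_rain if umbrella_seen else 1.0 - p_umb_given_rain
    like_dry = p_umb_given_dry if umbrella_seen else 1.0 - p_umb_given_dry
    return normalize([like_rain * prior[0], like_dry * prior[1]])


def filter_rain(observations, initial=(0.5, 0.5)):
    """Return the filtered belief after each daily umbrella observation."""
    belief = list(initial)
    history = []
    for seen in observations:
        belief = update(predict(belief), seen)
        history.append(belief)
    return history


if __name__ == "__main__":
    # Umbrella seen on two consecutive days.
    for day, (rain, _dry) in enumerate(filter_rain([True, True]), start=1):
        print(f"day {day}: P(rain) = {rain:.3f}")
```

Each day only the previous belief is needed, which is why filtering can run online as observations arrive.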
Bayesian Networks (Temporal Models)
- Assume you see the umbrella 2 days in a row (U1 = 1, U2 = 1)
  - P(R0) = <0.5, 0.5> (0.5 for R0 = T, 0.5 for R0 = F)
  - P(R1) = P(R1|R0) P(R0) + P(R1|~R0) P(~R0) = 0.7*0.5 + 0.3*0.5 = 0.5, so P(R1) = <0.5, 0.5>
  - P(R1|U1) = nrm(P(U1|R1) P(R1)) = nrm<0.9*0.5, 0.2*0.5> = nrm<0.45, 0.1> = <0.818, 0.182>
  - P(R2|U1) = P(R2|R1) P(R1|U1) + P(R2|~R1) P(~R1|U1) = 0.7*0.818 + 0.3*0.182 = 0.627, so P(R2|U1) = <0.627, 0.373>
  - P(R2|U2,U1) = nrm(P(U2|R2) P(R2|U1)) = nrm<0.9*0.627, 0.2*0.373> = nrm<0.565, 0.075> = <0.883, 0.117>
- On the 2nd day of seeing the umbrella we are more confident that it is raining

Bayesian Networks - Summary
- Bayesian networks are able to capture some important aspects of human knowledge representation and use
  - Uncertainty
  - Adaptation
- Still difficulties in network design
- Overall a powerful tool
  - Meaningful values in network
  - Probabilistic logical reasoning

Bayesian Networks in Robotics
- Speech recognition
- Inference
  - Sensors
  - Computer vision
  - SLAM
  - Estimating human poses
- Robot going through a doorway using Bayesian networks (Univ. of Basque)

Reinforcement Learning
- How much can we take the human out of the loop?
- How do humans/animals do it?
  - Genes
  - Pain
  - Pleasure
- Simply define rewards/punishments and let the agent figure out all the rest

Reinforcement Learning - Example
[Figure: grid world with a start square, a +1 goal square and a -1 pitfall square; an attempted move goes forward with probability .8 and slips to either side with probability .1 each]
- R(s) = reward of state s
  - R(goal) = 1
  - R(pitfall) = -1
  - R(anything else) = ?
- Attempts to move forward may move left or right
- Many (~262,000) possible policies
- Different policies are optimal depending on the value of R(anything else)

Reinforcement Learning - Policy
[Figure: the same grid world with the chosen action drawn in each square]
- Above is the optimal policy for R(s) = -0.04
- Given a policy, how can an agent evaluate U(s), the utility of a state? (Passive reinforcement learning)
  - Adaptive Dynamic Programming (ADP)
  - Temporal Difference learning (TD)
- With only an environment, how can an agent develop a policy? (Active reinforcement learning)
  - Q-learning

Reinforcement Learning - Utility
[Figure: the 4x3 grid world annotated with the utility of each state; values shown include .812, .762, .868, .918, .660, .655, .661 and .338]
- $U(s) = R(s) + \sum_{s'} P(s')\, U(s')$, where $P(s')$ is the probability of reaching $s'$ from $s$ under the policy
- ADP: update all U(s) based on each new observation
- TD: update U(s) only for the last state change, s to s'
  - Ideally U(s) = R(s) + U(s'), but s' is probabilistic
  - $U(s) \leftarrow U(s) + \alpha\,(R(s) + U(s') - U(s))$
  - $\alpha$ decays from 1 to 0 as a function of the number of times the state is visited
  - U(s) is guaranteed to converge to the correct value

Reinforcement Learning - Policy
- Ideally agents can create their own policies
- Exploration: agents must be rewarded for exploring as well as for taking the best known path
- Adaptive Dynamic Programming (ADP)
  - Can be achieved by changing U(s) to U’(s)
  - U’(s) = n < N ? Max_Reward : U(s), where n is the number of times s has been visited and N is a fixed visit threshold
  - Agent must also update the transition model
- Temporal Difference Learning (TD)
  - No changes to the utility calculation!
  - Can explore based on balancing utility and novelty (like ADP)
  - Can choose random directions with a decreasing rate over time
  - Both converge on the optimal value

Reinforcement Learning in Robotics
- Robot control
  - Discretize workspace
- Policy search
  - Pegasus System (Ng, Stanford)
  - Learned how to control robots
  - Better than human pilots with remote control

Summary
- 3 different general learning approaches
- Artificial Neural Networks
  - Good for learning correlation between inputs and outputs
  - Little human work
- Bayesian Networks
  - Good for handling uncertainty and noise
  - Human work optional
- Reinforcement Learning
  - Good for evaluating and generating policies/behaviors
  - Can handle complex tasks
  - Little human work

References
1. Russell, S., and Norvig, P. Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence. Englewood Cliffs, New Jersey, 1995. (http://aima.cs.berkeley.edu/)
2. Mitchell, Thomas. Machine Learning. McGraw Hill, 1997. (http://www.cs.cmu.edu/~tom/mlbook.html)
3. Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning. Cambridge, MA: MIT Press, 1998. (http://www.cs.ualberta.ca/~sutton/book/the-book.html)
4. Hecht-Nielsen, R. "Theory of the backpropagation neural network." Neural Networks 1 (1989): 593-605. (http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=3401&arnumber=118638)
5. P. Batavia, D. Pomerleau, and C. Thorpe. Tech. report CMU-RI-TR-96-31, Robotics Institute, Carnegie Mellon University, October 1996. (http://www.ri.cmu.edu/projects/project_160.html)
6. D.J. Jung, K.S. Kwon, and H.J. Kim (Korea). Bayesian Network based Human Pose Estimation. (http://www.actapress.com/PaperInfo.aspx?PaperID=23199)
7. Frank L. Lewis. "Neural Network Control of Robot Manipulators." IEEE Expert: Intelligent Systems and Their Applications, vol. 11, no. 3, pp. 64-75, June 1996. (http://doi.ieeecomputersociety.org/10.1109/64.506755)
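Worked sketch for the "Learning in Perceptrons" slide: the weight update W_j <- W_j + alpha * Err * g'(in) * x_j can be tried on small truth tables. This is an illustrative sketch only. The learning rate, epoch count, bias input, and function names are assumptions chosen for the example, not values from the slides.

```python
import math
import random


def sigmoid(x):
    """Smooth replacement for the hard threshold, g(x)."""
    return 1.0 / (1.0 + math.exp(-x))


def train_perceptron(examples, alpha=0.5, epochs=5000, seed=0):
    """Train one sigmoid unit with the squared-error gradient rule."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    # weights[0] is a bias weight attached to a constant input of 1.0
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for inputs, target in examples:
            x = [1.0] + list(inputs)
            out = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            err = target - out            # Err
            slope = out * (1.0 - out)     # g'(in) for the sigmoid
            weights = [w + alpha * err * slope * xi for w, xi in zip(weights, x)]
    return weights


def predict(weights, inputs):
    """Output of the trained unit for one input vector."""
    x = [1.0] + list(inputs)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def count_correct(weights, examples):
    """Number of examples classified correctly when thresholding at 0.5."""
    return sum((predict(weights, x) > 0.5) == bool(t) for x, t in examples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    print("AND correct:", count_correct(train_perceptron(AND), AND), "of 4")
    print("XOR correct:", count_correct(train_perceptron(XOR), XOR), "of 4")
```

AND is linearly separable, so the single unit can learn it. XOR is not, so no choice of weights classifies all four rows, which is the limitation attributed to Minsky on the perceptron slide.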
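Worked sketch for the "Bayes' Rule" slide: the meningitis numbers can be checked with exact fractions, both directly and through the normalization form. The helper names are illustrative, and P(s | not m) is not given on the slide; here it is derived from P(s), P(s|m) and P(m) purely for the check.

```python
from fractions import Fraction


def bayes(p_a_given_b, p_b, p_a):
    """P(b | a) = P(a | b) P(b) / P(a)."""
    return p_a_given_b * p_b / p_a


def normalize(vec):
    """Rescale a vector so that it sums to 1."""
    total = sum(vec)
    return [v / total for v in vec]


p_s_given_m = Fraction(1, 2)   # P(s | m)
p_m = Fraction(1, 50000)       # P(m)
p_s = Fraction(1, 20)          # P(s)

# Direct form of Bayes' rule.
direct = bayes(p_s_given_m, p_m, p_s)

# Normalization form: needs P(s | not m), recovered from the total probability of s.
p_not_m = 1 - p_m
p_s_given_not_m = (p_s - p_s_given_m * p_m) / p_not_m
via_nrm = normalize([p_s_given_m * p_m, p_s_given_not_m * p_not_m])[0]

if __name__ == "__main__":
    print("P(m | s) =", direct, "=", float(direct))
    print("same via normalization:", via_nrm == direct)
```

Both routes give the same posterior because the two unnormalized terms sum to P(s).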
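Worked sketch for the "Reinforcement Learning - Utility" slide: the TD update U(s) <- U(s) + alpha * (R(s) + U(s') - U(s)) can be run on a tiny passive environment. The three-step chain below is a hypothetical stand-in for the grid world, not the deck's example. It borrows the -0.04 step reward and the +1/-1 terminal rewards, and assumes alpha = 1/n, where n counts visits to the state.

```python
import random
from collections import defaultdict

# Hypothetical environment: a -> b, then b -> goal (prob 0.8) or pit (prob 0.2).
REWARD = {"a": -0.04, "b": -0.04, "goal": 1.0, "pit": -1.0}
TERMINAL = {"goal", "pit"}


def step(state, rng):
    """Sample the successor state under the fixed policy."""
    if state == "a":
        return "b"
    return "goal" if rng.random() < 0.8 else "pit"


def td_evaluate(trials=20000, seed=0):
    """Passive TD learning of U(s) with a learning rate that decays per visit."""
    rng = random.Random(seed)
    utility = defaultdict(float)
    visits = defaultdict(int)
    for t in TERMINAL:
        utility[t] = REWARD[t]  # a terminal state's utility is its reward
    for _ in range(trials):
        state = "a"
        while state not in TERMINAL:
            nxt = step(state, rng)
            visits[state] += 1
            alpha = 1.0 / visits[state]  # decays from 1 toward 0
            utility[state] += alpha * (REWARD[state] + utility[nxt] - utility[state])
            state = nxt
    return dict(utility)


if __name__ == "__main__":
    learned = td_evaluate()
    # Exact values for comparison: U(b) = -0.04 + 0.8 - 0.2, U(a) = -0.04 + U(b)
    print("U(b) learned:", round(learned["b"], 3), "exact: 0.56")
    print("U(a) learned:", round(learned["a"], 3), "exact: 0.52")
```

The update touches only the state just left, which is the contrast with ADP drawn on the slide: no transition model is stored, only running utility estimates.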
