Intelligent Environments
Computer Science and Engineering
University of Texas at Arlington

Decision-Making for Intelligent Environments
- Motivation
- Techniques
- Issues

Motivation
- An intelligent environment acquires and applies knowledge about you and your surroundings in order to improve your experience.
- "acquires" → prediction
- "applies" → decision making

Motivation
- Why do we need decision-making?
- "Improve our experience": there are usually alternative actions; which one should be taken?
- Example (Bob scenario: bedroom → ?)
  - Turn on the bathroom light?
  - Turn on the kitchen light?
  - Turn off the bedroom light?

Example
- Should I turn on the bathroom light? Issues:
  - Inhabitant's location (current and future)
  - Inhabitant's task
  - Inhabitant's preferences
  - Energy efficiency
  - Security
  - Other inhabitants

Qualities of a Decision Maker
- Ideal
  - Complete: always makes a decision
  - Correct: the decision is always right
  - Natural: knowledge is easily expressed
  - Efficient
- Rational
  - Decisions are made to maximize performance

Agent-based Decision Maker
- Russell & Norvig, "AI: A Modern Approach"
- Rational agent: the agent chooses an action to maximize its performance based on the percept sequence

Agent Types
- Reflex agent
- Reflex agent with state
- Goal-based agent
- Utility-based agent

Reflex Agent (diagram)

Reflex Agent with State (diagram)

Goal-based Agent (diagram)

Utility-based Agent (diagram)

Decision-Making Techniques
- Logic
- Planning
- Decision theory
- Markov decision process
- Reinforcement learning

Logical Decision Making
  If   Equal(?Day,Monday) &
       GreaterThan(?CurrentTime,0600) &
       LessThan(?CurrentTime,0700) &
       Location(Bob,bedroom,?CurrentTime) &
       Increment(?CurrentTime,?NextTime)
  Then Location(Bob,bathroom,?NextTime)

  Query: Location(Bob,?Room,0800)

Logical Decision Making
- Rules and facts
- Inference mechanism
  - First-order predicate logic
  - Deduction: {A, A → B} ⊢ B
- Systems
  - Prolog (PROgramming in LOGic)
  - OTTER theorem prover

Prolog
  location(bob,bathroom,NextTime) :-
      dayofweek(Day), Day = monday,
      currenttime(CurrentTime),
      CurrentTime > 0600, CurrentTime < 0700,
      location(bob,bedroom,CurrentTime),
      increment(CurrentTime,NextTime).

  Facts: dayofweek(monday), ...
  Query: location(bob,Room,0800).

OTTER
  (all d all t1 all t2
    ((DayofWeek(d) & Equal(d,Monday) &
      CurrentTime(t1) & GreaterThan(t1,0600) & LessThan(t1,0700) &
      NextTime(t1,t2) & Location(Bob,Bedroom,t1))
     -> Location(Bob,Bathroom,t2))).

  Facts: DayofWeek(Monday), ...
  Query: (exists r (Location(Bob,r,0800)))

Actions
  If Location(Bob,Bathroom,t1) Then Action(TurnOnBathRoomLight,t1)
Preferences among actions (a small code sketch of this scheme follows the assessment below):
  If RecommendedAction(a1,t1) & RecommendedAction(a2,t1) &
     ActionPriority(a1) > ActionPriority(a2)
  Then Action(a1,t1)

Persistence Over Time
  If Location(Bob,room1,t1) & not Move(Bob,t1) & NextTime(t1,t2)
  Then Location(Bob,room1,t2)
- One such rule is needed for each attribute of Bob!

Logical Decision Making Assessment
- Complete? Yes
- Correct? Yes
- Efficient? No
- Natural? No
- Rational?
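To make the rule-plus-priority action selection above concrete, here is a minimal Python sketch. It is only an illustration of the two stages (collect recommended actions, then break ties by priority); the fact tuples, rule bodies, and priority values are assumptions invented for the example, not part of the original slides.

# Minimal sketch of the rule-plus-priority action selection described on the
# Actions slide above. The facts, rule bodies, and priority values are
# illustrative assumptions, not part of the original slides.

facts = {
    ("dayofweek", "monday"),
    ("location", "bob", "bedroom", "0650"),
}

def recommended_actions(facts, time):
    """Fire simple condition-action rules against the fact base."""
    actions = []
    if ("dayofweek", "monday") in facts and ("location", "bob", "bedroom", time) in facts:
        actions.append("TurnOnBathroomLight")    # anticipate the morning routine
    if ("location", "bob", "bedroom", time) in facts:
        actions.append("TurnOffKitchenLight")    # nobody is in the kitchen
    return actions

# Preferences among actions: choose the recommended action with highest priority.
priority = {"TurnOnBathroomLight": 2, "TurnOffKitchenLight": 1}

def select_action(facts, time):
    candidates = recommended_actions(facts, time)
    return max(candidates, key=priority.get) if candidates else None

print(select_action(facts, "0650"))   # -> TurnOnBathroomLight

A real system would use a proper rule engine or the Prolog program shown earlier; the sketch only shows how recommended actions and action priorities interact.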
Decision Making as Planning
- Search for a sequence of actions to achieve some goal
- Requires
  - Initial state of the environment
  - Goal state
  - Actions (operators)
    - Conditions
    - Effects (implied connection to effectors)

Example
- Initial: location(Bob,Bathroom) & light(Bathroom,off)
- Goal: happy(Bob)
- Action 1
  - Condition: location(Bob,?r) & light(?r,on)
  - Effect: Add happy(Bob)
- Action 2
  - Condition: light(?r,off)
  - Effect: Delete light(?r,off); Add light(?r,on)
- Plan: Action 2, Action 1

Requirements
- Where do goals come from?
  - System design
  - Users
- Where do actions come from?
  - Device "drivers"
  - Learned macros (e.g., a SecureHome action)

Planning Systems
- UCPOP (Univ. of Washington)
  - Partial Order Planner with Universal quantification and Conditional effects
- GraphPlan (CMU)
  - Builds and prunes a graph of possible plans

GraphPlan Example
  (:action lighton
    :parameters (?r)
    :precondition (light ?r off)
    :effect (and (light ?r on) (not (light ?r off))))

Planning Assessment
- Complete? Yes
- Correct? Yes
- Efficient? No
- Natural? Better
- Rational?

Decision Theory
- Logical and planning approaches typically assume no uncertainty
- Decision theory = probability theory + utility theory
- Maximum Expected Utility principle
  - A rational agent chooses actions yielding the highest expected utility
  - Averaged over all possible action outcomes
  - Weight the utility of an outcome by its probability of occurring

Probability Theory
- Random variables: X, Y, ...
- Prior probability: P(X)
- Conditional probability: P(X|Y)
- Joint probability distribution
  - P(X1,...,Xn) is an n-dimensional table of probabilities
  - The complete table allows computation of any probability
  - The complete table is typically infeasible

Probability Theory
- Bayes rule: P(X|Y) = P(Y|X) P(X) / P(Y)
- Example: P(rain|wet) = P(wet|rain) P(rain) / P(wet)
  - We are more likely to know P(wet|rain)
- In general, P(X|Y) = α P(Y|X) P(X), with the normalization constant α chosen so that Σ_x P(x|Y) = 1

Bayes Rule (cont.)
- How do we compute P(rain | wet & thunder)?
  P(r | w & t) = P(w & t | r) P(r) / P(w & t)
- Knowing P(w & t | r) is possible, but it becomes tedious as the evidence increases
- Conditional independence of evidence
  - Thunder does not cause wet, and vice versa
  - P(r | w & t) = α P(w|r) P(t|r) P(r)
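The conditional-independence shortcut above is easy to check numerically. The following Python sketch computes P(rain | wet, thunder) using the normalization constant α; all of the prior and conditional probabilities are made-up illustrative values, not taken from the slides.

# Worked numeric version of P(rain | wet, thunder) using Bayes rule with the
# conditional-independence assumption above. All probabilities are made-up
# illustrative values.

p_rain = 0.2
p_wet_given_rain,     p_wet_given_no_rain     = 0.9, 0.10
p_thunder_given_rain, p_thunder_given_no_rain = 0.6, 0.05

# Unnormalized posterior for each value of Rain:
#   P(rain | wet, thunder) = alpha * P(wet | rain) * P(thunder | rain) * P(rain)
unnorm_rain    = p_wet_given_rain    * p_thunder_given_rain    * p_rain
unnorm_no_rain = p_wet_given_no_rain * p_thunder_given_no_rain * (1 - p_rain)

alpha = 1.0 / (unnorm_rain + unnorm_no_rain)   # chosen so the posterior sums to 1
print("P(rain | wet, thunder) =", round(alpha * unnorm_rain, 3))   # ~0.964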
Where Do Probabilities Come From?
- Statistical sampling
- Universal principles
- Individual beliefs

Representation of Uncertain Knowledge
- Complete joint probability distribution
- Conditional probabilities and Bayes rule
  - Assuming conditional independence
- Belief networks

Belief Networks
- Nodes represent random variables
- A directed link from X to Y implies that X "directly influences" Y
- Each node has a conditional probability table (CPT) quantifying the effects that the parents (incoming links) have on the node
- The network is a DAG (no directed cycles)

Belief Networks: Example (diagram)

Belief Networks: Semantics
- The network represents the joint probability distribution:
  P(X1 = x1, ..., Xn = xn) = P(x1, ..., xn) = Π_{i=1..n} P(xi | Parents(Xi))
- The network encodes conditional independence knowledge
  - A node is conditionally independent of the other (non-descendant) nodes given its parents
  - E.g., MaryCalls and Earthquake are conditionally independent given Alarm

Belief Networks: Inference
- Given the network, compute P(Query | Evidence)
- Evidence is obtained from sensory percepts
- Possible inferences
  - Diagnostic: P(Burglary | JohnCalls) = 0.016
  - Causal: P(JohnCalls | Burglary)
  - P(Burglary | Alarm & Earthquake)

Belief Network Construction
- Choose variables
- Order variables from causes to effects
- CPTs
  - Discretize continuous variables
  - Specify each table entry
  - Define as a function (e.g., sum, Gaussian)
- Learning
  - Variables (evidential and hidden)
  - Links (causation)
  - CPTs

Combining Beliefs with Desires
- Maximum expected utility: a rational agent chooses the action maximizing expected utility
- Expected utility EU(A|E) of action A given evidence E:
  EU(A|E) = Σ_i P(Result_i(A) | E, Do(A)) * U(Result_i(A))
  - Result_i(A) are the possible outcome states after executing action A
  - U(S) is the agent's utility for state S
  - Do(A) is the proposition that action A is executed in the current state

Maximum Expected Utility
- Assumptions
  - Knowing the evidence E completely requires significant sensory information
  - P(Result | E, Do(A)) requires a complete causal model of the environment
  - U(Result) requires a complete specification of state utilities
- One-shot vs. sequential decisions

Utility Theory
- Any set of preferences over possible outcomes can be expressed by a utility function
- Lottery L = [p1,S1; p2,S2; ...; pn,Sn]
  - pi is the probability of possible outcome Si
  - Si can itself be another lottery
- Utility principle
  - U(A) > U(B): A is preferred to B
  - U(A) = U(B): the agent is indifferent between A and B
- Maximum expected utility principle
  - U([p1,S1; p2,S2; ...; pn,Sn]) = Σ_i pi * U(Si)

Utility Functions
- Possible outcomes: [1.0, $1000; 0.0, $0] vs. [0.5, $3000; 0.5, $0]
- Expected monetary value: $1000 vs. $1500
- But the choice depends on the value of the outcomes to the agent
  - S_k = state of possessing wealth $k
  - EU(accept) = 0.5 * U(S_{k+3000}) + 0.5 * U(S_k)
  - EU(decline) = U(S_{k+1000})
  - An agent will decline for some utility functions U and accept for others (see the sketch below)
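The accept/decline comparison above can be worked through numerically. The sketch below assumes a logarithmic (risk-averse) utility of total wealth and two hypothetical wealth levels; both are illustrative assumptions, not values from the slides.

# Minimal sketch of the accept/decline comparison above, assuming a
# logarithmic (risk-averse) utility of total wealth. The utility function
# and the wealth levels k are illustrative assumptions.
import math

def U(wealth_dollars):
    return math.log2(wealth_dollars)          # U(S_k) for total wealth k

def expected_utility(lottery, k):
    """lottery: list of (probability, payoff) pairs; k: current wealth."""
    return sum(p * U(k + payoff) for p, payoff in lottery)

accept  = [(0.5, 3000), (0.5, 0)]             # 50/50 chance of $3000
decline = [(1.0, 1000)]                       # sure $1000

for k in (500, 100000):                       # poorer agent vs. wealthier agent
    choice = "accept" if expected_utility(accept, k) > expected_utility(decline, k) else "decline"
    print(f"wealth ${k}: {choice}")           # poorer agent declines, wealthier agent accepts

With these assumed numbers the poorer agent declines the lottery while the wealthier agent accepts it, which is exactly the "decline for some values of U, accept for others" behavior noted above.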
Utility Functions (cont.)
- Studies show U(S_{k+n}) = log2 n
- Risk-averse agents are in the positive part of the curve
- Risk-seeking agents are in the negative part of the curve

Decision Networks
- Also called influence diagrams
- Decision networks = belief networks + actions and utilities
- They describe the agent's
  - Current state
  - Possible actions
  - State resulting from the agent's action
  - Utility of the resulting state

Example Decision Network (diagram)

Decision Network
- Chance node (oval)
  - Random variable and CPT
  - Same as a belief network node
- Decision node (rectangle)
  - Can take on a value for each possible action
- Utility node (diamond)
  - Its parents are the chance nodes affecting utility
  - Contains a utility function mapping the parents to a utility value or lottery

Evaluating Decision Networks
- Set the evidence variables according to the current state
- For each action value of the decision node:
  - Set the value of the decision node to that action
  - Use belief-net inference to calculate posteriors for the parents of the utility node
  - Calculate the utility for that action
- Return the action with the highest utility

Sequential Decision Problems
- No intermediate utility on the way to the goal
- Transition model M^a_ij: the probability of reaching state j after taking action a in state i
- Policy = a complete mapping from states to actions
- We want the policy maximizing expected utility
  - Computed from the transition model and the state utilities

Example
- P(intended direction) = 0.8
- P(right angle to intended direction) = 0.1 (each side)
- U(sequence) = terminal state's value − (1/25) * length(sequence)

Example (cont.)
- Optimal policy and utilities (diagrams)

Markov Decision Process (MDP)
- Calculates the optimal policy in a fully observable, stochastic environment with a known transition model M^a_ij
- The Markov property is satisfied: M^a_ij depends only on i, not on previous states
- Partially observable environments are addressed by POMDPs

Value Iteration for MDPs
- Iterate the following for each state i until there is little change:
  U(i) ← R(i) + max_a Σ_j M^a_ij U(j)
- R(i) is the reward for entering state i
  - -0.04 for all states except (4,3) and (4,2)
  - +1 for (4,3)
  - -1 for (4,2)
- The best policy policy*(i) is
  policy*(i) = argmax_a Σ_j M^a_ij U(j)
- (A small code sketch of this update appears after the DDN slides below.)

Reinforcement Learning
- Basically an MDP, but the policy is learned without the need for the transition model M^a_ij
- Q-learning with temporal difference
  - Assigns values Q(a,i) to action-state pairs
  - Utility U(i) = max_a Q(a,i)
  - Update Q(a,i) after each observed transition from state i to state j:
    Q(a,i) ← Q(a,i) + α (R(i) + max_a' Q(a',j) − Q(a,i))
  - Action taken in state i = argmax_a Q(a,i)

Decision-Theoretic Agent
- Given: percept (sensor) information
- Maintain: a decision network with beliefs, actions, and utilities
- Do:
  - Update probabilities for the current state
  - Compute outcome probabilities for the actions
  - Select the action with the highest expected utility
  - Return that action

Decision-Theoretic Agent
- Modeling sensors (diagram)

Sensor Modeling
- Combining evidence from multiple sensors (diagram)

Sensor Modeling
- Detailed model of the lane-position sensor (diagram)

Dynamic Belief Network (DBN)
- Reasoning over time
- Big for lots of states
- But really only need two slices at a time

Dynamic Belief Network (DBN) (diagram)

DBN for Lane Positioning (diagram)

Dynamic Decision Network (DDN) (diagram)
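As noted on the value-iteration slide, here is a small Python sketch of the update U(i) ← R(i) + max_a Σ_j M^a_ij U(j) and the policy derived from it. The toy three-state chain, its rewards, and its transition probabilities are illustrative assumptions, not the 4x3 grid world used in the slides.

# Minimal value-iteration sketch for U(i) = R(i) + max_a sum_j M[a][i][j] * U(j).
# The states, rewards, and transition model below are illustrative assumptions.

states = ["s0", "s1", "goal"]
terminal = {"goal"}
R = {"s0": -0.04, "s1": -0.04, "goal": +1.0}   # reward for entering each state

# M[a][i] maps each successor state j to the probability of reaching j
# after taking action a in state i (terminal states need no entries).
M = {
    "stay":    {"s0": {"s0": 1.0}, "s1": {"s1": 1.0}},
    "forward": {"s0": {"s1": 0.8, "s0": 0.2}, "s1": {"goal": 0.8, "s1": 0.2}},
}

def expected_value(a, i, U):
    return sum(p * U[j] for j, p in M[a][i].items())

U = {s: 0.0 for s in states}
for _ in range(50):                            # iterate until (nearly) converged
    U = {i: R[i] if i in terminal
         else R[i] + max(expected_value(a, i, U) for a in M)
         for i in states}

policy = {i: max(M, key=lambda a: expected_value(a, i, U))
          for i in states if i not in terminal}
print(U)        # roughly {'s0': 0.9, 's1': 0.95, 'goal': 1.0}
print(policy)   # 'forward' in both non-terminal states

Substituting the grid world's states and its 0.8/0.1/0.1 transition model should recover the utilities and optimal policy pictured on the Example slides.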
DDN-based Agent
- Capabilities
  - Handles uncertainty
  - Handles unexpected events (no fixed plan)
  - Handles noisy and failed sensors
  - Acts to obtain relevant information
- Needs
  - Properties from first-order logic (DDNs are propositional)
  - Goal directedness

Decision-Theoretic Agent Assessment
- Complete? No
- Correct? No
- Efficient? Better
- Natural? Yes
- Rational? Yes

Netica (www.norsys.com)
- Decision network simulator
  - Chance nodes
  - Decision nodes
  - Utility nodes
- Learns probabilities from cases

Bob Scenario in Netica (diagram)

Issues in Decision Making
- Rational agent design
  - Dynamic decision-theoretic agent
- Knowledge engineering effort
- Efficiency vs. completeness
- Monolithic vs. distributed intelligence
- Degrees of autonomy