Intelligent Environments
Computer Science and Engineering
University of Texas at Arlington
Decision-Making for
Intelligent Environments
Motivation
Techniques
Issues
Motivation
An intelligent environment acquires and
applies knowledge about you and your
surroundings in order to improve your
experience.
“acquires” → prediction
“applies” → decision making
Motivation
Why do we need decision-making?
“Improve our experience”
Usually alternative actions
Which one to take?
Example (Bob scenario: Bob is in the bedroom; what next?)
Turn on bathroom light?
Turn on kitchen light?
Turn off bedroom light?
Example
Should I turn on the bathroom light?
Issues
Inhabitant’s location (current and future)
Inhabitant’s task
Inhabitant’s preferences
Energy efficiency
Security
Other inhabitants
Qualities of a Decision Maker
Ideal
Complete: always makes a decision
Correct: decision is always right
Natural: knowledge easily expressed
Efficient
Rational
Decisions made to maximize performance
Agent-based Decision Maker
Russell & Norvig “AI: A Modern Approach”
Rational agent
Agent chooses an action to maximize its
performance based on percept sequence
Agent Types
Reflex agent
Reflex agent with state
Goal-based agent
Utility-based agent
Reflex Agent
Reflex Agent with State
Goal-based Agent
Utility-based Agent
Intelligent Environments
Decision-Making Techniques
Decision-Making Techniques
Logic
Planning
Decision theory
Markov decision process
Reinforcement learning
Logical Decision Making
If Equal(?Day,Monday)
& GreaterThan(?CurrentTime,0600)
& LessThan(?CurrentTime,0700)
& Location(Bob,bedroom,?CurrentTime)
& Increment(?CurrentTime,?NextTime)
Then Location(Bob,bathroom,?NextTime)
Query: Location(Bob,?Room,0800)
Logical Decision Making
Rules and facts
Inference mechanism
First-order predicate logic
Deduction: {A, A → B} ⊢ B
Systems
Prolog (PROgramming in LOGic)
OTTER Theorem Prover
Prolog
location(bob,bathroom,NextTime) :-
    dayofweek(Day),
    Day = monday,
    currenttime(CurrentTime),
    CurrentTime > 0600,
    CurrentTime < 0700,
    location(bob,bedroom,CurrentTime),
    increment(CurrentTime,NextTime).
Facts: dayofweek(monday), ...
Query: location(bob,Room,0800).
OTTER
(all d all t1 all t2
((DayofWeek(d) & Equal(d,Monday) &
CurrentTime(t1) &
GreaterThan(t1,0600) &
LessThan(t1,0700) & NextTime(t1,t2)
& Location(Bob,Bedroom,t1)) ->
Location(Bob,Bathroom,t2))).
Facts: DayofWeek(Monday), ...
Query: (exists r (Location(Bob,r,0800)))
Actions
If Location(Bob,Bathroom,t1)
Then Action(TurnOnBathRoomLight,t1)
Preferences among actions
If RecommendedAction(a1,t1) &
RecommendedAction(a2,t1) &
ActionPriority(a1) > ActionPriority(a2)
Then Action(a1,t1)
Persistence Over Time
If Location(Bob,room1,t1)
& not Move(Bob,t1)
& NextTime(t1,t2)
Then Location(Bob,room1,t2)
One such persistence rule is needed for each attribute of Bob!
Logical Decision Making
Assessment
Complete? Yes
Correct? Yes
Efficient? No
Natural? No
Rational?
Decision Making as Planning
Search for a sequence of actions to
achieve some goal
Requires
Initial state of the environment
Goal state
Actions (operators)
Conditions
Effects (implied connection to effectors)
Example
Initial: location(Bob,Bathroom) & light(Bathroom,off)
Goal: happy(Bob)
Action 1
Condition: location(Bob,?r) & light(?r,on)
Effect: Add: happy(Bob)
Action 2
Condition: light(?r,off)
Effect: Delete: light(?r,off); Add: light(?r,on)
Plan: Action 2, Action 1
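A small forward state-space search sketch that derives this plan. The STRIPS-style action descriptions below are grounded to the bathroom for simplicity, and the breadth-first strategy and action names are illustrative choices, not something the slide specifies.

# Grounded STRIPS-style versions of the two actions above.
ACTIONS = {
    "Action 2 (turn light on)": {
        "pre": {("light", "Bathroom", "off")},
        "add": {("light", "Bathroom", "on")},
        "del": {("light", "Bathroom", "off")},
    },
    "Action 1 (make Bob happy)": {
        "pre": {("location", "Bob", "Bathroom"), ("light", "Bathroom", "on")},
        "add": {("happy", "Bob")},
        "del": set(),
    },
}
INITIAL = frozenset({("location", "Bob", "Bathroom"), ("light", "Bathroom", "off")})
GOAL = {("happy", "Bob")}

def plan(initial, goal):
    """Breadth-first search over world states; returns a sequence of action names."""
    frontier, seen = [(initial, [])], {initial}
    while frontier:
        state, steps = frontier.pop(0)
        if goal <= state:
            return steps
        for name, act in ACTIONS.items():
            if act["pre"] <= state:
                nxt = frozenset((state - act["del"]) | act["add"])
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None

print(plan(INITIAL, GOAL))   # ['Action 2 (turn light on)', 'Action 1 (make Bob happy)']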
Requirements
Where do goals come from?
System design
Users
Where do actions come from?
Device “drivers”
Learned macros
E.g., SecureHome action
Planning Systems
UCPOP (Univ. of Washington)
Partial Order Planner with Universal quantification and Conditional effects
GraphPlan (CMU)
Builds and prunes graph of possible plans
GraphPlan Example
(:action lighton
  :parameters (?r)
  :precondition (light ?r off)
  :effect (and (light ?r on)
               (not (light ?r off))))
Planning
Assessment
Complete? Yes
Correct? Yes
Efficient? No
Natural? Better
Rational?
Decision Theory
Logical and planning approaches typically
assume no uncertainty
Decision theory = probability theory + utility
theory
Maximum Expected Utility principle
Rational agent chooses actions yielding highest
expected utility
Averaged over all possible action outcomes
Weight utility of an outcome by its probability of
occurring
Probability Theory
Random variables: X, Y, …
Prior probability: P(X)
Conditional probability: P(X|Y)
Joint probability distribution
P(X1,…,Xn) is an n-dimensional table of
probabilities
Complete table allows computation of any
probability
Complete table typically infeasible
Probability Theory
Bayes rule
P(X|Y) = P(Y|X) * P(X) / P(Y)
Example
P(rain|wet) = P(wet|rain) * P(rain) / P(wet)
More likely to know P(wet|rain)
In general, P(X|Y) = α * P(Y|X) * P(X)
α chosen so that Σx P(X=x|Y) = 1
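As a quick illustration of the rule above, here is a minimal Python sketch that recovers P(rain|wet) from P(wet|rain); all of the probabilities are invented for the example.

# Bayes' rule sketch; the probabilities below are made-up illustrative values.
p_rain = 0.3                  # prior P(rain)
p_wet_given_rain = 0.9        # P(wet | rain)
p_wet_given_no_rain = 0.1     # P(wet | no rain)

# Evidence term by total probability: P(wet) = sum_r P(wet | r) * P(r)
p_wet = p_wet_given_rain * p_rain + p_wet_given_no_rain * (1 - p_rain)

# Bayes' rule: P(rain | wet) = P(wet | rain) * P(rain) / P(wet)
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet
print(round(p_rain_given_wet, 3))   # 0.794

# Equivalent normalization form: P(rain | wet) = alpha * P(wet | rain) * P(rain)
alpha = 1.0 / p_wet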
Bayes Rule (cont.)
How to compute P(rain|wet & thunder)
P(r | w & t) = P(w & t | r) * P(r) / P(w & t)
We may know P(w & t | r), but this becomes tedious as the evidence increases
Conditional independence of evidence
Thunder does not cause wet, and vice versa
P(r | w & t) = α * P(w|r) * P(t|r) * P(r)
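The same normalization trick extends to several pieces of evidence once they are assumed conditionally independent given rain; again, the numbers below are invented.

# P(rain | wet, thunder) = alpha * P(wet | rain) * P(thunder | rain) * P(rain),
# assuming wet and thunder are conditionally independent given rain.
prior = {True: 0.3, False: 0.7}         # P(rain), made-up values
p_wet = {True: 0.9, False: 0.1}         # P(wet | rain)
p_thunder = {True: 0.6, False: 0.05}    # P(thunder | rain)

unnormalized = {r: p_wet[r] * p_thunder[r] * prior[r] for r in (True, False)}
alpha = 1.0 / sum(unnormalized.values())
posterior = {r: alpha * u for r, u in unnormalized.items()}
print(round(posterior[True], 3))        # P(rain | wet & thunder) = 0.979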
Where Do Probabilities Come
From?
Statistical sampling
Universal principles
Individual beliefs
Representation of Uncertain
Knowledge
Complete joint probability distribution
Conditional probabilities and Bayes rule
Assuming conditional independence
Belief networks
Belief Networks
Nodes represent random variables
Directed link between X and Y implies
that X “directly influences” Y
Each node has a conditional probability
table (CPT) quantifying the effects that
the parents (incoming links) have on
the node
Network is a DAG (no directed cycles)
Belief Networks: Example
Belief Networks: Semantics
Network represents the joint probability
distribution
P(X1 = x1, ..., Xn = xn) = P(x1, ..., xn) = Πi=1..n P(xi | Parents(Xi))
Network encodes conditional independence knowledge
Each node is conditionally independent of its non-descendants given its parents
E.g., MaryCalls and Earthquake are conditionally independent given Alarm
Belief Networks: Inference
Given network, compute
P(Query | Evidence)
Evidence obtained from sensory percepts
Possible inferences
Diagnostic: P(Burglary | JohnCalls) = 0.016
Causal: P(JohnCalls | Burglary)
Intercausal (explaining away): P(Burglary | Alarm & Earthquake)
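A brute-force sketch of these inferences on the classic burglary network: the joint distribution is built from the CPTs exactly as in the factorization on the previous slide, and the diagnostic query is answered by summing out the hidden variables. The CPT numbers are the standard textbook values, assumed here because the slide only shows the resulting probability.

from itertools import product

# Burglary network CPTs (standard textbook values, assumed here).
P_B = {True: 0.001, False: 0.999}                      # P(Burglary)
P_E = {True: 0.002, False: 0.998}                      # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}     # P(Alarm=true | B, E)
P_J = {True: 0.90, False: 0.05}                        # P(JohnCalls=true | Alarm)
P_M = {True: 0.70, False: 0.01}                        # P(MaryCalls=true | Alarm)

def joint(b, e, a, j, m):
    # Product of one CPT entry per node, as in the factorization above.
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# Diagnostic query P(Burglary=true | JohnCalls=true): sum out E, A, M.
num = sum(joint(True, e, a, True, m) for e, a, m in product([True, False], repeat=3))
den = sum(joint(b, e, a, True, m) for b, e, a, m in product([True, False], repeat=4))
print(round(num / den, 3))   # 0.016, matching the value on this slide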
Belief Network Construction
Choose variables
Order variables from causes to effects
CPTs
Discretize continuous variables
Specify each table entry
Define as a function (e.g., sum, Gaussian)
Learning
Variables (evidential and hidden)
Links (causation)
CPTs
Combining Beliefs with Desires
Maximum expected utility
Rational agent chooses action maximizing
expected utility
Expected utility EU(A|E) of action A given
evidence E
EU(A|E) = Σi P(Resulti(A) | E, Do(A)) * U(Resulti(A))
Resulti(A) are possible outcome states after executing
action A
U(S) is the agent’s utility for state S
Do(A) is the proposition that action A is executed in the
current state
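A minimal sketch of the MEU rule above, using the Bob scenario for flavor; the candidate actions, outcome probabilities, and utilities are all invented for illustration.

# EU(A|E) = sum_i P(Result_i(A) | E, Do(A)) * U(Result_i(A)); pick the best A.
outcomes = {                                  # P(result | E, Do(action)), made up
    "turn_on_bathroom_light": {"bob_happy": 0.7, "bob_annoyed": 0.3},
    "do_nothing":             {"bob_happy": 0.2, "bob_annoyed": 0.8},
}
utility = {"bob_happy": 10.0, "bob_annoyed": -5.0}     # U(result), made up

def expected_utility(action):
    return sum(p * utility[result] for result, p in outcomes[action].items())

best = max(outcomes, key=expected_utility)
print(best, expected_utility(best))            # turn_on_bathroom_light 5.5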
Maximum Expected Utility
Assumptions
Knowing evidence E completely requires
significant sensory information
P(Result | E, Do(A)) requires complete
causal model of the environment
U(Result) requires complete specification of
state utilities
One-shot vs. sequential decisions
Utility Theory
Any set of preferences over possible outcomes can be expressed by a utility function
Lottery L = [p1,S1; p2,S2; ...; pn,Sn]
pi is the probability of possible outcome Si
Si can be another lottery
Utility principle
U(A) > U(B) ⇔ A preferred to B
U(A) = U(B) ⇔ agent indifferent between A and B
Maximum expected utility principle
U([p1,S1; p2,S2; ...; pn,Sn]) = Σi pi * U(Si)
Utility Functions
Possible outcomes as lotteries
Decline the gamble: [1.0, $1000; 0.0, $0]
Accept the gamble: [0.5, $3000; 0.5, $0]
Expected monetary value: $1000 vs. $1500
But the decision depends on the utility of wealth $k
Sk = state of possessing wealth $k
EU(accept) = 0.5 * U(Sk+3000) + 0.5 * U(Sk)
EU(decline) = U(Sk+1000)
Will decline for some values of U, accept for
others
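To make that flip concrete, here is a small sketch comparing EU(accept) and EU(decline) under an assumed concave utility of wealth (a log2 curve, in the spirit of the next slide); the wealth levels are arbitrary.

import math

def U(wealth):
    # Assumed risk-averse utility of total wealth (illustrative, not from the slide).
    return math.log2(wealth)

def eu_accept(k):    # 50/50 gamble: $3000 or nothing
    return 0.5 * U(k + 3000) + 0.5 * U(k)

def eu_decline(k):   # sure $1000
    return U(k + 1000)

for k in (100, 1_000_000):
    choice = "accept" if eu_accept(k) > eu_decline(k) else "decline"
    print(k, choice)   # the poor agent declines, the wealthy agent accepts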
Utility Functions (cont.)
Studies show U(Sk+n) = log2n
Risk-averse agents in positive part of curve
Risk-seeking agents in negative part of curve
Decision Networks
Also called influence diagrams
Decision networks = belief networks +
actions and utilities
Describes agent’s
Current state
Possible actions
State resulting from agent’s action
Utility of resulting state
Example Decision Network
Decision Network
Chance node (oval)
Random variable and CPT
Same as belief network node
Decision node (rectangle)
Can take on a value for each possible action
Utility node (diamond)
Parents are those chance nodes affecting utility
Contains utility function mapping parents to utility value or lottery
Evaluating Decision Networks
Set evidence variables according to
current state
For each action value of decision node
Set value of decision node to action
Use belief-net inference to calculate
posteriors for parents of utility node
Calculate utility for action
Return action with highest utility
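The loop above can be written down directly; in this sketch, infer_posteriors and utility_fn are hypothetical placeholders, since the slide does not fix a particular network.

def evaluate_decision_network(actions, evidence, infer_posteriors, utility_fn):
    """Return the action with the highest expected utility.

    infer_posteriors(evidence, action) -> {outcome: probability} for the parents
    of the utility node (a stand-in for belief-net inference);
    utility_fn(outcome) -> numeric utility of that outcome.
    """
    def expected_utility(action):
        posteriors = infer_posteriors(evidence, action)
        return sum(p * utility_fn(outcome) for outcome, p in posteriors.items())

    return max(actions, key=expected_utility)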
Sequential Decision Problems
No intermediate utility on the way to the goal
Transition model M^a_ij
Probability of reaching state j after taking action a in state i
Policy = complete mapping from states to
actions
Want policy maximizing expected utility
Computed from transition model and state utilities
Example
P(move in intended direction) = 0.8
P(move at right angle to intended direction) = 0.1 each
U(sequence) = terminal state’s value - (1/25)*length(sequence)
Example (cont.)
Optimal Policy
Utilities
Markov Decision Process
(MDP)
Calculating optimal policy in a fully-observable, stochastic environment with known transition model M^a_ij
Markov property satisfied: M^a_ij depends only on i, not on previous states
Partially-observable environments addressed by POMDPs
Value Iteration for MDPs
Iterate the following for each state i until little change:
U(i) ← R(i) + max_a Σ_j M^a_ij * U(j)
R(i) is the reward for entering state i
-0.04 for all states except (4,3) and (4,2)
+1 for (4,3)
-1 for (4,2)
Best policy policy*(i) is
policy*(i) = argmax_a Σ_j M^a_ij * U(j)
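A sketch of value iteration on the 4x3 grid world from the example slides, using the 0.8/0.1/0.1 transition model and the rewards listed above. The wall at (2,2), bouncing off walls and edges, and the stopping threshold are assumptions carried over from the standard textbook version of this example.

ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
PERP = {"up": ("left", "right"), "down": ("left", "right"),
        "left": ("up", "down"), "right": ("up", "down")}
WALL = {(2, 2)}                                  # assumed obstacle
TERMINALS = {(4, 3): 1.0, (4, 2): -1.0}
STATES = [(x, y) for x in range(1, 5) for y in range(1, 4) if (x, y) not in WALL]

def reward(s):
    return TERMINALS.get(s, -0.04)

def move(s, a):
    """Deterministic effect of heading in direction a (bumping into walls/edges = stay put)."""
    nxt = (s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1])
    return s if nxt in WALL or not (1 <= nxt[0] <= 4 and 1 <= nxt[1] <= 3) else nxt

def transitions(s, a):
    """M^a_ij as {next_state: probability}: 0.8 intended, 0.1 each right angle."""
    probs = {}
    for direction, p in ((a, 0.8), (PERP[a][0], 0.1), (PERP[a][1], 0.1)):
        nxt = move(s, direction)
        probs[nxt] = probs.get(nxt, 0.0) + p
    return probs

def value_iteration(eps=1e-4):
    U = {s: 0.0 for s in STATES}
    while True:
        delta, new_U = 0.0, {}
        for s in STATES:
            if s in TERMINALS:
                new_U[s] = reward(s)             # terminal states keep their value
            else:                                # U(i) = R(i) + max_a sum_j M^a_ij U(j)
                new_U[s] = reward(s) + max(
                    sum(p * U[j] for j, p in transitions(s, a).items()) for a in ACTIONS)
            delta = max(delta, abs(new_U[s] - U[s]))
        U = new_U
        if delta < eps:
            return U

def best_policy(U):
    """policy*(i) = argmax_a sum_j M^a_ij U(j)."""
    return {s: max(ACTIONS, key=lambda a: sum(p * U[j] for j, p in transitions(s, a).items()))
            for s in STATES if s not in TERMINALS}

U = value_iteration()
print(round(U[(1, 1)], 2), best_policy(U)[(1, 1)])   # start-state utility (about 0.7) and action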
Reinforcement Learning
Basically an MDP, but learns a policy without the need for the transition model M^a_ij
Q-learning with temporal difference
Assigns values Q(a,i) to action-state pairs
Utility U(i) = max_a Q(a,i)
Update Q(a,i) after each observed transition from state i to state j:
Q(a,i) = Q(a,i) + α * (R(i) + max_a' Q(a',j) - Q(a,i))
Action in state i = argmax_a Q(a,i)
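A minimal temporal-difference Q-learning sketch matching the update above; the learning rate value and the epsilon-greedy exploration are assumptions, since the slide does not specify them.

import random
from collections import defaultdict

Q = defaultdict(float)      # Q(a, i), initially 0 for every action-state pair
ALPHA = 0.1                 # learning rate (assumed value)
EPSILON = 0.1               # exploration rate (an addition, not on the slide)

def choose_action(state, actions):
    """Mostly argmax_a Q(a, state), with occasional random exploration."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(a, state)])

def q_update(state, action, reward, next_state, actions):
    """Q(a,i) += alpha * (R(i) + max_a' Q(a',j) - Q(a,i)), as on this slide."""
    best_next = max(Q[(a, next_state)] for a in actions)
    Q[(action, state)] += ALPHA * (reward + best_next - Q[(action, state)])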
Decision-Theoretic Agent
Given
Percept (sensor) information
Maintain
Decision network with beliefs, actions and utilities
Do
Update probabilities for current state
Compute outcome probabilities for actions
Select action with highest expected utility
Return action
Decision-Theoretic Agent
Modeling sensors
Sensor Modeling
Combining evidence from multiple
sensors
Sensor Modeling
Detailed model of lane-position sensor
Dynamic Belief Network (DBN)
Reasoning over time
Unrolled network becomes big for many states and time steps
But really only need two slices at a time
Dynamic Belief Network (DBN)
DBN for Lane Positioning
Dynamic Decision Network
(DDN)
DDN-based Agent
Capabilities
Handles uncertainty
Handles unexpected events (no fixed plan)
Handles noisy and failed sensors
Acts to obtain relevant information
Needs
Properties from first-order logic
DDNs are propositional
Goal directedness
Decision-Theoretic Agent
Assessment
Complete? No
Correct? No
Efficient? Better
Natural? Yes
Rational? Yes
Netica
www.norsys.com
Decision network simulator
Chance nodes
Decision nodes
Utility nodes
Learns probabilities from cases
Bob Scenario in Netica
Issues in Decision Making
Rational agent design
Dynamic decision-theoretic agent
Knowledge engineering effort
Efficiency vs. completeness
Monolithic vs. distributed intelligence
Degrees of autonomy