Intelligent Environments
Computer Science and Engineering
University of Texas at Arlington
Decision-Making for
Intelligent Environments



Motivation
Techniques
Issues
Motivation

An intelligent environment acquires and
applies knowledge about you and your
surroundings in order to improve your
experience.


“acquires” → prediction
“applies” → decision making
Motivation

Why do we need decision-making?


“Improve our experience”
Usually alternative actions


Which one to take?
Example (Bob scenario: bedroom → ?)



Turn on bathroom light?
Turn on kitchen light?
Turn off bedroom light?
Example


Should I turn on the bathroom light?
Issues






Inhabitant’s location (current and future)
Inhabitant’s task
Inhabitant’s preferences
Energy efficiency
Security
Other inhabitants
Qualities of a Decision Maker

Ideal





Complete: always makes a decision
Correct: decision is always right
Natural: knowledge easily expressed
Efficient
Rational

Decisions made to maximize performance
Agent-based Decision Maker


Russell & Norvig “AI: A Modern Approach”
Rational agent

Agent chooses an action to maximize its
performance based on percept sequence
Agent Types




Reflex agent
Reflex agent with state
Goal-based agent
Utility-based agent
Reflex Agent
Reflex Agent with State
Goal-based Agent
Utility-based Agent
Intelligent Environments
Decision-Making Techniques
Decision-Making Techniques





Logic
Planning
Decision theory
Markov decision process
Reinforcement learning
Logical Decision Making


If Equal(?Day,Monday)
& GreaterThan(?CurrentTime,0600)
& LessThan(?CurrentTime,0700)
& Location(Bob,bedroom,?CurrentTime)
& Increment(?CurrentTime,?NextTime)
Then Location(Bob,bathroom,?NextTime)
Query: Location(Bob,?Room,0800)
Logical Decision Making

Rules and facts


Inference mechanism


First-order predicate logic
Deduction: {A, A → B} ⊢ B
Systems


Prolog (PROgramming in LOGic)
OTTER Theorem Prover
Prolog



location(bob,bathroom,NextTime) :-
    dayofweek(Day),
    Day = monday,
    currenttime(CurrentTime),
    CurrentTime > 0600,
    CurrentTime < 0700,
    location(bob,bedroom,CurrentTime),
    increment(CurrentTime,NextTime).
Facts: dayofweek(monday), ...
Query: location(bob,Room,0800).
OTTER



(all d all t1 all t2
((DayofWeek(d) & Equal(d,Monday) &
CurrentTime(t1) &
GreaterThan(t1,0600) &
LessThan(t1,0700) & NextTime(t1,t2)
& Location(Bob,Bedroom,t1)) ->
Location(Bob,Bathroom,t2))).
Facts: DayofWeek(Monday), ...
Query: (exists r (Location(Bob,r,0800)))
Actions


If Location(Bob,Bathroom,t1)
Then Action(TurnOnBathRoomLight,t1)
Preferences among actions

If RecommendedAction(a1,t1) &
RecommendedAction(a2,t1) &
ActionPriority(a1) > ActionPriority(a2)
Then Action(a1,t1)
Persistence Over Time


If Location(Bob,room1,t1)
& not Move(Bob,t1)
& NextTime(t1,t2)
Then Location(Bob,room1,t2)
One for each attribute of Bob!
Logical Decision Making

Assessment





Complete? Yes
Correct? Yes
Efficient? No
Natural? No
Rational?
Decision Making as Planning


Search for a sequence of actions to
achieve some goal
Requires



Initial state of the environment
Goal state
Actions (operators)


Conditions
Effects (implied connection to effectors)
Example



Initial: location(Bob,Bathroom) & light(Bathroom,off)
Goal: happy(Bob)
Action 1
  Condition: location(Bob,?r) & light(?r,on)
  Effect: Add: happy(Bob)
Action 2
  Condition: light(?r,off)
  Effect: Delete: light(?r,off), Add: light(?r,on)
Plan: Action 2, Action 1 (a small executable check of this plan follows below)
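To make the example concrete, here is a minimal Python sketch (not from the slides) that represents the two actions as STRIPS-style add/delete operations and checks that the plan [Action 2, Action 1] achieves the goal; the fact tuples and helper names are illustrative assumptions.

initial = {("location", "Bob", "Bathroom"), ("light", "Bathroom", "off")}
goal = {("happy", "Bob")}

def action1(state, r):
    # Condition: location(Bob, r) & light(r, on); Effect: Add happy(Bob)
    if ("location", "Bob", r) in state and ("light", r, "on") in state:
        return state | {("happy", "Bob")}
    return None

def action2(state, r):
    # Condition: light(r, off); Effect: Delete light(r, off), Add light(r, on)
    if ("light", r, "off") in state:
        return (state - {("light", r, "off")}) | {("light", r, "on")}
    return None

state = action2(initial, "Bathroom")   # first turn the light on ...
state = action1(state, "Bathroom")     # ... then Bob is happy
print(goal <= state)                   # True: the plan achieves the goal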
Requirements

Where do goals come from?



System design
Users
Where do actions come from?


Device “drivers”
Learned macros

E.g., SecureHome action
Planning Systems

UCPOP (Univ. of Washington)


Partial Order Planner with Universal
quantification and Conditional effects
GraphPlan (CMU)

Builds and prunes graph of possible plans
GraphPlan Example
(:action lighton
  :parameters (?r)
  :precondition (light ?r off)
  :effect (and (light ?r on)
               (not (light ?r off))))
Planning

Assessment





Complete? Yes
Correct? Yes
Efficient? No
Natural? Better
Rational?
Decision Theory



Logical and planning approaches typically
assume no uncertainty
Decision theory = probability theory + utility
theory
Maximum Expected Utility principle

Rational agent chooses actions yielding highest
expected utility


Averaged over all possible action outcomes
Weight utility of an outcome by its probability of
occurring
Probability Theory




Random variables: X, Y, …
Prior probability: P(X)
Conditional probability: P(X|Y)
Joint probability distribution



P(X1,…,Xn) is an n-dimensional table of
probabilities
Complete table allows computation of any
probability
Complete table typically infeasible
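As a small illustration of why the complete joint is so powerful, the sketch below (hypothetical numbers) computes marginals and a conditional from a two-variable joint table; with n Boolean variables the table needs 2^n entries, which is why it is usually infeasible.

joint = {                     # P(Rain = r, Wet = w)
    (True, True): 0.08, (True, False): 0.02,
    (False, True): 0.10, (False, False): 0.80,
}

p_rain = sum(p for (r, w), p in joint.items() if r)    # marginal P(Rain) = 0.10
p_wet = sum(p for (r, w), p in joint.items() if w)     # marginal P(Wet) = 0.18
p_rain_given_wet = joint[(True, True)] / p_wet         # conditional P(Rain | Wet)
print(p_rain, p_wet, round(p_rain_given_wet, 3))       # 0.1 0.18 0.444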
Probability Theory

Bayes rule
  P(X|Y) = P(Y|X) * P(X) / P(Y)

Example

  P(rain|wet) = P(wet|rain) * P(rain) / P(wet)


More likely to know P(wet|rain)
In general, P(X|Y) = α * P(Y|X) * P(X)

  α chosen so that Σ_X P(X|Y) = 1 (see the worked sketch below)
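A worked sketch of Bayes rule with the normalization constant α, using illustrative probabilities (not values from the slides):

p_wet_given_rain = 0.90      # P(wet | rain), typically easier to estimate
p_wet_given_dry = 0.10       # P(wet | no rain)
p_rain = 0.20                # prior P(rain)

# P(rain | wet) = alpha * P(wet | rain) * P(rain), for each value of the query variable
unnormalized = {
    "rain": p_wet_given_rain * p_rain,
    "no rain": p_wet_given_dry * (1 - p_rain),
}
alpha = 1.0 / sum(unnormalized.values())     # alpha chosen so the posterior sums to 1
posterior = {v: alpha * p for v, p in unnormalized.items()}
print(round(posterior["rain"], 3))           # 0.692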
Bayes Rule (cont.)

How to compute P(rain|wet & thunder)



P(r | w & t) = P(w & t | r) * P(r) / P(w & t)
Know P(w & t | r) possibly, but tedious as
evidence increases
Conditional independence of evidence


Thunder does not cause wet, and vice
versa
P(r | w & t) = α * P(w|r) * P(t|r) * P(r)
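The same normalization trick extends to several conditionally independent evidence variables, as sketched below with assumed numbers: each evidence variable contributes one factor per hypothesis.

prior = {"rain": 0.20, "no rain": 0.80}
p_wet_given = {"rain": 0.90, "no rain": 0.10}
p_thunder_given = {"rain": 0.40, "no rain": 0.01}

# Each conditionally independent evidence variable adds one factor P(e | hypothesis)
unnormalized = {h: p_wet_given[h] * p_thunder_given[h] * prior[h] for h in prior}
alpha = 1.0 / sum(unnormalized.values())
print({h: round(alpha * p, 3) for h, p in unnormalized.items()})
# {'rain': 0.989, 'no rain': 0.011}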
Where Do Probabilities Come
From?



Statistical sampling
Universal principles
Individual beliefs
Representation of Uncertain
Knowledge


Complete joint probability distribution
Conditional probabilities and Bayes rule


Assuming conditional independence
Belief networks
Belief Networks




Nodes represent random variables
Directed link between X and Y implies
that X “directly influences” Y
Each node has a conditional probability
table (CPT) quantifying the effects that
the parents (incoming links) have on
the node
Network is a DAG (no directed cycles)
Belief Networks: Example
Belief Networks: Semantics

Network represents the joint probability
distribution
  P(X1 = x1, ..., Xn = xn) = P(x1, ..., xn) = Π_(i=1..n) P(xi | Parents(Xi))

Network encodes conditional independence
knowledge


A node is conditionally independent of its non-descendants given its parents
E.g., MaryCalls and Earthquake are conditionally independent given Alarm
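The sketch below illustrates the product semantics on the burglary network referenced on the following slides; the CPT numbers are the standard textbook values (Russell & Norvig) and are assumptions here rather than values given in these slides.

p_burglary, p_earthquake = 0.001, 0.002
p_alarm = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}   # P(Alarm | B, E)
p_john = {True: 0.90, False: 0.05}                       # P(JohnCalls | Alarm)
p_mary = {True: 0.70, False: 0.01}                       # P(MaryCalls | Alarm)

# P(john, mary, alarm, no burglary, no earthquake):
# each node contributes one factor P(node | parents)
p = (p_john[True] * p_mary[True] * p_alarm[(False, False)]
     * (1 - p_burglary) * (1 - p_earthquake))
print(round(p, 6))   # 0.000628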
Belief Networks: Inference

Given network, compute
P(Query | Evidence)


Evidence obtained from sensory percepts
Possible inferences



Diagnostic: P(Burglary | JohnCalls) = 0.016
Causal: P(JohnCalls | Burglary)
P(Burglary | Alarm & Earthquake)
Belief Network Construction

Choose variables



Order variables from causes to effects
CPTs



Discretize continuous variables
Specify each table entry
Define as a function (e.g., sum, Gaussian)
Learning



Variables (evidential and hidden)
Links (causation)
CPTs
Combining Beliefs with Desires

Maximum expected utility


Rational agent chooses action maximizing
expected utility
Expected utility EU(A|E) of action A given
evidence E

EU(A|E) = Σi P(Resulti(A) | E, Do(A)) * U(Resulti(A))



Resulti(A) are possible outcome states after executing
action A
U(S) is the agent’s utility for state S
Do(A) is the proposition that action A is executed in the
current state
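A minimal sketch of maximum-expected-utility action selection; the outcome probabilities and utilities below are illustrative assumptions for the Bob scenario.

def expected_utility(outcomes):
    # outcomes: list of (P(Result_i(A) | E, Do(A)), U(Result_i(A))) pairs
    return sum(p * u for p, u in outcomes)

actions = {
    "turn on bathroom light": [(0.8, 10.0), (0.2, -1.0)],  # Bob goes there / stays put
    "do nothing": [(1.0, 0.0)],
}
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, expected_utility(actions[best]))   # the rational (MEU) choice, EU = 7.8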
Maximum Expected Utility

Assumptions




Knowing evidence E completely requires
significant sensory information
P(Result | E, Do(A)) requires complete
causal model of the environment
U(Result) requires complete specification of
state utilities
One-shot vs. sequential decisions
Utility Theory


Any set of preferences over possible outcomes can be
expressed by a utility function
Lottery L = [p1,S1; p2,S2; ...; pn,Sn]
  pi is the probability of possible outcome Si
  Si can be another lottery
Utility principle
  U(A) > U(B) ⇔ A preferred to B
  U(A) = U(B) ⇔ agent indifferent to A and B
Maximum expected utility principle
  U([p1,S1; p2,S2; ...; pn,Sn]) = Σi pi * U(Si)
Utility Functions

Possible outcomes
  [1.0, $1000; 0.0, $0]
  [0.5, $3000; 0.5, $0]
Expected monetary value
  $1000 vs. $1500
But depends on the value of wealth $k
  Sk = state of possessing wealth $k
  EU(accept) = 0.5 * U(Sk+3000) + 0.5 * U(Sk)
  EU(decline) = U(Sk+1000)
  Will decline for some values of U, accept for others (see the sketch below)
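The sketch below makes the accept/decline example concrete under an assumed concave (risk-averse) utility of wealth; the log2 utility and the wealth levels are illustrative only.

import math

def U(wealth):
    return math.log2(wealth)     # assumed utility of possessing $wealth

def decision(k):
    eu_accept = 0.5 * U(k + 3000) + 0.5 * U(k)   # the 50/50 gamble for $3000
    eu_decline = U(k + 1000)                     # the sure $1000
    return "accept" if eu_accept > eu_decline else "decline"

print(decision(100))        # decline: with little wealth the sure $1000 wins
print(decision(1000000))    # accept: with large wealth the higher EMV wins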
Utility Functions (cont.)

Studies show U(Sk+n) = log2n


Risk-averse agents in positive part of curve
Risk-seeking agents in negative part of curve
Decision Networks



Also called influence diagrams
Decision networks = belief networks +
actions and utilities
Describes agent’s




Current state
Possible actions
State resulting from agent’s action
Utility of resulting state
Example Decision Network
Decision Network

Chance node (oval)



Decision node (rectangle)


Random variable and CPT
Same as belief network node
Can take on a value for each possible action
Utility node (diamond)


Parents are those chance nodes affecting utility
Contains utility function mapping parents to utility
value or lottery
Evaluating Decision Networks


Set evidence variables according to
current state
For each action value of decision node




Set value of decision node to action
Use belief-net inference to calculate
posteriors for parents of utility node
Calculate utility for action
Return action with highest utility
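A concrete miniature of this evaluation loop, assuming a tiny network with one chance parent of the utility node; every number is illustrative, and a real system would obtain the posterior by belief-network inference given the evidence and the chosen decision value.

# Posterior over the utility node's chance parent, P(NeedsLight | evidence)
p_needs_light = {True: 0.7, False: 0.3}

# Utility table over (NeedsLight, decision value)
utility = {(True, "light on"): 10, (True, "light off"): -10,
           (False, "light on"): -2, (False, "light off"): 0}

def expected_utility(action):
    return sum(p_needs_light[n] * utility[(n, action)] for n in (True, False))

best = max(("light on", "light off"), key=expected_utility)
print(best, expected_utility(best))   # "light on", EU = 6.4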
Sequential Decision Problems


No intermediate utility on the way to the goal
Transition model M^a_ij


Probability of reaching state j after taking action a
in state i
Policy = complete mapping from states to
actions


Want policy maximizing expected utility
Computed from transition model and state utilities
Example



P(intended direction) = 0.8
P(right angle to intended) = 0.1
U(sequence) = terminal state’s value - (1/25)*length(sequence)
Example (cont.)
Optimal policy and utilities (figures)
Markov Decision Process
(MDP)


Calculating optimal policy in fullyobservable, stochastic environment with
known transition model M ija
Markov property satisfied

  M^a_ij depends only on i, not on previous states
Partially-observable environments
addressed by POMDPs
Value Iteration for MDPs

Iterate the following for each state i until little change:

  U(i) ← R(i) + max_a Σ_j M^a_ij * U(j)

R(i) is the reward for entering state i

  -0.04 for all states except (4,3) and (4,2)
  +1 for (4,3)
  -1 for (4,2)

Best policy policy*(i) is

  policy*(i) = argmax_a Σ_j M^a_ij * U(j)
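A compact value-iteration sketch for the 4x3 grid-world example: the 0.8/0.1/0.1 transition noise and the rewards follow the slides, while the grid layout (wall at (2,2), start at (1,1)) is the standard textbook setup and is assumed here.

states = [(c, r) for c in range(1, 5) for r in range(1, 4) if (c, r) != (2, 2)]
terminals = {(4, 3): 1.0, (4, 2): -1.0}
R = {s: terminals.get(s, -0.04) for s in states}         # reward for entering s
moves = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
side = {"up": ("left", "right"), "down": ("left", "right"),
        "left": ("up", "down"), "right": ("up", "down")}

def step(s, a):
    nxt = (s[0] + moves[a][0], s[1] + moves[a][1])
    return nxt if nxt in states else s                    # blocked by wall/edge: stay put

def transitions(s, a):                                    # M^a_ij as (probability, j) pairs
    l, r = side[a]
    return [(0.8, step(s, a)), (0.1, step(s, l)), (0.1, step(s, r))]

U = {s: 0.0 for s in states}
for _ in range(100):                                      # iterate until little change
    U = {s: R[s] if s in terminals else
            R[s] + max(sum(p * U[j] for p, j in transitions(s, a)) for a in moves)
         for s in states}

policy = {s: max(moves, key=lambda a: sum(p * U[j] for p, j in transitions(s, a)))
          for s in states if s not in terminals}
print(round(U[(1, 1)], 3), policy[(1, 1)])                # utility and best action at the start state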
Reinforcement Learning


Basically MDP, but learns policy without the
a
need for transition model M ij
Q-learning with temporal difference
Assigns values Q(a,i) to action-state pairs
 Utility U(i) = maxa Q(a,i)
 Update Q(a,i) after each observed transition from
state i to state j
Q(a,i) = Q(a,i) +  * (R(i) + maxa’ Q(a’,j) - Q(a,i))
action in state i = argmaxa Q(a,i)

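A minimal sketch of the Q-learning update above; the learning rate, exploration rate, and the example call are illustrative assumptions.

import random
from collections import defaultdict

alpha, epsilon = 0.1, 0.1
Q = defaultdict(float)                     # Q[(action, state)], initially 0

def choose_action(state, actions):
    if random.random() < epsilon:          # explore occasionally
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(a, state)])   # argmax_a Q(a, i)

def q_update(a, i, j, reward, actions):
    # Q(a,i) <- Q(a,i) + alpha * (R(i) + max_a' Q(a',j) - Q(a,i))
    best_next = max(Q[(a2, j)] for a2 in actions)
    Q[(a, i)] += alpha * (reward + best_next - Q[(a, i)])

q_update("right", (1, 1), (2, 1), -0.04, ["up", "down", "left", "right"])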
Decision-Theoretic Agent

Given
  Percept (sensor) information
Maintain
  Decision network with beliefs, actions and utilities
Do
  Update probabilities for current state
  Compute outcome probabilities for actions
  Select action with highest expected utility
  Return action
Decision-Theoretic Agent

Modeling sensors
Sensor Modeling

Combining evidence from multiple
sensors
Sensor Modeling

Detailed model of lane-position sensor
Dynamic Belief Network (DBN)



Reasoning over time
Unrolled network grows large over many time steps
But really only need two slices at a time
Dynamic Belief Network (DBN)
DBN for Lane Positioning
Dynamic Decision Network
(DDN)
DDN-based Agent

Capabilities





Handles uncertainty
Handles unexpected events (no fixed plan)
Handles noisy and failed sensors
Acts to obtain relevant information
Needs

Properties from first-order logic


DDNs are propositional
Goal directedness
Decision-Theoretic Agent

Assessment





Complete? No
Correct? No
Efficient? Better
Natural? Yes
Rational? Yes
Netica


www.norsys.com
Decision network simulator




Chance nodes
Decision nodes
Utility nodes
Learns probabilities from cases
Bob Scenario in Netica
Issues in Decision Making

Rational agent design





Dynamic decision-theoretic agent
Knowledge engineering effort
Efficiency vs. completeness
Monolithic vs. distributed intelligence
Degrees of autonomy