Chapter 19 Statistical Decision Theory

Framework for a Decision Problem

(i) The decision maker has available K possible courses of action: $a_1, a_2, \ldots, a_K$. Actions are sometimes called alternatives.
(ii) There are H possible uncertain states of nature: $s_1, s_2, \ldots, s_H$. States of nature are the possible outcomes over which the decision maker has no control. Sometimes states of nature are called events.
(iii) For each possible action-state of nature combination, there is an associated outcome representing either profit or loss, called the monetary payoff, $M_{ij}$, that corresponds to action $a_i$ and state of nature $s_j$. The table of all such outcomes for a decision problem is called a payoff table.

Payoff Table for a Decision Problem with K Possible Actions and H Possible States of Nature (Table 19.1)

                        STATES OF NATURE
ACTIONS    $s_1$       $s_2$       ...    $s_H$
$a_1$      $M_{11}$    $M_{12}$    ...    $M_{1H}$
$a_2$      $M_{21}$    $M_{22}$    ...    $M_{2H}$
...        ...         ...         ...    ...
$a_K$      $M_{K1}$    $M_{K2}$    ...    $M_{KH}$

Decision Rule Based on Maximin Criterion

Suppose that a decision maker has to choose from K admissible actions $a_1, a_2, \ldots, a_K$, given H possible states of nature $s_1, s_2, \ldots, s_H$. Let $M_{ij}$ denote the payoff corresponding to the ith action and jth state of nature. For each action, seek the smallest possible payoff. For action $a_1$, for example, this is the smallest of $M_{11}, M_{12}, \ldots, M_{1H}$. Let us denote this minimum $M_1^*$, where

$$M_1^* = \min(M_{11}, M_{12}, \ldots, M_{1H})$$

More generally, the smallest possible payoff for action $a_i$ is given by

$$M_i^* = \min(M_{i1}, M_{i2}, \ldots, M_{iH})$$

The maximin criterion then selects the action $a_i$ for which the corresponding $M_i^*$ is largest (that is, the action for which the minimum payoff is highest).

Regret or Opportunity Loss Table

Suppose that a payoff table is arranged as a rectangular array, with rows corresponding to actions and columns to states of nature. If each payoff in the table is subtracted from the largest payoff in its column, the resulting array is called a regret table, or opportunity loss table.
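The maximin rule and the regret table described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the payoff values and the function names `maximin` and `regret_table` are hypothetical and are not taken from the chapter.

```python
def maximin(payoffs):
    """Return (index of chosen action, its minimum payoff).

    payoffs[i][j] is the payoff M_ij for action i under state of nature j.
    """
    # Smallest possible payoff M_i* for each action (row).
    row_minima = [min(row) for row in payoffs]
    # Choose the action whose minimum payoff is largest.
    best = max(range(len(payoffs)), key=lambda i: row_minima[i])
    return best, row_minima[best]


def regret_table(payoffs):
    """Subtract each payoff from the largest payoff in its column."""
    col_maxima = [max(col) for col in zip(*payoffs)]
    return [[col_maxima[j] - m for j, m in enumerate(row)] for row in payoffs]


# Hypothetical payoff table: 3 actions (rows) by 3 states of nature (columns).
payoffs = [
    [70, 120, 200],
    [80, 120, 180],
    [100, 125, 160],
]

choice, guaranteed = maximin(payoffs)
print("maximin action index:", choice, "guaranteed payoff:", guaranteed)
for row in regret_table(payoffs):
    print(row)
```

With these made-up numbers the row minima are 70, 80, and 100, so the maximin rule picks the third action. The regret table has a zero in every column, marking the action that would have been best had that state been known in advance.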
Decision Rule Based on the Minimax Regret Criterion

Given the regret table, the action dictated by the minimax regret criterion is found as follows:
(i) For each row (action), find the maximum regret.
(ii) Choose the action corresponding to the minimum of these maximum regrets.
The minimax regret criterion selects the action for which the maximum regret is smallest; that is, it produces the smallest possible opportunity loss that can be guaranteed.

Payoffs with State-of-Nature Probabilities (Table 19.6)

                        STATES OF NATURE
ACTIONS    $s_1$ ($\pi_1$)    $s_2$ ($\pi_2$)    ...    $s_H$ ($\pi_H$)
$a_1$      $M_{11}$           $M_{12}$           ...    $M_{1H}$
$a_2$      $M_{21}$           $M_{22}$           ...    $M_{2H}$
...        ...                ...                ...    ...
$a_K$      $M_{K1}$           $M_{K2}$           ...    $M_{KH}$

Expected Monetary Value (EMV) Criterion

Suppose that a decision maker has K possible actions, $a_1, a_2, \ldots, a_K$, and is faced with H states of nature. Let $M_{ij}$ denote the payoff corresponding to the ith action and jth state, and $\pi_j$ the probability of occurrence of the jth state of nature, with

$$\sum_{j=1}^{H} \pi_j = 1$$

The expected monetary value of action $a_i$, EMV($a_i$), is

$$\mathrm{EMV}(a_i) = \pi_1 M_{i1} + \pi_2 M_{i2} + \cdots + \pi_H M_{iH} = \sum_{j=1}^{H} \pi_j M_{ij}$$

The Expected Monetary Value Criterion adopts the action with the largest expected monetary value; that is, given a choice among alternative actions, the EMV criterion dictates the choice of the action for which the EMV is highest.

Decision Trees

The tree diagram is a graphical device that forces the decision maker to examine all possible outcomes, including unfavorable ones. All decision trees contain:
Decision (or action) nodes
Event (or state-of-nature) nodes
Terminal nodes

[Figure 19.3: Decision tree with actions Process A, Process B, and Process C, each branching to states of nature]

Bayes’ Theorem

Let $s_1, s_2, \ldots, s_H$ be H mutually exclusive and collectively exhaustive events, corresponding to the H states of nature of a decision problem. Let A be some other event. Denote the conditional probability that $s_i$ will occur, given that A occurs, by $P(s_i|A)$, and the probability of A, given $s_i$, by $P(A|s_i)$.
Bayes’ Theorem states that the conditional probability of $s_i$, given A, can be expressed as

$$P(s_i|A) = \frac{P(A|s_i)\,P(s_i)}{P(A)} = \frac{P(A|s_i)\,P(s_i)}{P(A|s_1)P(s_1) + P(A|s_2)P(s_2) + \cdots + P(A|s_H)P(s_H)}$$

In the terminology of this section, $P(s_i)$ is the prior probability of $s_i$ and is modified to the posterior probability, $P(s_i|A)$, given the sample information that event A has occurred.

Expected Value of Perfect Information, EVPI

Suppose that a decision maker has to choose from among K possible actions, in the face of H states of nature, $s_1, s_2, \ldots, s_H$. Perfect information corresponds to knowledge of which state of nature will arise. The expected value of perfect information is obtained as follows:
(i) Determine which action will be chosen if only the prior probabilities $P(s_1), P(s_2), \ldots, P(s_H)$ are used.
(ii) For each possible state of nature, $s_i$, find the difference, $W_i$, between the payoff for the best choice of action, if it were known that state would arise, and the payoff for the action chosen if only prior probabilities are used. This is the value of perfect information, when it is known that $s_i$ will occur.
(iii) The expected value of perfect information, EVPI, is then

$$\mathrm{EVPI} = P(s_1)W_1 + P(s_2)W_2 + \cdots + P(s_H)W_H$$

Expected Value of Sample Information, EVSI

Suppose that a decision maker has to choose from among K possible actions, in the face of H states of nature, $s_1, s_2, \ldots, s_H$. The decision-maker may obtain sample information. Let there be M possible sample results, $A_1, A_2, \ldots, A_M$. The expected value of sample information is obtained as follows:
(i) Determine which action will be chosen if only the prior probabilities were used.
(ii) Determine the probabilities of obtaining each sample result:

$$P(A_i) = P(A_i|s_1)P(s_1) + P(A_i|s_2)P(s_2) + \cdots + P(A_i|s_H)P(s_H)$$

(iii) For each possible sample result, $A_i$, find the difference, $V_i$, between the expected monetary value for the optimal action and that for the action chosen if only the prior probabilities are used. This is the value of the sample information, given that $A_i$ was observed.
(iv) The expected value of sample information, EVSI, is then:

$$\mathrm{EVSI} = P(A_1)V_1 + P(A_2)V_2 + \cdots + P(A_M)V_M$$

Obtaining a Utility Function

Suppose that a decision maker may receive several alternative payoffs. The transformation from payoffs to utilities is obtained as follows:
(i) The units in which utility is measured are arbitrary. Accordingly, a scale can be fixed in any convenient fashion. Let L be the lowest and H the highest of all the payoffs. Assign utility 0 to payoff L and utility 100 to payoff H.
(ii) Let I be any payoff between L and H. Determine the probability $\pi$ such that the decision maker is indifferent between the following alternatives:
(a) Receive payoff I with certainty
(b) Receive payoff H with probability $\pi$ and payoff L with probability $(1 - \pi)$
(iii) The utility to the decision maker of payoff I is then $100\pi$. The curve relating utility to payoff is called a utility function.

[Figure 19.11: Utility Functions. Three panels plotting utility against payoff: (a) Risk aversion; (b) Preference for risk; (c) Indifference to risk]

The Expected Utility Criterion

Suppose that a decision maker has K possible actions, $a_1, a_2, \ldots, a_K$, and is faced with H states of nature. Let $U_{ij}$ denote the utility corresponding to the ith action and jth state, and $\pi_j$ the probability of occurrence of the jth state of nature.
Then the expected utility, EU($a_i$), of the action $a_i$ is

$$\mathrm{EU}(a_i) = \pi_1 U_{i1} + \pi_2 U_{i2} + \cdots + \pi_H U_{iH} = \sum_{j=1}^{H} \pi_j U_{ij}$$

Given a choice between alternative actions, the expected utility criterion dictates the choice of the action for which expected utility is highest. Under generally reasonable assumptions, it can be shown that the rational decision maker should adopt this criterion. If the decision maker is indifferent to risk, the expected utility criterion and expected monetary value criterion are equivalent.

Key Words

Action, Admissible Action, Aversion to Risk, Bayes’ Theorem, Decision Nodes, Decision Trees, EMV, Event Nodes, EVPI, EVSI, Expected Monetary Value Criterion, Expected Net Value of Sample Information, Expected Utility Criterion, Inadmissible Action, Indifference to Risk, Maximin Criterion, Minimax Regret Criterion, Opportunity Loss Table, Payoff Table, Perfect Information, Preference for Risk, Regret Table, Sensitivity Analysis, States of Nature, Terminal Nodes, Tree Plan, Utility Function, Value of Perfect Information, Value of Sample Information
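As a closing illustration, the EMV criterion, the EVPI steps, and Bayes’ Theorem as defined in this chapter can be sketched together in Python. Everything concrete here is an assumption made for the example: the payoff table, the prior probabilities, the likelihoods $P(A|s_j)$, and the function names `emv`, `evpi`, and `posterior` are hypothetical.

```python
def emv(payoffs, priors):
    """Expected monetary value of each action: sum over j of pi_j * M_ij."""
    return [sum(p * m for p, m in zip(priors, row)) for row in payoffs]


def evpi(payoffs, priors):
    """Expected value of perfect information, following steps (i)-(iii)."""
    values = emv(payoffs, priors)
    # (i) Action chosen when only the prior probabilities are used.
    chosen = max(range(len(payoffs)), key=lambda i: values[i])
    # (ii) W_j: best payoff in state j minus the chosen action's payoff.
    col_maxima = [max(col) for col in zip(*payoffs)]
    gains = [col_maxima[j] - payoffs[chosen][j] for j in range(len(priors))]
    # (iii) Weight each W_j by the prior probability of its state.
    return sum(p * w for p, w in zip(priors, gains))


def posterior(priors, likelihoods):
    """Bayes' Theorem: P(s_j | A) from priors P(s_j) and likelihoods P(A | s_j)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    p_a = sum(joint)  # P(A), the denominator of Bayes' Theorem
    return [x / p_a for x in joint]


# Hypothetical inputs, chosen only for illustration.
payoffs = [
    [70, 120, 200],
    [80, 120, 180],
    [100, 125, 160],
]
priors = [0.1, 0.5, 0.4]       # P(s_1), P(s_2), P(s_3)
likelihoods = [0.6, 0.3, 0.1]  # P(A | s_1), P(A | s_2), P(A | s_3)

print("EMV per action:", emv(payoffs, priors))
print("EVPI:", evpi(payoffs, priors))
print("posterior:", posterior(priors, likelihoods))
```

Under these assumed numbers the first action has the highest EMV (about 147), the EVPI is about 5.5, and observing A shifts the probabilities to roughly 0.24, 0.60, and 0.16. Substituting the posterior for the prior in `emv` gives the revised expected values that the EVSI steps compare against the prior-only choice.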