ELEC6111: Detection and Estimation Theory

Minimax Hypothesis Testing

In deriving the Bayes decision rule, we assumed that we know both the a priori probabilities $\pi_0$ and $\pi_1$ and the likelihoods $p(y|H_0)$ and $p(y|H_1)$. This means we know both the mechanism generating the state of nature and the mechanism affecting our observations (our measurements of the state of nature). It is, however, possible that we do not have access to all of this information. For example, we may not know the a priori probabilities. In such a case, the Bayes decision rule is not a good rule, since it can only be derived for a given a priori probability. An alternative to Bayes hypothesis testing in this case is minimax hypothesis testing.

The minimax decision rule minimizes the maximum possible risk, i.e., it minimizes
$$\max\{R_0(\delta),\, R_1(\delta)\}$$
over all decision rules $\delta$.

Let us look at $r(\pi_0, \delta)$, the overall risk of a decision rule $\delta$ when the a priori probability is $\pi_0 \in [0,1]$. It can be written as
$$r(\pi_0, \delta) = \pi_0 R_0(\delta) + (1 - \pi_0) R_1(\delta).$$

Note that, for a given decision rule $\delta$, as $\pi_0$ varies from 0 to 1, $r(\pi_0, \delta)$ goes linearly from $R_1(\delta) = r(0, \delta)$ to $R_0(\delta) = r(1, \delta)$. Therefore, for a given decision rule $\delta$, the maximum value of $r(\pi_0, \delta)$ as $\pi_0$ varies over the interval $[0,1]$ occurs either at $\pi_0 = 0$ or at $\pi_0 = 1$, and it is $\max\{R_0(\delta), R_1(\delta)\}$. So minimizing $\max\{R_0(\delta), R_1(\delta)\}$ is equivalent to minimizing $\max_{0 \le \pi_0 \le 1} r(\pi_0, \delta)$. Thus, the minimax decision rule solves
$$\min_\delta \max_{0 \le \pi_0 \le 1} r(\pi_0, \delta).$$

Let $\delta_{\pi_0}$ denote the optimum (Bayes) decision rule for the a priori probability $\pi_0$, and denote the corresponding minimum Bayes risk by $V(\pi_0)$, i.e., $V(\pi_0) = r(\pi_0, \delta_{\pi_0})$. It is easy to show that $V(\pi_0)$ is a continuous concave function of $\pi_0$ on $[0,1]$, with end points $V(0) = C_{11}$ and $V(1) = C_{00}$.

[Figure: a typical graph of $V(\pi_0)$ and $r(\pi_0, \delta)$.]
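As a numerical sanity check, the sketch below evaluates $V(\pi_0)$ for the Gaussian-measurement example treated later in these slides, and spot-checks the end-point and concavity properties just stated. The parameter values $\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$ and uniform costs ($C_{00} = C_{11} = 0$, $C_{01} = C_{10} = 1$) are illustrative choices, not values from the slides.

```python
from math import erfc, log, sqrt

def Q(x):
    """Gaussian tail function Q(x) = P(N(0,1) > x)."""
    return 0.5 * erfc(x / sqrt(2.0))

# Illustrative parameters (not from the slides); uniform costs C00 = C11 = 0.
mu0, mu1, sigma = 0.0, 1.0, 1.0

def V(pi0):
    """Minimum Bayes risk V(pi0) for the Gaussian-measurement example."""
    if pi0 <= 0.0 or pi0 >= 1.0:
        return 0.0  # V(0) = C11 = 0 and V(1) = C00 = 0 under uniform costs
    tau = (sigma**2 / (mu1 - mu0)) * log(pi0 / (1 - pi0)) + (mu0 + mu1) / 2
    R0 = Q((tau - mu0) / sigma)        # P(decide H1 | H0)
    R1 = 1 - Q((tau - mu1) / sigma)    # P(decide H0 | H1)
    return pi0 * R0 + (1 - pi0) * R1

# End-point and concavity spot checks
assert V(0.0) == 0.0 and V(1.0) == 0.0
a, b = 0.2, 0.8
assert V((a + b) / 2) >= (V(a) + V(b)) / 2  # midpoint lies above the chord
```

Plotting `V(pi0)` on a grid of priors reproduces the concave curve sketched in the figure.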
Let us draw a tangent to $V(\pi_0)$ parallel to $r(\pi_0, \delta)$, and denote this line by $r(\pi_0, \delta_{\pi_0'})$. Since $r(\pi_0, \delta_{\pi_0'})$ lies entirely below the line $r(\pi_0, \delta)$, it has a lower maximum than $r(\pi_0, \delta)$. Also note that since it touches $V(\pi_0)$ at $\pi_0 = \pi_0'$, the rule $\delta_{\pi_0'}$ is the minimum-risk (Bayes) rule for the a priori probability $\pi_0'$. Since for any $\pi_0 \in [0,1]$ we can draw a tangent to $V(\pi_0)$ and obtain the minimum-risk rule as a Bayes rule, it is clear that the minimax decision rule is the Bayes rule for the value of $\pi_0$ that maximizes $V(\pi_0)$. Denoting this point by $\pi_L$, we note that at this point
$$\max\{R_0(\delta_{\pi_L}),\, R_1(\delta_{\pi_L})\} = R_0(\delta_{\pi_L}) = R_1(\delta_{\pi_L}).$$

Proposition: The Minimax Test

Let $\pi_L$ be an a priori probability that maximizes $V(\pi_0)$ and such that either $\pi_L = 0$, or $\pi_L = 1$, or $R_0(\delta_{\pi_L}) = R_1(\delta_{\pi_L})$. Then $\delta_{\pi_L}$ is a minimax rule.

Proof

Let $R_0(\delta_{\pi_L}) = R_1(\delta_{\pi_L})$; then $r(\pi_0, \delta_{\pi_L})$ does not depend on $\pi_0$, so for any $\pi_0$ we have
$$\max_{0 \le \pi_0 \le 1} \min_\delta r(\pi_0, \delta) \ge r(\pi_L, \delta_{\pi_L}) = r(\pi_0, \delta_{\pi_L}).$$
So we have
$$\max_{0 \le \pi_0 \le 1} \min_\delta r(\pi_0, \delta) \ge \max_{0 \le \pi_0 \le 1} r(\pi_0, \delta_{\pi_L}) \ge \min_\delta \max_{0 \le \pi_0 \le 1} r(\pi_0, \delta).$$
Also, for each $\delta$ we have
$$\max_{0 \le \pi_0 \le 1} r(\pi_0, \delta) \ge \max_{0 \le \pi_0 \le 1} \min_{\delta'} r(\pi_0, \delta').$$
This implies that
$$\min_\delta \max_{0 \le \pi_0 \le 1} r(\pi_0, \delta) \ge \max_{0 \le \pi_0 \le 1} \min_\delta r(\pi_0, \delta).$$
Combining the two inequalities, we get
$$\min_\delta \max_{0 \le \pi_0 \le 1} r(\pi_0, \delta) = \max_{0 \le \pi_0 \le 1} \min_\delta r(\pi_0, \delta).$$
Therefore
$$r(\pi_L, \delta_{\pi_L}) = \min_\delta \max_{0 \le \pi_0 \le 1} r(\pi_0, \delta);$$
that is, $\delta_{\pi_L}$ is the minimax rule.

Discussion

By definition, $V(\pi_0') = r(\pi_0', \delta_{\pi_0'})$. So, for every $\pi_0 \in [0,1]$, we have $r(\pi_0, \delta_{\pi_0'}) \ge V(\pi_0)$ and $r(\pi_0', \delta_{\pi_0'}) = V(\pi_0')$. Since $r(\pi_0, \delta_{\pi_0'})$, as a function of $\pi_0$, is a straight line, it has to be tangent to $V(\pi_0)$ at $\pi_0 = \pi_0'$. If $V(\pi_0)$ is differentiable at $\pi_0'$, we have
$$V'(\pi_0') = \frac{d\, r(\pi_0, \delta_{\pi_0'})}{d\pi_0} = R_0(\delta_{\pi_0'}) - R_1(\delta_{\pi_0'}).$$
Now consider the case where $V(\pi_0)$ has an interior maximum but is not differentiable at that point.
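The tangent relation $V'(\pi_0') = R_0(\delta_{\pi_0'}) - R_1(\delta_{\pi_0'})$ can be verified numerically for the Gaussian-measurement example. The sketch below (illustrative parameters $\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$, uniform costs, not values from the slides) compares a central finite difference of $V$ against $R_0 - R_1$ at an interior prior.

```python
from math import erfc, log, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2.0))

# Illustrative parameters (not from the slides); uniform costs assumed.
mu0, mu1, sigma = 0.0, 1.0, 1.0

def bayes_threshold(pi0):
    # Threshold on y of the Bayes rule for prior pi0 (Gaussian example)
    return (sigma**2 / (mu1 - mu0)) * log(pi0 / (1 - pi0)) + (mu0 + mu1) / 2

def cond_risks(tau):
    # R0 = P(decide H1 | H0), R1 = P(decide H0 | H1)
    return Q((tau - mu0) / sigma), 1 - Q((tau - mu1) / sigma)

def V(pi0):
    R0, R1 = cond_risks(bayes_threshold(pi0))
    return pi0 * R0 + (1 - pi0) * R1

pi0, h = 0.3, 1e-6
slope = (V(pi0 + h) - V(pi0 - h)) / (2 * h)   # central difference of V
R0, R1 = cond_risks(bayes_threshold(pi0))
print(slope, R0 - R1)  # the two numbers should agree
```

The agreement is an instance of the envelope theorem: at the optimum, the sensitivity of $V$ to $\pi_0$ equals the partial derivative of $r(\pi_0, \delta)$ with $\delta$ held fixed.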
In this case we define two decision rules $\delta_{\pi_L^-} = \lim_{\pi_0 \uparrow \pi_L} \delta_{\pi_0}$ and $\delta_{\pi_L^+} = \lim_{\pi_0 \downarrow \pi_L} \delta_{\pi_0}$. The critical regions of these two decision rules are
$$\Gamma_1^- = \{\, y \mid (1-\pi_L)(C_{01} - C_{11})\, p(y|H_1) \ge \pi_L (C_{10} - C_{00})\, p(y|H_0) \,\}$$
and
$$\Gamma_1^+ = \{\, y \mid (1-\pi_L)(C_{01} - C_{11})\, p(y|H_1) > \pi_L (C_{10} - C_{00})\, p(y|H_0) \,\}.$$
Take a number $q \in [0,1]$ and devise a decision rule $\tilde\delta_L$ that uses the decision rule $\delta_{\pi_L^-}$ with probability $q$ and uses $\delta_{\pi_L^+}$ with probability $1-q$. That is, it decides $H_1$ if $y \in \Gamma_1^+$, decides $H_0$ if $y \in (\Gamma_1^-)^c$, and decides $H_1$ with probability $q$ if $y$ is on the boundary $\Gamma_1^- \setminus \Gamma_1^+$.

Discussion

Note that the Bayes risk is not a function of $q$, so $r(\pi_L, \tilde\delta_L) = V(\pi_L)$, but the conditional risks depend on $q$:
$$R_j(\tilde\delta_L) = q\, R_j(\delta_{\pi_L^-}) + (1-q)\, R_j(\delta_{\pi_L^+}), \qquad j = 0, 1.$$
To achieve $R_0(\tilde\delta_L) = R_1(\tilde\delta_L)$, we need to choose
$$q = \frac{R_0(\delta_{\pi_L^+}) - R_1(\delta_{\pi_L^+})}{R_0(\delta_{\pi_L^+}) - R_1(\delta_{\pi_L^+}) + R_1(\delta_{\pi_L^-}) - R_0(\delta_{\pi_L^-})}.$$
Note that $V'(\pi_L^\pm) = R_0(\delta_{\pi_L^\pm}) - R_1(\delta_{\pi_L^\pm})$, so we have
$$q = \frac{V'(\pi_L^+)}{V'(\pi_L^+) - V'(\pi_L^-)}.$$
This is called a randomized decision rule.

Example: Measurement with Gaussian Error

Consider the measurement with Gaussian error with uniform costs. The function $V(\pi_0)$ can be written as
$$V(\pi_0) = \pi_0\, Q\!\left(\frac{\tau' - \mu_0}{\sigma}\right) + (1-\pi_0)\left(1 - Q\!\left(\frac{\tau' - \mu_1}{\sigma}\right)\right),$$
with
$$\tau' = \frac{\sigma^2}{\mu_1 - \mu_0} \log\!\left(\frac{\pi_0}{1-\pi_0}\right) + \frac{\mu_0 + \mu_1}{2}.$$
We can find the rule making the conditional risks $R_0(\delta)$ and $R_1(\delta)$ equal by letting
$$Q\!\left(\frac{\tau - \mu_0}{\sigma}\right) = 1 - Q\!\left(\frac{\tau - \mu_1}{\sigma}\right)$$
and solving for $\tau$. We can solve this by inspection and get $\tau_L = (\mu_0 + \mu_1)/2$. So the minimax decision rule is
$$\delta_L(y) = \begin{cases} 1 & \text{if } y \ge (\mu_0 + \mu_1)/2 \\ 0 & \text{if } y < (\mu_0 + \mu_1)/2. \end{cases}$$

[Figure: conditional risks for measurement with Gaussian error.]

Neyman-Pearson Hypothesis Testing

In Bayes hypothesis testing, as well as in minimax, we are concerned with the average risk, i.e., the conditional risk averaged over the two hypotheses. The Neyman-Pearson test, on the other hand, recognizes the asymmetry between the two hypotheses.
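The equalizer property of the minimax threshold $\tau_L = (\mu_0 + \mu_1)/2$ can be checked directly: both conditional risks reduce to $Q\!\left((\mu_1 - \mu_0)/(2\sigma)\right)$. A minimal check, with made-up parameter values:

```python
from math import erfc, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2.0))

# Illustrative parameters (not from the slides)
mu0, mu1, sigma = 0.0, 2.0, 1.0
tau_L = (mu0 + mu1) / 2              # candidate minimax threshold

R0 = Q((tau_L - mu0) / sigma)        # false-alarm probability under H0
R1 = 1 - Q((tau_L - mu1) / sigma)    # miss probability under H1
assert abs(R0 - R1) < 1e-12          # equalizer condition R0 = R1
print(R0)                            # both equal Q((mu1 - mu0)/(2*sigma))
```

Shifting `tau_L` in either direction raises one conditional risk above the other, so the maximum of the two increases, consistent with the minimax characterization.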
It tries to minimize one of the two conditional risks with the other conditional risk fixed (or bounded). In testing the two hypotheses $H_0$ and $H_1$, the following situations may arise:

- $H_0$ is true but $H_1$ is decided. This is called a type I error or a false alarm. The name comes from the radar application, where $H_0$ represents "no target" and $H_1$ the case of "target present". The probability of this event is called the false-alarm probability (or false-alarm rate) and is denoted $P_F(\delta)$.
- $H_1$ is true but $H_0$ is decided. This is called a type II error or a miss. The probability of this event is called the miss probability and is denoted $P_M(\delta)$.
- $H_0$ is true and $H_0$ is decided. The probability of this event is $1 - P_F(\delta)$.
- $H_1$ is true and $H_1$ is decided. This case represents a detection. The detection probability is $P_D(\delta) = 1 - P_M(\delta)$.

In testing $H_0$ versus $H_1$, one has to trade off the probabilities of the two types of errors. The Neyman-Pearson criterion makes this tradeoff by bounding the false-alarm probability and minimizing the miss probability subject to this constraint, i.e., the Neyman-Pearson test is
$$\max_\delta P_D(\delta) \quad \text{subject to} \quad P_F(\delta) \le \alpha,$$
where $\alpha$ is the bound on the false-alarm rate; it is called the level of the test.

For obtaining a general solution to the Neyman-Pearson test, we need to define a randomized decision rule. We define the randomized test
$$\tilde\delta_L(y) = \begin{cases} 1 & \text{if } L(y) > \eta_L \\ q & \text{if } L(y) = \eta_L \\ 0 & \text{if } L(y) < \eta_L, \end{cases}$$
where $\eta_L$ is the threshold corresponding to $\pi_L$. While in a non-randomized rule $\delta(y)$ gives the decision, in a randomized rule $\tilde\delta_L(y)$ gives the probability of deciding $H_1$. Then we have
$$P_F(\tilde\delta) = E_0\{\tilde\delta(Y)\} = \int \tilde\delta(y)\, p(y|H_0)\, dy,$$
where $E_0\{\cdot\}$ denotes expectation under hypothesis $H_0$. Also,
$$P_D(\tilde\delta) = E_1\{\tilde\delta(Y)\} = \int \tilde\delta(y)\, p(y|H_1)\, dy.$$

Neyman-Pearson Lemma

Consider a hypothesis pair $H_0$ and $H_1$:
$$H_0: Y \sim P_0 \quad \text{versus} \quad H_1: Y \sim P_1,$$
where $P_j$ has density $p_j(y) = p(y|H_j)$ for $j = 0, 1$.
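Since $P_F$ and $P_D$ are expectations of $\tilde\delta(Y)$ under $H_0$ and $H_1$, they can be estimated by Monte Carlo. A sketch with made-up parameters, using a deterministic threshold rule $\delta(y) = \mathbf{1}\{y \ge \tau\}$ (for continuous $Y$ the boundary $\{y = \tau\}$ has probability zero, so the randomization value $q$ never fires):

```python
import random
from math import erfc, sqrt

random.seed(0)  # reproducible illustration

def Q(x):
    return 0.5 * erfc(x / sqrt(2.0))

# Illustrative parameters (not from the slides)
mu0, mu1, sigma, tau = 0.0, 1.0, 1.0, 0.8
N = 200_000

# Empirical E0{delta(Y)} and E1{delta(Y)} for delta(y) = 1{y >= tau}
pf_hat = sum(random.gauss(mu0, sigma) >= tau for _ in range(N)) / N
pd_hat = sum(random.gauss(mu1, sigma) >= tau for _ in range(N)) / N

print(pf_hat, Q((tau - mu0) / sigma))  # estimate vs. exact PF
print(pd_hat, Q((tau - mu1) / sigma))  # estimate vs. exact PD
```

With $N = 200{,}000$ samples the estimates agree with the exact Gaussian tail probabilities to within a few thousandths.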
For $\alpha \ge 0$, the following statements are true:

1. Optimality: Let $\tilde\delta$ be any decision rule satisfying $P_F(\tilde\delta) \le \alpha$, and let $\tilde\delta'$ be any decision rule of the form
$$\tilde\delta'(y) = \begin{cases} 1 & \text{if } p(y|H_1) > \eta\, p(y|H_0) \\ \gamma(y) & \text{if } p(y|H_1) = \eta\, p(y|H_0) \\ 0 & \text{if } p(y|H_1) < \eta\, p(y|H_0), \end{cases} \qquad (A)$$
where $\eta \ge 0$ and $0 \le \gamma(y) \le 1$ are such that $P_F(\tilde\delta') = \alpha$. Then $P_D(\tilde\delta') \ge P_D(\tilde\delta)$. This means that any size-$\alpha$ decision rule of form (A) is a Neyman-Pearson rule.

2. Existence: For any $\alpha \in (0,1)$ there is a decision rule $\tilde\delta_{NP}$ of form (A), with $\gamma(y) = \gamma_0$ a constant, for which $P_F(\tilde\delta_{NP}) = \alpha$.

3. Uniqueness: Suppose that $\tilde\delta''$ is any size-$\alpha$ Neyman-Pearson rule for $H_0$ versus $H_1$. Then $\tilde\delta''$ must be of the form (A).

Neyman-Pearson Lemma (Proof)

1. Note that, by definition, we always have $[\tilde\delta'(y) - \tilde\delta(y)][p(y|H_1) - \eta\, p(y|H_0)] \ge 0$ (why?). So, we have
$$\int [\tilde\delta'(y) - \tilde\delta(y)][p(y|H_1) - \eta\, p(y|H_0)]\, dy \ge 0.$$
Expanding the above expression, we get
$$\int \tilde\delta'(y)\, p(y|H_1)\, dy - \int \tilde\delta(y)\, p(y|H_1)\, dy \;\ge\; \eta \left[ \int \tilde\delta'(y)\, p(y|H_0)\, dy - \int \tilde\delta(y)\, p(y|H_0)\, dy \right].$$
Applying the expressions for the detection probability and the false-alarm rate, we have
$$P_D(\tilde\delta') - P_D(\tilde\delta) \;\ge\; \eta\, [P_F(\tilde\delta') - P_F(\tilde\delta)] \;=\; \eta\, [\alpha - P_F(\tilde\delta)] \;\ge\; 0.$$

2. Let $\eta_0$ be the smallest number such that (see the figure on the next slide)
$$P_0[\, p(Y|H_1) > \eta_0\, p(Y|H_0) \,] \le \alpha.$$
Then, if $P_0[\, p(Y|H_1) > \eta_0\, p(Y|H_0) \,] < \alpha$, choose
$$\gamma_0 = \frac{\alpha - P_0[\, p(Y|H_1) > \eta_0\, p(Y|H_0) \,]}{P_0[\, p(Y|H_1) = \eta_0\, p(Y|H_0) \,]};$$
otherwise, choose $\gamma_0$ arbitrarily. Consider a Neyman-Pearson decision rule $\tilde\delta_{NP}$ of form (A) with $\eta = \eta_0$ and $\gamma(y) = \gamma_0$. For this decision rule, the false-alarm rate is
$$P_F(\tilde\delta_{NP}) = E_0\{\tilde\delta_{NP}(Y)\} = P_0[\, p(Y|H_1) > \eta_0\, p(Y|H_0) \,] + \gamma_0\, P_0[\, p(Y|H_1) = \eta_0\, p(Y|H_0) \,] = \alpha.$$

3. See the text.

Neyman-Pearson Lemma (Example): Measurement with Gaussian Error

For this problem, we have
$$P_0[\, p(Y|H_1) \ge \eta\, p(Y|H_0) \,] = P_0[L(Y) \ge \eta] = P_0(Y \ge \tau) = Q\!\left(\frac{\tau - \mu_0}{\sigma}\right),$$
where
$$\tau = \frac{\sigma^2}{\mu_1 - \mu_0} \log \eta + \frac{\mu_0 + \mu_1}{2}.$$
Any value of $\alpha$ can be achieved by choosing
$$\tau_0 = \sigma\, Q^{-1}(\alpha) + \mu_0.$$
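The existence argument is easiest to see in a discrete example, where $P_0[L(Y) > \eta]$ jumps and randomization is genuinely needed to hit the level $\alpha$ exactly. The alphabet and distributions below are made up for illustration:

```python
# Discrete example: Y takes values in {0, 1, 2}. Distributions are
# illustration values, not from the slides.
ys = [0, 1, 2]
p0 = {0: 0.5, 1: 0.3, 2: 0.2}
p1 = {0: 0.2, 1: 0.3, 2: 0.5}
alpha = 0.1

L = {y: p1[y] / p0[y] for y in ys}   # likelihood ratios

def P0_L_gt(eta):
    """P0[L(Y) > eta]."""
    return sum(p0[y] for y in ys if L[y] > eta)

# eta0: smallest eta with P0[L(Y) > eta] <= alpha
# (it suffices to search over the attained ratio values)
eta0 = min(eta for eta in sorted(set(L.values())) if P0_L_gt(eta) <= alpha)

mass_at = sum(p0[y] for y in ys if L[y] == eta0)   # P0[L(Y) = eta0]
gamma0 = (alpha - P0_L_gt(eta0)) / mass_at if mass_at > 0 else 0.0

PF = P0_L_gt(eta0) + gamma0 * mass_at   # exact size alpha
PD = (sum(p1[y] for y in ys if L[y] > eta0)
      + gamma0 * sum(p1[y] for y in ys if L[y] == eta0))
print(eta0, gamma0, PF, PD)
```

Here no deterministic threshold gives $P_F$ exactly $0.1$ (the attainable sizes jump from $0$ to $0.2$), so the rule flips a $\gamma_0$-biased coin on the boundary set, exactly as in part 2 of the lemma.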
Since $P(Y = \tau_0) = 0$, the choice of $\gamma_0$ is arbitrary and we can choose $\gamma_0 = 1$. So we have
$$\tilde\delta_{NP}(y) = \begin{cases} 1 & \text{if } y \ge \tau_0 \\ 0 & \text{if } y < \tau_0. \end{cases}$$

The detection probability for $\tilde\delta_{NP}$ is
$$P_D(\tilde\delta_{NP}) = E_1\{\tilde\delta_{NP}(Y)\} = P_1(Y \ge \tau_0) = Q\!\left(\frac{\tau_0 - \mu_1}{\sigma}\right) = Q\!\left(Q^{-1}(\alpha) + \frac{\mu_0 - \mu_1}{\sigma}\right) = Q\!\left(Q^{-1}(\alpha) - d\right),$$
where $d = (\mu_1 - \mu_0)/\sigma$.
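The closed form $P_D = Q(Q^{-1}(\alpha) - d)$ is easy to check numerically. The sketch below uses the standard-library `statistics.NormalDist` for $Q^{-1}$; the parameter values are illustrative, not from the slides.

```python
from math import erfc, sqrt
from statistics import NormalDist

def Q(x):
    return 0.5 * erfc(x / sqrt(2.0))

def Qinv(a):
    # Q^{-1}(a): inverse Gaussian tail, via the standard-normal inverse CDF
    return NormalDist().inv_cdf(1.0 - a)

# Illustrative parameters (not from the slides)
mu0, mu1, sigma, alpha = 0.0, 1.0, 1.0, 0.05
d = (mu1 - mu0) / sigma                 # detectability index

tau0 = sigma * Qinv(alpha) + mu0        # threshold achieving PF = alpha
PF = Q((tau0 - mu0) / sigma)
PD = Q((tau0 - mu1) / sigma)

assert abs(PF - alpha) < 1e-9           # size is exactly alpha
assert abs(PD - Q(Qinv(alpha) - d)) < 1e-12  # matches the closed form
print(PD)
```

Sweeping `alpha` over $(0, 1)$ and plotting `PD` against it traces the ROC curve of the Neyman-Pearson test; larger $d$ pushes the curve toward the upper-left corner.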