ELEC6111: Detection and Estimation Theory
Minimax Hypothesis Testing
In deriving the Bayes decision rule, we assumed that we know both the a priori probabilities $\pi_0$ and $\pi_1$ as well as the likelihoods $p(y \mid H_0)$ and $p(y \mid H_1)$. This means that we have knowledge of both the mechanism generating the state of nature and the mechanism affecting our observations (our measurements of the state of nature). It is, however, possible that we may not have access to all of this information. For example, we may not know the a priori probabilities. In such a case, the Bayes decision rule is not a good rule, since it can only be derived for a given a priori probability.
An alternative to Bayes hypothesis testing in this case is minimax hypothesis testing. The minimax decision rule minimizes the maximum possible risk, i.e., it minimizes
$$\max\{R_0(\delta),\, R_1(\delta)\}$$
over all $\delta$.
Let us look at $r(\pi_0, \delta)$, i.e., the overall risk for a decision rule $\delta$ when the a priori probability is $\pi_0 \in [0, 1]$. It can be written as
$$r(\pi_0, \delta) = \pi_0\, R_0(\delta) + (1 - \pi_0)\, R_1(\delta).$$
Note that, for a given decision rule $\delta$, as $\pi_0$ varies from 0 to 1, $r(\pi_0, \delta)$ goes linearly from $R_1(\delta) = r(0, \delta)$ to $R_0(\delta) = r(1, \delta)$. Therefore, for a given decision rule $\delta$, the maximum value of $r(\pi_0, \delta)$ as $\pi_0$ varies over the interval $[0, 1]$ occurs either at $\pi_0 = 0$ or at $\pi_0 = 1$, and it equals $\max\{R_0(\delta), R_1(\delta)\}$.
So minimizing $\max\{R_0(\delta), R_1(\delta)\}$ is equivalent to minimizing $\max_{0 \le \pi_0 \le 1} r(\pi_0, \delta)$.
Thus, the minimax decision rule solves
$$\min_{\delta}\, \max_{0 \le \pi_0 \le 1}\, r(\pi_0, \delta).$$
Let $\delta_{\pi_0}$ denote the optimum (Bayes) decision rule for the a priori probability $\pi_0$, and denote the corresponding minimum Bayes risk by $V(\pi_0)$, i.e., $V(\pi_0) = r(\pi_0, \delta_{\pi_0})$.
It is easy to show that $V(\pi_0)$ is a continuous concave function of $\pi_0$ for $\pi_0 \in [0, 1]$ and has the end points $V(0) = C_{11}$ and $V(1) = C_{00}$.
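As a minimal numerical sketch (my addition, not from the slides), the following Python code traces $V(\pi_0)$ for a Gaussian measurement model with uniform costs; the means and variance are hypothetical. It illustrates the concavity of $V$ and its interior maximum.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical Gaussian measurement model: Y ~ N(mu_j, sigma^2) under H_j,
# with uniform costs (C00 = C11 = 0, C01 = C10 = 1).
mu0, mu1, sigma = 0.0, 1.0, 1.0

def V(pi0):
    """Minimum Bayes risk: plug the Bayes threshold into r(pi0, delta)."""
    # Bayes threshold for this model (see the Gaussian example later on).
    tau = sigma**2 / (mu1 - mu0) * np.log(pi0 / (1 - pi0)) + (mu0 + mu1) / 2
    R0 = norm.sf(tau, loc=mu0, scale=sigma)   # false-alarm probability
    R1 = norm.cdf(tau, loc=mu1, scale=sigma)  # miss probability
    return pi0 * R0 + (1 - pi0) * R1

pi = np.linspace(0.01, 0.99, 99)
v = np.array([V(p) for p in pi])
print("V is concave with an interior maximum at pi0 =", pi[np.argmax(v)])
```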
The following figure shows a typical graph of $V(\pi_0)$ and $r(\pi_0, \delta)$. Let us draw the tangent to $V(\pi_0)$ that is parallel to $r(\pi_0, \delta)$, touching $V$ at some point $\pi_0'$. Denote this line by $r(\pi_0, \delta_{\pi_0'})$.
[Figure: a typical minimum Bayes risk curve $V(\pi_0)$ with the risk line $r(\pi_0, \delta)$ and the parallel tangent $r(\pi_0, \delta_{\pi_0'})$.]
Since $r(\pi_0, \delta_{\pi_0'})$ lies entirely below the line $r(\pi_0, \delta)$, it has a lower maximum than $r(\pi_0, \delta)$. Also note that, since it touches $V(\pi_0)$ at $\pi_0 = \pi_0'$, $\delta_{\pi_0'}$ is the minimum-risk (Bayes) rule for the a priori probability $\pi_0'$. Since for any $\pi_0 \in [0, 1]$ we can draw a tangent to $V(\pi_0)$ and obtain the minimum-risk rule as a Bayes rule, it is clear that the minimax decision rule is the Bayes rule for the value of $\pi_0$ that maximizes $V(\pi_0)$. Denoting this point by $\pi_L$, we note that at this point
$$\max\{R_0(\delta_{\pi_L}),\, R_1(\delta_{\pi_L})\} = R_0(\delta_{\pi_L}) = R_1(\delta_{\pi_L}).$$
Proposition: The Minimax Test
Let $\pi_L$ be an a priori probability that maximizes $V(\pi_0)$ and is such that either $\pi_L = 0$, or $\pi_L = 1$, or
$$R_0(\delta_{\pi_L}) = R_1(\delta_{\pi_L});$$
then $\delta_{\pi_L}$ is a minimax rule.
Proof
Let $R_0(\delta_{\pi_L}) = R_1(\delta_{\pi_L})$. Then $r(\pi_0, \delta_{\pi_L})$ does not depend on $\pi_0$, and since $\min_\delta r(\pi_0, \delta) = V(\pi_0)$ is maximized at $\pi_L$, for any $\pi_0$ we have
$$\max_{0 \le \pi_0 \le 1}\, \min_{\delta}\, r(\pi_0, \delta) = r(\pi_L, \delta_{\pi_L}) = r(\pi_0, \delta_{\pi_L}).$$
So we have
$$\max_{0 \le \pi_0 \le 1}\, \min_{\delta}\, r(\pi_0, \delta) = \max_{0 \le \pi_0 \le 1}\, r(\pi_0, \delta_{\pi_L}) \ge \min_{\delta}\, \max_{0 \le \pi_0 \le 1}\, r(\pi_0, \delta).$$
Also, for each $\delta$ we have
$$\max_{0 \le \pi_0 \le 1}\, r(\pi_0, \delta) \ge \max_{0 \le \pi_0 \le 1}\, \min_{\delta'}\, r(\pi_0, \delta').$$
This implies that
$$\min_{\delta}\, \max_{0 \le \pi_0 \le 1}\, r(\pi_0, \delta) \ge \max_{0 \le \pi_0 \le 1}\, \min_{\delta}\, r(\pi_0, \delta).$$
Combining the two inequalities, we get
$$\min_{\delta}\, \max_{0 \le \pi_0 \le 1}\, r(\pi_0, \delta) = \max_{0 \le \pi_0 \le 1}\, \min_{\delta}\, r(\pi_0, \delta).$$
Therefore,
$$r(\pi_L, \delta_{\pi_L}) = \min_{\delta}\, \max_{0 \le \pi_0 \le 1}\, r(\pi_0, \delta),$$
that is, $\delta_{\pi_L}$ is the minimax rule.
Discussion
By definition, $V(\pi_0) = r(\pi_0, \delta_{\pi_0})$. So, for every $\pi_0' \in [0, 1]$, we have $r(\pi_0, \delta_{\pi_0'}) \ge V(\pi_0)$ and $r(\pi_0', \delta_{\pi_0'}) = V(\pi_0')$. Since $r(\pi_0, \delta_{\pi_0'})$, as a function of $\pi_0$, is a straight line, it has to be tangent to $V(\pi_0)$ at $\pi_0 = \pi_0'$. If $V(\pi_0)$ is differentiable at $\pi_0'$, we have
$$V'(\pi_0') = \frac{d\, r(\pi_0, \delta_{\pi_0'})}{d\pi_0} = R_0(\delta_{\pi_0'}) - R_1(\delta_{\pi_0'}).$$
Now consider the case where $V(\pi_0)$ has an interior maximum but is not differentiable at that point. In this case we define two decision rules, $\delta_L^- = \lim_{\pi_0 \uparrow \pi_L} \delta_{\pi_0}$ and $\delta_L^+ = \lim_{\pi_0 \downarrow \pi_L} \delta_{\pi_0}$.
The critical regions for these two decision rules are
$$\Gamma_1^- = \{\, y \in \Gamma \mid (1 - \pi_L)(C_{01} - C_{11})\, p(y \mid H_1) \ge \pi_L (C_{10} - C_{00})\, p(y \mid H_0) \,\}$$
and
$$\Gamma_1^+ = \{\, y \in \Gamma \mid (1 - \pi_L)(C_{01} - C_{11})\, p(y \mid H_1) > \pi_L (C_{10} - C_{00})\, p(y \mid H_0) \,\}.$$
Take a number $q \in [0, 1]$ and devise a decision rule $\tilde{\delta}_L$ that uses the decision rule $\delta_L^+$ with probability $q$ and uses $\delta_L^-$ with probability $1 - q$. This means that it decides $H_1$ if $y \in \Gamma_1^+$, decides $H_0$ if $y \in (\Gamma_1^-)^c$, and decides $H_1$ with probability $1 - q$ if $y$ lies on the boundary, i.e., if $y \in \Gamma_1^- \setminus \Gamma_1^+$.
Note that the Bayes risk is not a function of $q$, so $r(\pi_L, \tilde{\delta}_L) = V(\pi_L)$, but the conditional risks do depend on $q$:
$$R_j(\tilde{\delta}_L) = q\, R_j(\delta_L^+) + (1 - q)\, R_j(\delta_L^-), \qquad j = 0, 1.$$
To achieve $R_0(\tilde{\delta}_L) = R_1(\tilde{\delta}_L)$, we need to choose
$$q = \frac{R_0(\delta_L^-) - R_1(\delta_L^-)}{R_0(\delta_L^-) - R_1(\delta_L^-) - R_0(\delta_L^+) + R_1(\delta_L^+)}.$$
Note that $V'(\pi_L^{\mp}) = R_0(\delta_L^{\mp}) - R_1(\delta_L^{\mp})$ (the one-sided derivatives of $V$ at $\pi_L$), so we have
$$q = \frac{V'(\pi_L^-)}{V'(\pi_L^-) - V'(\pi_L^+)}.$$
This is called a randomized decision rule.
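As a concrete illustration (my addition, using a hypothetical two-point observation model chosen so that $V$ has a kink at its maximum), this Python sketch computes the randomization probability $q$ from the one-sided slopes and verifies that the mixture equalizes the conditional risks.

```python
import numpy as np

# Hypothetical discrete observation Y in {0, 1} with uniform costs.
p0 = np.array([0.9, 0.1])   # p(y | H0)
p1 = np.array([0.5, 0.5])   # p(y | H1)

# The two limiting Bayes rules at the kink pi_L = 5/14:
# delta_minus: always decide H1;  delta_plus: decide H1 iff y = 1.
R0_minus, R1_minus = 1.0, 0.0          # conditional risks of delta_minus
R0_plus,  R1_plus  = p0[1], p1[0]      # conditional risks of delta_plus

# One-sided slopes of V at pi_L: V'(pi_L -/+) = R0 - R1 of each limiting rule.
a = R0_minus - R1_minus    # left slope, >= 0
b = R0_plus - R1_plus      # right slope, <= 0
q = a / (a - b)            # q = V'(pi_L-) / (V'(pi_L-) - V'(pi_L+))

R0_mix = q * R0_plus + (1 - q) * R0_minus
R1_mix = q * R1_plus + (1 - q) * R1_minus
print(q, R0_mix, R1_mix)   # q ~ 0.714; equalized risks ~ 0.357
```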
Example: Measurement with Gaussian Error
Consider the measurement with Gaussian error, with uniform costs.
The function $V(\pi_0)$ can be written as
$$V(\pi_0) = \pi_0\, Q\!\left(\frac{\tau' - \mu_0}{\sigma}\right) + (1 - \pi_0)\left(1 - Q\!\left(\frac{\tau' - \mu_1}{\sigma}\right)\right),$$
with
$$\tau' = \frac{\sigma^2}{\mu_1 - \mu_0} \log\!\left(\frac{\pi_0}{1 - \pi_0}\right) + \frac{\mu_0 + \mu_1}{2}.$$
We can find the rule making the conditional risks $R_0(\delta)$ and $R_1(\delta)$ equal by letting
$$Q\!\left(\frac{\tau' - \mu_0}{\sigma}\right) = 1 - Q\!\left(\frac{\tau' - \mu_1}{\sigma}\right)$$
and solving for $\tau'$.
We can solve this by inspection and get
$$\tau_L' = \frac{\mu_0 + \mu_1}{2}.$$
So, the minimax decision rule is
$$\delta_{\pi_L}(y) = \begin{cases} 1 & \text{if } y \ge (\mu_0 + \mu_1)/2 \\ 0 & \text{if } y < (\mu_0 + \mu_1)/2. \end{cases}$$
[Figure: conditional risks for the measurement with Gaussian error.]
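A quick numerical check (my addition, with hypothetical values of $\mu_0$, $\mu_1$, $\sigma$) that the midpoint threshold equalizes the two conditional risks:

```python
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma = 0.0, 1.0, 1.0       # hypothetical model parameters
tau_L = (mu0 + mu1) / 2               # minimax threshold from the slide

R0 = norm.sf(tau_L, loc=mu0, scale=sigma)   # false-alarm probability
R1 = norm.cdf(tau_L, loc=mu1, scale=sigma)  # miss probability
print(R0, R1)  # equal: both are Q((mu1 - mu0)/(2*sigma)) ~ 0.3085
```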
Neyman-Pearson Hypothesis Testing
In Bayes hypothesis testing, as well as in minimax, we are concerned with the average risk, i.e., the conditional risk averaged over the two hypotheses. The Neyman-Pearson test, on the other hand, recognizes the asymmetry between the two hypotheses. It tries to minimize one of the two conditional risks with the other conditional risk fixed (or bounded).
In testing the two hypotheses $H_0$ and $H_1$, the following situations may arise:
• $H_0$ is true but $H_1$ is decided. This is called a type I error or a false alarm. The terminology comes from the radar application, where $H_0$ represents "no target" and $H_1$ is the case of "target present". The probability of this event is called the false alarm probability or false alarm rate and is denoted by $P_F(\delta)$.
• $H_1$ is true but $H_0$ is decided. This is called a type II error or a miss. The probability of this event is called the miss probability and is denoted by $P_M(\delta)$.
• $H_0$ is true and $H_0$ is decided. The probability of this event is $1 - P_F(\delta)$.
• $H_1$ is true and $H_1$ is decided. This case represents a detection. The detection probability is $P_D(\delta) = 1 - P_M(\delta)$.
In testing $H_0$ versus $H_1$, one has to trade off between the probabilities of the two types of errors. The Neyman-Pearson criterion makes this tradeoff by bounding the false alarm probability and minimizing the miss probability subject to this constraint, i.e., the Neyman-Pearson test is
$$\max_{\delta}\, P_D(\delta) \quad \text{subject to} \quad P_F(\delta) \le \alpha,$$
where $\alpha$ is the bound on the false alarm rate. It is called the level of the test.
To obtain a general solution to the Neyman-Pearson test, we need to define a randomized decision rule. We define the randomized test
$$\tilde{\delta}_{\pi_L}(y) = \begin{cases} 1 & \text{if } L(y) > \tau_L \\ q & \text{if } L(y) = \tau_L \\ 0 & \text{if } L(y) < \tau_L, \end{cases}$$
where $\tau_L$ is the threshold corresponding to $\pi_L$.
While in a non-randomized rule $\delta(y)$ gives the decision, in a randomized rule $\tilde{\delta}_{\pi_L}(y)$ gives the probability of deciding $H_1$.
Then we have
$$P_F(\tilde{\delta}) = E_0\{\tilde{\delta}(Y)\} = \int_{\Gamma} \tilde{\delta}(y)\, p(y \mid H_0)\, dy,$$
where $E_0\{\cdot\}$ denotes expectation under hypothesis $H_0$. Also,
$$P_D(\tilde{\delta}) = E_1\{\tilde{\delta}(Y)\} = \int_{\Gamma} \tilde{\delta}(y)\, p(y \mid H_1)\, dy.$$
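To make the definition concrete, here is a small Monte Carlo sketch (my addition, with a hypothetical Gaussian model and illustrative threshold) estimating $P_F$ and $P_D$ of a randomized threshold test:

```python
import numpy as np

rng = np.random.default_rng(0)
mu0, mu1, sigma = 0.0, 1.0, 1.0   # hypothetical Gaussian model
tau, q = 0.5, 0.3                 # illustrative threshold and boundary probability

def delta_tilde(y):
    """Randomized rule: the probability of deciding H1 given observation y."""
    return np.where(y > tau, 1.0, np.where(y == tau, q, 0.0))

# P_F = E0{delta_tilde(Y)} and P_D = E1{delta_tilde(Y)}, estimated by Monte Carlo.
y0 = rng.normal(mu0, sigma, 100_000)
y1 = rng.normal(mu1, sigma, 100_000)
print("P_F ~", delta_tilde(y0).mean(), " P_D ~", delta_tilde(y1).mean())
```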
Neyman-Pearson Lemma
Consider a hypothesis pair $H_0$ and $H_1$:
$$H_0: Y \sim P_0$$
versus
$$H_1: Y \sim P_1,$$
where $P_j$ has density $p_j(y) = p(y \mid H_j)$ for $j = 0, 1$. For $\alpha > 0$, the following statements are true:
1. Optimality: Let $\tilde{\delta}$ be any decision rule satisfying $P_F(\tilde{\delta}) \le \alpha$. Let $\tilde{\delta}'$ be any decision rule of the form
$$\tilde{\delta}'(y) = \begin{cases} 1 & \text{if } p(y \mid H_1) > \eta\, p(y \mid H_0) \\ \gamma(y) & \text{if } p(y \mid H_1) = \eta\, p(y \mid H_0) \\ 0 & \text{if } p(y \mid H_1) < \eta\, p(y \mid H_0), \end{cases} \qquad (A)$$
where $\eta \ge 0$ and $0 \le \gamma(y) \le 1$ are such that $P_F(\tilde{\delta}') = \alpha$. Then $P_D(\tilde{\delta}') \ge P_D(\tilde{\delta})$.
This means that any size-$\alpha$ decision rule of form (A) is a Neyman-Pearson rule.
2. Existence: For any $\alpha \in (0, 1)$ there is a decision rule, $\tilde{\delta}_{NP}$, of form (A) with $\gamma(y) = \gamma_0$ for which $P_F(\tilde{\delta}_{NP}) = \alpha$.
3. Uniqueness: Suppose that $\tilde{\delta}''$ is any Neyman-Pearson rule of size $\alpha$ for $H_0$ versus $H_1$. Then $\tilde{\delta}''$ must be of the form (A).
Neyman-Pearson Lemma (Proof)
1. Note that, by definition, we always have $[\tilde{\delta}'(y) - \tilde{\delta}(y)][\,p(y \mid H_1) - \eta\, p(y \mid H_0)\,] \ge 0$ (why?).
So we have
$$\int_{\Gamma} [\tilde{\delta}'(y) - \tilde{\delta}(y)][\,p(y \mid H_1) - \eta\, p(y \mid H_0)\,]\, dy \ge 0.$$
Expanding the above expression, we get
$$\int_{\Gamma} \tilde{\delta}'(y)\, p(y \mid H_1)\, dy - \int_{\Gamma} \tilde{\delta}(y)\, p(y \mid H_1)\, dy \ge \eta \left( \int_{\Gamma} \tilde{\delta}'(y)\, p(y \mid H_0)\, dy - \int_{\Gamma} \tilde{\delta}(y)\, p(y \mid H_0)\, dy \right).$$
Applying the expressions for the detection probability and the false alarm rate, we have
$$P_D(\tilde{\delta}') - P_D(\tilde{\delta}) \ge \eta\, [P_F(\tilde{\delta}') - P_F(\tilde{\delta})] = \eta\, [\alpha - P_F(\tilde{\delta})] \ge 0.$$
2. Let $\eta_0$ be the smallest number such that (look at the figure on the next slide)
$$P_0[\,p(Y \mid H_1) > \eta_0\, p(Y \mid H_0)\,] \le \alpha.$$
Then, if $P_0[\,p(Y \mid H_1) > \eta_0\, p(Y \mid H_0)\,] < \alpha$, choose
$$\gamma_0 = \frac{\alpha - P_0[\,p(Y \mid H_1) > \eta_0\, p(Y \mid H_0)\,]}{P_0[\,p(Y \mid H_1) = \eta_0\, p(Y \mid H_0)\,]}.$$
Otherwise, choose $\gamma_0$ arbitrarily. Consider a Neyman-Pearson decision rule, $\tilde{\delta}_{NP}$, with $\eta = \eta_0$ and $\gamma(y) = \gamma_0$. For this decision rule, the false alarm rate is
$$P_F(\tilde{\delta}_{NP}) = E_0\{\tilde{\delta}_{NP}(Y)\} = P_0[\,p(Y \mid H_1) > \eta_0\, p(Y \mid H_0)\,] + \gamma_0\, P_0[\,p(Y \mid H_1) = \eta_0\, p(Y \mid H_0)\,] = \alpha.$$
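As an illustration of this construction (my addition), consider a hypothetical discrete observation where the likelihood ratio has atoms, so the randomization $\gamma_0$ is genuinely needed to hit the level $\alpha$ exactly:

```python
import numpy as np

# Hypothetical discrete model: Y in {0, 1, 2} with these pmfs.
p0 = np.array([0.6, 0.3, 0.1])   # p(y | H0)
p1 = np.array([0.1, 0.3, 0.6])   # p(y | H1)
alpha = 0.25
L = p1 / p0                      # likelihood ratio at each y

# eta0: smallest eta with P0[L(Y) > eta] <= alpha (search the jump points).
candidates = np.sort(np.unique(L))
eta0 = next(eta for eta in candidates if p0[L > eta].sum() <= alpha)

# gamma0 fills the gap between alpha and the strict-inequality mass.
gamma0 = (alpha - p0[L > eta0].sum()) / p0[L == eta0].sum()

P_F = p0[L > eta0].sum() + gamma0 * p0[L == eta0].sum()
P_D = p1[L > eta0].sum() + gamma0 * p1[L == eta0].sum()
print(eta0, gamma0, P_F, P_D)   # eta0 = 1.0, gamma0 = 0.5, P_F = 0.25, P_D = 0.75
```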
[Figure: the choice of $\eta_0$ as the smallest $\eta$ for which $P_0[\,p(Y \mid H_1) > \eta\, p(Y \mid H_0)\,] \le \alpha$.]
3. See the text.
Neyman-Pearson Lemma (Example): Measurement with Gaussian Error
For this problem, we have
$$P_0[\,p(Y \mid H_1) > \eta\, p(Y \mid H_0)\,] = P_0[L(Y) > \eta] = P_0(Y > \eta') = Q\!\left(\frac{\eta' - \mu_0}{\sigma}\right) = 1 - \Phi\!\left(\frac{\eta' - \mu_0}{\sigma}\right),$$
where
$$\eta' = \frac{\sigma^2}{\mu_1 - \mu_0} \log(\eta) + \frac{\mu_0 + \mu_1}{2}.$$
Any value of $\alpha$ can be achieved by choosing
$$\eta_0' = \sigma\, Q^{-1}(\alpha) + \mu_0 = \sigma\, \Phi^{-1}(1 - \alpha) + \mu_0.$$
Since $P(Y = \eta_0') = 0$, the choice of $\gamma_0$ is arbitrary and we can choose $\gamma_0 = 1$. So we have
$$\tilde{\delta}_{NP}(y) = \begin{cases} 1 & \text{if } y \ge \eta_0' \\ 0 & \text{if } y < \eta_0'. \end{cases}$$
The detection probability for $\tilde{\delta}_{NP}$ is
$$P_D(\tilde{\delta}_{NP}) = E_1\{\tilde{\delta}_{NP}(Y)\} = P_1(Y \ge \eta_0') = Q\!\left(\frac{\eta_0' - \mu_1}{\sigma}\right) = Q\!\left(Q^{-1}(\alpha) + \frac{\mu_0 - \mu_1}{\sigma}\right) = Q\!\left(Q^{-1}(\alpha) - d\right),$$
where $d = (\mu_1 - \mu_0)/\sigma$.
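A short numerical sketch of this result (my addition, with hypothetical parameters): compute $\eta_0'$ and $P_D = Q(Q^{-1}(\alpha) - d)$ in closed form and confirm by Monte Carlo.

```python
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma = 0.0, 1.0, 1.0        # hypothetical model parameters
alpha = 0.1
d = (mu1 - mu0) / sigma                # detectability index

eta0p = sigma * norm.isf(alpha) + mu0  # eta_0' = sigma * Q^{-1}(alpha) + mu0
P_D = norm.sf(norm.isf(alpha) - d)     # P_D = Q(Q^{-1}(alpha) - d)

# Monte Carlo check of the closed form.
rng = np.random.default_rng(1)
y1 = rng.normal(mu1, sigma, 200_000)
print(P_D, np.mean(y1 >= eta0p))       # ~0.39 for alpha = 0.1, d = 1
```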