EFFICIENT ESTIMATOR AND
LIMIT OF EXPERIMENT
BY: ERIC IGABE
14.02.2015
Outline

 Introduction
 Efficient estimator
 Local asymptotic normality
 Convolution theorem
 Likelihood ratio process
 Asymptotic representation theorem
 Asymptotic normality
 Uniform distribution
 Pareto distribution
 Asymptotic mixed normality
Introduction

 Mathematical formulation of asymptotic efficiency in estimation theory, from the point of view of the concentration of estimators around the true parameter value
 We need some notions:
• Fisher information
• Concentration probability
• Regular estimator
• Locally asymptotically minimax
Cramér–Rao Inequality Theorem

Let $X = (X_1, \dots, X_n)^T$, where $X_1, \dots, X_n$ are i.i.d. with $\mathcal{L}(\theta) \in \{P_{\theta,n} : \theta \in \Theta\}$, $\Theta \subset \mathbb{R}$. Assume that $P_{\theta,n}$ has a density $f_\theta$ which is continuously differentiable w.r.t. $\theta$ for almost all $x$. Let $T_n$ be an estimator with bias $b_n(\theta)$. If the derivative $b_n'(\theta)$ w.r.t. $\theta$ exists, then

$$\mathrm{MSE}_\theta(T_n) \ge \frac{\bigl(b_n'(\theta) + 1\bigr)^2}{I(\theta)}$$
Efficient estimator

Let $\{P_\theta : \theta \in \Theta\}$ be a parametric model and $X = (X_1, \dots, X_n)$ the data sampled from this model.
Let $T_n = T_n(X)$ be an estimator for the parameter $\theta$.
If the estimator $T_n$ is unbiased, i.e. $\mathbb{E}[T_n(X)] = \theta$, then the Cramér–Rao inequality states that the variance of this estimator is bounded from below:

$$\mathrm{Var}_\theta(T_n) \ge \frac{1}{I(\theta)}$$

where $I(\theta)$ is the Fisher information of the model at the point $\theta$.
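As a quick numerical illustration (not part of the original slides), the following Python sketch checks the Cramér–Rao bound by Monte Carlo for the exponential model with mean $\mu$, where $I(\mu) = 1/\mu^2$ per observation and the sample mean is unbiased; the values of mu, n and reps are arbitrary choices.

```python
# Monte Carlo check of the Cramér–Rao bound for X_1,...,X_n i.i.d.
# Exponential with mean mu: I(mu) = 1/mu^2 per observation, so the
# bound for an unbiased estimator based on n observations is mu^2/n.
import numpy as np

rng = np.random.default_rng(0)
mu, n, reps = 2.0, 50, 200_000

samples = rng.exponential(scale=mu, size=(reps, n))
T = samples.mean(axis=1)          # unbiased estimator of mu

print(T.var())                    # empirical variance of T_n (~0.08)
print(mu ** 2 / n)                # Cramér–Rao bound 1/(n I(mu)) = 0.08
```

Here the sample mean attains the bound exactly, which is the textbook example of an efficient estimator.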
Note

Under certain restrictions, the normal distribution with mean zero and covariance the inverse Fisher information is an asymptotic lower bound for estimating $\psi(\theta) = \theta$ in smooth parametric models.
The statement

$$\sqrt{n}\bigl(T_n - \psi(\theta)\bigr) \xrightarrow{\theta} \mathcal{N}\bigl(\mu(\theta), \sigma^2(\theta)\bigr)$$

implies that $T_n$ is approximately normally distributed, with mean and variance given by $\psi(\theta) + \mu(\theta)/\sqrt{n}$ and $\sigma^2(\theta)/n$.
Note

The concentration of a normal limit distribution $L_\theta$ cannot be measured by mean and variance alone. Instead, we can employ a variety of concentration measures, such as:
 $\int x^2 \, dL_\theta(x)$ (second moment)
 $\int |x| \, dL_\theta(x)$ (first absolute moment)
 $\int \mathbb{1}\{|x| > a\} \, dL_\theta(x)$ (tail probability)
 $\int (|x| \wedge a) \, dL_\theta(x)$
 A limit distribution is "good" if quantities of this type are "small"
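The sketch below (an illustration, not from the slides) evaluates these four quantities numerically for a hypothetical normal limit law $L_\theta = \mathcal{N}(0, \sigma^2)$; sigma and the cutoff a are arbitrary choices.

```python
# Concentration measures of a candidate limit law L = N(0, sigma^2),
# computed by numerical integration; sigma and a are illustrative.
import numpy as np
from scipy import integrate
from scipy.stats import norm

sigma, a = 1.5, 1.0
pdf = norm(scale=sigma).pdf

second = integrate.quad(lambda x: x ** 2 * pdf(x), -np.inf, np.inf)[0]
first = integrate.quad(lambda x: abs(x) * pdf(x), -np.inf, np.inf)[0]
tail = 2 * norm(scale=sigma).sf(a)                 # integral of 1{|x| > a} dL
capped = integrate.quad(lambda x: min(abs(x), a) * pdf(x), -np.inf, np.inf)[0]

print(second, first, tail, capped)                 # all small iff L is concentrated
```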
Loss functions

A loss function is a nonnegative function $\ell(t, \theta)$ which is increasing in the distance between the estimated value $t$ and $\theta$, and its integral $R(t, \theta) = \mathbb{E}[\ell(t, \theta)] = \int \ell \, dL_\theta$ is called the asymptotic risk of the estimator.
The method of measuring concentration by means of loss functions applies to one- and higher-dimensional parameters alike.
Hodges' estimator

Suppose $T_n$ is a "common" estimator for some parameter $\theta$: it is consistent and converges to some asymptotic distribution $L_\theta$ (usually $\mathcal{N}(0, \sigma^2)$, which may depend on $\theta$) at the $\sqrt{n}$-rate, i.e. $\sqrt{n}(T_n - \theta) \xrightarrow{\theta} L_\theta$.
Then Hodges' estimator is

$$S_n = \begin{cases} T_n & \text{if } |T_n| \ge n^{-1/4} \\ 0 & \text{if } |T_n| < n^{-1/4} \end{cases}$$
Hodges' estimator

It is not difficult to see that $S_n$ is consistent for $\theta$, and its asymptotic distribution is:
 $n^{\alpha}(S_n - \theta) \xrightarrow{\theta} 0$ when $\theta = 0$, for every $\alpha \in \mathbb{R}$
 $\sqrt{n}(S_n - \theta) \xrightarrow{\theta} L_\theta$ when $\theta \ne 0$
Quadratic risk function of Hodges' estimator

[Figure: quadratic risk of Hodges' estimator as a function of $\theta$, for three sample sizes; described on the next slide]
Quadratic risk function of Hodges' estimator

 The graph is based on the mean of samples of size $n = 5$ (blue curve), $n = 50$ (purple curve) and $n = 500$ (olive curve) observations from the $\mathcal{N}(\theta, 1)$-distribution
 As $n \to \infty$, the locations and widths of the peaks converge to zero, but their heights to infinity
 Hodges' estimator has better asymptotic behavior at $\theta = 0$, at the cost of erratic behavior close to zero
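A Monte Carlo sketch along these lines reproduces the qualitative shape of the risk curves; the replication count and grid are arbitrary choices, while the sample sizes follow the slide.

```python
# Normalized quadratic risk n*E(S_n - theta)^2 of Hodges' estimator
# over a grid of theta values; peaks near theta = 0 grow with n.
import numpy as np

rng = np.random.default_rng(1)
thetas = np.linspace(-1.5, 1.5, 61)
reps = 5_000

for n in (5, 50, 500):
    risk = []
    for theta in thetas:
        T = rng.normal(theta, 1.0, size=(reps, n)).mean(axis=1)  # sample mean
        S = np.where(np.abs(T) >= n ** -0.25, T, 0.0)            # Hodges' truncation
        risk.append(n * np.mean((S - theta) ** 2))
    print(n, round(max(risk), 2))    # peak risk increases with n
```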
Relative efficiency

Let $T_{n_{\vartheta,1}}$ and $T_{n_{\vartheta,2}}$ be estimator sequences for the parameter $\theta$, based on samples of sizes $n_{\vartheta,1}$ and $n_{\vartheta,2}$ respectively, such that

$$\mathrm{MSE}(T_{n_{\vartheta,1}}) = \mathrm{MSE}(T_{n_{\vartheta,2}})$$

If $\alpha = \lim_{\vartheta \to \infty} \frac{n_{\vartheta,2}}{n_{\vartheta,1}}$ exists, then it is called the relative efficiency of the estimators. In general it depends on the parameter $\theta$.
Relative efficiency

As the time index $\vartheta \to \infty$, the requirement $\sqrt{\vartheta}\bigl(T_{n_\vartheta} - \psi(\theta)\bigr) \xrightarrow{\theta} \mathcal{N}(0, 1)$ can be written as $\sqrt{n_\vartheta}\bigl(T_{n_\vartheta} - \psi(\theta)\bigr) \xrightarrow{\theta} \mathcal{N}\bigl(0, \sigma^2(\theta)\bigr)$, as $n_\vartheta \to \infty$.

Thus the relative efficiency of two estimator sequences with asymptotic variances $\sigma_i^2(\theta)$ is just

$$\alpha = \lim_{\vartheta \to \infty} \frac{n_{\vartheta,2}}{n_{\vartheta,1}} = \frac{\sigma_2^2(\theta)}{\sigma_1^2(\theta)}$$

If $\alpha > 1$, then the second estimator sequence needs proportionally that many more observations than the first to achieve the same asymptotic precision.
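For instance (a classical illustration, not on the slide), for samples from $\mathcal{N}(\theta, 1)$ the sample mean has $\sigma_1^2 = 1$ while the sample median has $\sigma_2^2 = \pi/2$, so the relative efficiency is $\pi/2 \approx 1.57$; the sketch below checks this by simulation, with theta, n and reps as arbitrary choices.

```python
# Relative efficiency of sample mean vs. sample median for N(theta, 1):
# the ratio of rescaled variances should approach pi/2 ~ 1.57.
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 0.3, 200, 20_000

x = rng.normal(theta, 1.0, size=(reps, n))
var_mean = n * x.mean(axis=1).var()          # ~ sigma_1^2 = 1
var_med = n * np.median(x, axis=1).var()     # ~ sigma_2^2 = pi/2

print(var_mean, var_med, var_med / var_mean) # ratio ~ 1.57
```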
Local asymptotic normality

Suppose $X = (X_1, \dots, X_n)$ is a sample from $\mathcal{L}(\theta) \in \{P_\theta : \theta \in \Theta\}$ on some measurable space $(\mathcal{X}, \mathcal{A})$.
The full observation is a single observation from the product $P_\theta^n$ of $n$ copies of $P_\theta$.
The statistical model is completely described as the collection of probability measures $\{P_\theta^n : \theta \in \Theta \subset \mathbb{R}^k\}$ on the sample space $(\mathcal{X}^n, \mathcal{A}^n)$.
Local asymptotic normality

If the parametrization is centered around a fixed parameter $\theta_0$, we define the local parameter $h = \sqrt{n}(\theta - \theta_0)$ and rewrite $P_\theta^n$ as $P^n_{\theta_0 + h/\sqrt{n}}$.
Thus we obtain an experiment with parameter $h$.
Definition

A sequence of parametric statistical models $\{P_{\theta,n} : \theta \in \Theta\}$, $\Theta \subset \mathbb{R}^k$, is said to be locally asymptotically normal (LAN) at $\theta$ if

$$\log \prod_{i=1}^n \frac{p_{\theta + h/\sqrt{n}}}{p_\theta}(X_i) = h \Delta_{n,\theta} - \frac{1}{2} I_\theta h^2 + o_{P_\theta}(1)$$

where $\Delta_{n,\theta} = \frac{1}{\sqrt{n}} \sum_{i=1}^n \dot\ell_\theta(X_i)$ is asymptotically normal with mean zero and variance $I_\theta$, by the central limit theorem.
The second term in the expansion is asymptotically equivalent to $-\frac{1}{2} h^2 I_\theta$ by the law of large numbers.
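The expansion can be checked numerically. The sketch below (an illustration, not from the slides) does so for the Poisson($\theta$) model, where $\dot\ell_\theta(x) = x/\theta - 1$ and $I_\theta = 1/\theta$, so the remainder should shrink as $n$ grows; theta and h are arbitrary.

```python
# LAN expansion check for Poisson(theta): the log likelihood ratio
# minus (h*Delta_n - I_theta*h^2/2) should be o_P(1) as n grows.
import numpy as np

rng = np.random.default_rng(3)
theta, h = 2.0, 0.7

for n in (10 ** 2, 10 ** 4, 10 ** 6):
    x = rng.poisson(theta, size=n)

    def loglik(t):                     # log-factorial terms cancel in the ratio
        return np.sum(x * np.log(t) - t)

    lhs = loglik(theta + h / np.sqrt(n)) - loglik(theta)
    delta = np.sum(x / theta - 1) / np.sqrt(n)   # score sum / sqrt(n)
    rhs = h * delta - 0.5 * h ** 2 / theta       # h*Delta_n - (1/2) I_theta h^2
    print(n, lhs - rhs)                          # remainder -> 0
```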
Note

For large $n$, the experiments $(P^n_{\theta_0 + h/\sqrt{n}} : h \in \mathbb{R}^k)$ and $(\mathcal{N}(h, I_{\theta_0}^{-1}) : h \in \mathbb{R}^k)$ are similar in statistical properties, whenever the original experiments $\theta \mapsto P_\theta$ are smooth in the parameter.
Assumption

Let

$$\int \Bigl( \sqrt{p_{\theta+h}} - \sqrt{p_\theta} - \tfrac{1}{2} h^T \dot\ell_\theta \sqrt{p_\theta} \Bigr)^2 d\mu = o(\|h\|^2), \quad h \to 0 \qquad (A_1)$$

If this condition is satisfied, the model $(P_\theta : \theta \in \Theta)$ is differentiable in quadratic mean at $\theta$.
Usually $\tfrac{1}{2} h^T \dot\ell_\theta(x) \sqrt{p_\theta(x)}$ is the derivative of the map $h \mapsto \sqrt{p_{\theta+h}(x)}$ at $h = 0$ for (almost) every $x$.
In this case the score function is

$$\dot\ell_\theta(x) = 2 \frac{1}{\sqrt{p_\theta(x)}} \frac{\partial}{\partial \theta} \sqrt{p_\theta(x)} = \frac{\partial}{\partial \theta} \log p_\theta(x)$$
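As a numerical sanity check (not from the slides), the sketch below evaluates the integral in $(A_1)$ for the normal location family, where $\dot\ell_\theta(x) = x - \theta$ and $\mu$ is Lebesgue measure; the ratio to $h^2$ should tend to zero as $h \to 0$.

```python
# Differentiability in quadratic mean for the N(theta, 1) location family:
# the integrated squared remainder, divided by h^2, tends to 0 as h -> 0.
import numpy as np
from scipy import integrate
from scipy.stats import norm

theta = 0.0
for h in (1.0, 0.1, 0.01):
    def integrand(x):
        root_next = np.sqrt(norm.pdf(x, loc=theta + h))
        root_here = np.sqrt(norm.pdf(x, loc=theta))
        score_term = 0.5 * h * (x - theta) * root_here   # (1/2) h l_dot sqrt(p)
        return (root_next - root_here - score_term) ** 2

    val = integrate.quad(integrand, -30, 30)[0]
    print(h, val / h ** 2)      # ratio -> 0, i.e. the remainder is o(h^2)
```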
Theorem 1

Assume that the experiment $(P_\theta : \theta \in \Theta)$ is differentiable in quadratic mean at the point $\theta$, and let $T_n$ be statistics in the experiments $(P^n_{\theta + h/\sqrt{n}} : h \in \mathbb{R}^k)$ such that the sequence $T_n$ converges in distribution under every $h$. Then there exists a randomized statistic $T$ in the experiment $(\mathcal{N}(h, I_\theta^{-1}) : h \in \mathbb{R}^k)$ such that $T_n \xrightarrow{h} T$ for every $h$.
Lower bound for experiments

Consider parameters $\theta + h/\sqrt{n}$, for $\theta$ fixed and $h$ ranging over $\mathbb{R}^k$, and suppose that, for certain limit distributions $L_{\theta,h}$,

$$\sqrt{n}\Bigl(T_n - \psi\bigl(\theta + \tfrac{h}{\sqrt{n}}\bigr)\Bigr) \xrightarrow{\theta + h/\sqrt{n}} L_{\theta,h}, \quad \text{for every } h \qquad (*)$$

Then $T_n$ can be considered a good estimator for $\psi(\theta)$ if the limit distributions $L_{\theta,h}$ are maximally concentrated near zero.
If they are maximally concentrated for every $h$ and some fixed $\theta$, then $T_n$ can be considered locally optimal at $\theta$.
Lower bound for experiments

Suppose that the observations are a sample of size $n$ from a distribution $P_\theta$.
If $P_\theta$ depends smoothly on the parameter, then

$$(P^n_{\theta + h/\sqrt{n}} : h \in \mathbb{R}^k) \rightsquigarrow (\mathcal{N}(h, I_\theta^{-1}) : h \in \mathbb{R}^k)$$
Theorem 2

Assume that $(A_1)$ holds at the point $\theta$, let $\psi$ be differentiable at $\theta$, and let $T_n$ be estimators in the experiments $(P^n_{\theta + h/\sqrt{n}} : h \in \mathbb{R}^k)$ such that $(*)$ holds for every $h$. Then there exists a randomized statistic $T$ in the experiment $(\mathcal{N}(h, I_\theta^{-1}) : h \in \mathbb{R}^k)$ such that $T - \dot\psi_\theta h$ has distribution $L_{\theta,h}$ for every $h$.
Idea of the proof

Apply Theorem 1 to $S_n = \sqrt{n}(T_n - \psi(\theta))$. In view of the definition of $L_{\theta,h}$ and the differentiability of $\psi$, the sequence

$$S_n = \sqrt{n}\Bigl(T_n - \psi\bigl(\theta + \tfrac{h}{\sqrt{n}}\bigr)\Bigr) + \sqrt{n}\Bigl(\psi\bigl(\theta + \tfrac{h}{\sqrt{n}}\bigr) - \psi(\theta)\Bigr)$$

converges in distribution under $h$ to $L_{\theta,h} * \delta_{\dot\psi_\theta h}$, where $* \, \delta_h$ denotes a translation by $h$.
According to Theorem 1, there exists a randomized statistic $T$ in the normal experiment such that $T$ has distribution $L_{\theta,h} * \delta_{\dot\psi_\theta h}$ for every $h$. This satisfies the requirements. ∎
Gaussian models

Consider a single observation $X$ from the $\mathcal{N}(h, \Sigma)$-distribution; it is required to estimate $Ah$ for a given matrix $A$, where $\Sigma$ is a nonsingular covariance matrix.
A randomized estimator $T$ is called equivariant-in-law for estimating $Ah$ if the distribution of $T - Ah$ under $h$ does not depend on $h$.
Proposition 3

The null distribution $L$ of any randomized equivariant-in-law estimator of $Ah$ can be decomposed as $L = \mathcal{N}(0, A \Sigma A^T) * M$ for some probability measure $M$.
The only randomized equivariant-in-law estimator for which $M$ is degenerate at $0$ is $AX$.
Definition

Let $\ell$ be any loss function with values in $[0, \infty)$. It is called bowl-shaped if the sets $\{x : \ell(x) \le c\}$ are convex and symmetric about the origin; it is called subconvex if, moreover, these sets are closed.
Lemma (Anderson's lemma)

For any bowl-shaped loss function $\ell$ on $\mathbb{R}^k$, every probability measure $M$ on $\mathbb{R}^k$, and every covariance matrix $\Sigma$:

$$\int \ell \, d\mathcal{N}(0, \Sigma) \le \int \ell \, d\bigl[\mathcal{N}(0, \Sigma) * M\bigr]$$

Next consider the minimax criterion. According to this criterion the best estimator, relative to a given loss function, minimizes the maximum risk $\sup_h \mathbb{E}_h \ell(T - Ah)$ over all randomized estimators $T$. For every bowl-shaped loss function $\ell$ this leads again to the estimator $AX$.
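A one-dimensional Monte Carlo sketch of the lemma (all distributional choices here are arbitrary illustrations): the loss $\min(x^2, 4)$ is bowl-shaped, and convolving $\mathcal{N}(0, \Sigma)$ with a uniform $M$ can only increase the expected loss.

```python
# Anderson's lemma, numerically: E loss(Z) <= E loss(Z + M) for Z normal,
# any independent noise M, and a bowl-shaped loss.
import numpy as np

rng = np.random.default_rng(4)

def loss(x):                                 # bowl-shaped: min(x^2, 4)
    return np.minimum(x ** 2, 4.0)

z = rng.normal(0.0, 1.0, size=1_000_000)     # N(0, Sigma) with Sigma = 1
m = rng.uniform(-1.0, 1.0, size=z.size)      # draws from M

print(loss(z).mean(), loss(z + m).mean())    # first <= second, as the lemma asserts
```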
Proposition 4

For any bowl-shaped loss function $\ell$, the maximum risk of any randomized estimator $T$ of $Ah$ is bounded below by $\mathbb{E}_0 \ell(AX)$. Consequently, $AX$ is a minimax estimator for $Ah$. If $Ah$ is real and $\mathbb{E}_0 (AX)^2 \ell(AX) < \infty$, then $AX$ is the only minimax estimator for $Ah$, up to changes on sets of probability zero.
Convolution theorem

An estimator sequence $T_n$ is called regular at $\theta$ for estimating a parameter $\psi(\theta)$ if, for every $h$,

$$\sqrt{n}\Bigl(T_n - \psi\bigl(\theta + \tfrac{h}{\sqrt{n}}\bigr)\Bigr) \xrightarrow{\theta + h/\sqrt{n}} L_\theta$$

The probability measure $L_\theta$ may be arbitrary, but it should be the same for every $h$.
In terms of the limit distributions $L_{\theta,h}$ in $(*)$, regularity means exactly that all $L_{\theta,h}$ are equal, for the given $\theta$.
According to Theorem 1, every estimator sequence is matched by an estimator $T$ in the limit experiment $(\mathcal{N}(h, I_\theta^{-1}) : h \in \mathbb{R}^k)$.
Convolution theorem

A regular estimator sequence has the following property:

$$T - \dot\psi_\theta h \sim L_\theta \quad \text{for every } h \qquad (P_1)$$

Thus it is matched by an equivariant-in-law estimator for $\dot\psi_\theta h$.
Theorem 5 (Convolution theorem)

Assume that $(A_1)$ holds at the point $\theta$. Let $\psi$ be differentiable at $\theta$, and let $T_n$ be an at $\theta$ regular estimator sequence in the experiments $(P_\theta^n : \theta \in \Theta)$ with limit distribution $L_\theta$. Then there exists a probability measure $M_\theta$ such that

$$L_\theta = \mathcal{N}\bigl(0, \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T\bigr) * M_\theta$$

In particular, if $L_\theta$ has covariance matrix $\Sigma_\theta$, then the matrix $\Sigma_\theta - \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T$ is nonnegative definite.

Idea of proof: apply Theorem 2 to conclude that $L_\theta$ is the distribution of an equivariant-in-law estimator $T$ in the limit experiment satisfying $(P_1)$; next apply Proposition 3. ∎
Almost-everywhere convolution theorem

Hodges' example shows that there is no hope for a nontrivial lower bound for the limit distributions of a standardized estimator sequence $\sqrt{n}(T_n - \psi(\theta))$ for every $\theta$.
It is always possible to improve on an estimator sequence for selected parameters.
Theorem 6

Assume that $(A_1)$ holds at the point $\theta$. Let $\psi$ be differentiable at every $\theta$, and let $T_n$ be an estimator sequence in the experiments $(P_\theta^n : \theta \in \Theta)$ such that $\sqrt{n}(T_n - \psi(\theta))$ converges to a limit distribution $L_\theta$ under every $\theta$. Then there exist probability distributions $M_\theta$ such that for Lebesgue almost every $\theta$

$$L_\theta = \mathcal{N}\bigl(0, \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T\bigr) * M_\theta$$

In particular, if $L_\theta$ has covariance matrix $\Sigma_\theta$, then the matrix $\Sigma_\theta - \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T$ is nonnegative definite for Lebesgue almost every $\theta$.
Lemma 7

Let $T_n$ be estimators in experiments $(P_{n,\theta} : \theta \in \Theta)$ indexed by a measurable subset $\Theta$ of $\mathbb{R}^k$. Assume that the map $\theta \mapsto P_{n,\theta}(A)$ is measurable for every measurable set $A$ and every $n$, and that the map $\theta \mapsto \psi(\theta)$ is measurable. Suppose that there exist distributions $L_\theta$ such that for Lebesgue almost every $\theta$

$$r_n(T_n - \psi(\theta)) \xrightarrow{\theta} L_\theta$$

Then for every $\gamma_n \to 0$ there exists a subsequence of $\{n\}$ such that, for Lebesgue almost every $(\theta, h)$, along the subsequence,

$$r_n\bigl(T_n - \psi(\theta + \gamma_n h)\bigr) \xrightarrow{\theta + \gamma_n h} L_\theta$$
Local asymptotic minimax theorem

The normal $\mathcal{N}(0, \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T)$-distribution is the best possible limit also when judged by the minimax criterion, which gives a lower bound for the maximum risk over a small neighborhood of $\theta$.
In fact, it bounds the expression

$$\lim_{\delta \downarrow 0} \liminf_{n \to \infty} \sup_{\|\theta' - \theta\| < \delta} \mathbb{E}_{\theta'} \, \ell\bigl(\sqrt{n}(T_n - \psi(\theta'))\bigr)$$

This is the asymptotic maximum risk over an arbitrarily small neighborhood of $\theta$.
Theorem 8

Assume that $(A_1)$ holds at $\theta$ with nonsingular Fisher information matrix $I_\theta$. Let $\psi$ be differentiable at $\theta$, and let $T_n$ be any estimator sequence in the experiments $(P_\theta^n : \theta \in \mathbb{R}^k)$. Then for any bowl-shaped loss function $\ell$

$$\sup_{I} \liminf_{n \to \infty} \sup_{h \in I} \mathbb{E}_{\theta + h/\sqrt{n}} \, \ell\Bigl(\sqrt{n}\bigl(T_n - \psi(\theta + \tfrac{h}{\sqrt{n}})\bigr)\Bigr) \ge \int \ell \, d\mathcal{N}\bigl(0, \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T\bigr)$$

Here the first supremum is taken over all finite subsets $I$ of $\mathbb{R}^k$.
Likelihood ratio process (LRP)

For a fixed parameter $h_0 \in H$, the likelihood ratio process (LRP) with base $h_0$ is formed as

$$\Bigl( \frac{dP_h}{dP_{h_0}}(X) \Bigr)_{h \in H} \equiv \Bigl( \frac{p_h}{p_{h_0}}(X) \Bigr)_{h \in H}$$

Each LRP is a (typically infinite-dimensional) vector of random variables $\frac{dP_h}{dP_{h_0}}(X)$.
Definition

A sequence $\varepsilon_n = (\mathcal{X}_n, \mathcal{A}_n, P_{n,h} : h \in H)$ of experiments converges to a limit experiment $\varepsilon = (\mathcal{X}, \mathcal{A}, P_h : h \in H)$ if, for every finite subset $I \subset H$ and every $h_0 \in H$,

$$\Bigl( \frac{dP_{n,h}}{dP_{n,h_0}} \Bigr)_{h \in I} \xrightarrow{h_0} \Bigl( \frac{dP_h}{dP_{h_0}} \Bigr)_{h \in I}$$
Example (equivalence by sufficiency)

Let $S : \mathcal{X} \to \mathcal{Y}$ be a statistic in the statistical experiment $(\mathcal{X}, \mathcal{A}, P_h : h \in H)$ with values in the measurable space $(\mathcal{Y}, \mathcal{B})$.
The experiment of image laws $(\mathcal{Y}, \mathcal{B}, P_h \circ S^{-1} : h \in H)$ corresponds to observing $S$.
If $S$ is a sufficient statistic, then this experiment is equivalent to the original experiment $(\mathcal{X}, \mathcal{A}, P_h : h \in H)$.
Asymptotic representation theorem

A randomized statistic or test function $T$ in the experiment $(\mathcal{X}, \mathcal{A}, P_h : h \in H)$ is a measurable map $T : \mathcal{X} \times [0, 1] \to \mathbb{R}^k$ on the sample space.
The interpretation for a test is that if $x$ is observed, then the null hypothesis is rejected with probability $T(x)$.
The power function of the test $T$ is the function $h \mapsto \mathbb{E}_h[T(X)]$.
Theorem 9

Let $\varepsilon_n = (\mathcal{X}_n, \mathcal{A}_n, P_{n,h} : h \in H)$ be a sequence of experiments that converges to a dominated experiment $\varepsilon = (\mathcal{X}, \mathcal{A}, P_h : h \in H)$. Let $T_n$ be a sequence of statistics in $\varepsilon_n$ that converges in distribution for every $h$. Then there exists a randomized statistic $T$ in $\varepsilon$ such that $T_n \xrightarrow{h} T$ for every $h$.
Asymptotic normality

A sequence of statistical models $(P_{n,\theta} : \theta \in \Theta)$ indexed by an open subset $\Theta \subset \mathbb{R}^k$ is defined to be locally asymptotically normal at $\theta$ if the log likelihood ratios

$$\log \frac{dP_{n,\theta + r_n^{-1} h_n}}{dP_{n,\theta}}$$

allow a certain quadratic expansion.
This is shown to be valid in the case that $P_{n,\theta}$ is the distribution of a sample of size $n$ from a smooth parametric model.
Such experiments converge to simple normal limit experiments if they are reparametrized in terms of the local parameter $h$.
Theorem 10

Let $\varepsilon_n = (P_{n,h} : h \in H)$ be a sequence of experiments indexed by a subset $H$ of $\mathbb{R}^d$ such that

$$\log \frac{dP_{n,h}}{dP_{n,0}} = h^T \Delta_n - \frac{1}{2} h^T J h + o_{P_{n,0}}(1)$$

for a sequence of statistics $\Delta_n$ that converges weakly under $h = 0$ to a $\mathcal{N}(0, J)$-distribution. Then the sequence $\varepsilon_n$ converges to the experiment $(\mathcal{N}(Jh, J) : h \in H)$.
Corollary 11

Let $\Theta$ be an open subset of $\mathbb{R}^d$, and let the sequence of statistical models $(P_{n,\theta} : \theta \in \Theta)$ be locally asymptotically normal at $\theta$ with norming matrices $r_n$ and a nonsingular matrix $I_\theta$. Then the sequence of experiments $(P_{n,\theta + r_n^{-1} h} : h \in \mathbb{R}^d)$ converges to the experiment $(\mathcal{N}(h, I_\theta^{-1}) : h \in \mathbb{R}^d)$.
Uniform distribution

The family of uniform distributions on $[0, \theta]$ is nowhere differentiable in quadratic mean.
The reason is that the support of the uniform distribution depends too much on the parameter.
(Differentiability in quadratic mean $(A_1)$ does not by itself require that all densities $p_\theta$ have the same support.)
Theorem 12

Let $P_\theta^n$ be the distribution of a random sample of size $n$ from a uniform distribution on $[0, \theta]$. Then the sequence of experiments $(P^n_{\theta - h/n} : h \in \mathbb{R})$ converges for each fixed $\theta > 0$ to the experiment consisting of one observation from the shifted exponential density

$$z \mapsto \frac{1}{\theta} \, e^{-(z - h)/\theta} \, \mathbb{1}\{z > h\}$$

(define $P_\theta$ for $\theta < 0$ arbitrarily).

Proof on board.
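A simulation sketch of this convergence (not from the slides): under $P^n_{\theta - h/n}$ the statistic $Z_n = n(\theta - X_{(n)})$ is asymptotically a single draw from the shifted exponential above. It uses the fact that the maximum of $n$ uniforms on $[0, \theta']$ is distributed as $\theta' V^{1/n}$ with $V$ uniform on $[0, 1]$; the values of theta, h, n, reps are arbitrary.

```python
# Z_n = n(theta - max X_i) under the local parameter h should be
# approximately h + Exp(mean theta), the shifted exponential limit.
import numpy as np

rng = np.random.default_rng(5)
theta, h, n, reps = 1.0, 0.5, 2_000, 200_000

# max of n i.i.d. uniforms on [0, theta - h/n], via the probability transform
x_max = (theta - h / n) * rng.random(reps) ** (1.0 / n)
z = n * (theta - x_max)

print(z.mean(), h + theta)                           # limit mean: h + theta
print(np.quantile(z, 0.5), h + theta * np.log(2))    # limit median: h + theta*log 2
```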
Corollary 13

Let $T_n$ be estimators based on a sample $X_1, \dots, X_n$ from the uniform distribution on $[0, \theta]$ such that the sequence $n(T_n - \theta)$ converges under $\theta$ in distribution to a limit $L_\theta$, for every $\theta$. Then for Lebesgue almost every $\theta$ we have

$$\int |x| \, dL_\theta(x) \ge \mathbb{E}|Z - \mathrm{med}\,Z| \quad \text{and} \quad \int x^2 \, dL_\theta(x) \ge \mathbb{E}[Z - \mathbb{E}Z]^2$$

for a random variable $Z$ exponentially distributed with mean $\theta$.
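As a sketch of what these bounds mean in practice (an illustration, not from the slides; theta, n, reps arbitrary): for the bias-corrected maximum $T_n = \frac{n+1}{n} X_{(n)}$, the limit of $n(T_n - \theta)$ has variance $\theta^2 = \mathbb{E}[Z - \mathbb{E}Z]^2$, attaining the quadratic bound, while the uncorrected maximum has absolute risk $\approx \theta$ against the lower bound $\mathbb{E}|Z - \mathrm{med}\,Z| = \theta \log 2$.

```python
# Risks of estimators of theta for the uniform model vs. the bounds
# of Corollary 13, with Z ~ Exp(mean theta).
import numpy as np

rng = np.random.default_rng(6)
theta, n, reps = 1.0, 2_000, 200_000

x_max = theta * rng.random(reps) ** (1.0 / n)    # max of n uniforms on [0, theta]
T = (n + 1) / n * x_max                          # bias-corrected maximum
print((n * (T - theta)).var(), theta ** 2)       # attains the quadratic bound

print(np.abs(n * (x_max - theta)).mean(),        # plain maximum: absolute risk ~ theta
      theta * np.log(2))                         # lower bound E|Z - med Z|
```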
Pareto distribution

The Pareto distributions are a two-parameter family of distributions on the real line with parameters $\alpha > 0$ and $\mu > 0$ and density

$$x \mapsto \frac{\alpha \mu^\alpha}{x^{\alpha+1}} \, \mathbb{1}\{x > \mu\}$$

The likelihood ratio for a sample of size $n$ from the Pareto distributions with parameters $(\alpha + \tfrac{g}{\sqrt{n}}, \mu + \tfrac{h}{n})$ and $(\alpha + \tfrac{g_0}{\sqrt{n}}, \mu + \tfrac{h_0}{n})$, respectively, is equal to

$$\left( \frac{\alpha + g/\sqrt{n}}{\alpha + g_0/\sqrt{n}} \right)^{n} \frac{(\mu + h/n)^{n\alpha + \sqrt{n}\,g}}{(\mu + h_0/n)^{n\alpha + \sqrt{n}\,g_0}} \left( \prod_{i=1}^n X_i \right)^{(g_0 - g)/\sqrt{n}} \mathbb{1}\bigl\{X_{(1)} > \mu + \tfrac{h}{n}\bigr\}$$
Pareto distribution

This likelihood ratio can be rewritten as

$$\exp\Bigl[(g - g_0)\,\Delta_n - \frac{1}{2}\,\frac{g^2 - g_0^2}{\alpha^2} + o(1)\Bigr] \; e^{(h - h_0)\alpha/\mu + o(1)} \; \mathbb{1}\{Z_n > h\}$$

Here, under the parameter $(\alpha + \tfrac{g_0}{\sqrt{n}}, \mu + \tfrac{h_0}{n})$, the sequence

$$\Delta_n = -\frac{1}{\sqrt{n}} \sum_{i=1}^n \Bigl(\log \frac{X_i}{\mu} - \frac{1}{\alpha}\Bigr)$$

converges weakly to $\mathcal{N}\bigl(\tfrac{g_0}{\alpha^2}, \tfrac{1}{\alpha^2}\bigr)$.
Pareto distribution

The sequence $Z_n = n(X_{(1)} - \mu)$ converges in distribution to $h_0$ plus an exponential distribution with mean $\mu/\alpha$ and variance $(\mu/\alpha)^2$.
The two sequences are asymptotically independent.
Thus the likelihood is the product of a locally asymptotically normal and a "locally asymptotically exponential" factor.
The local limit experiment consists of observing a pair $(\Delta, Z)$ of independent variables, with a $\mathcal{N}(g, \alpha^2)$-distribution and an $h + \exp(\alpha/\mu)$-distribution, respectively.
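A simulation sketch of the two factors (taking $g_0 = h_0 = 0$, with arbitrary alpha, mu, n, reps): $\Delta_n$ should be approximately $\mathcal{N}(0, 1/\alpha^2)$, $Z_n = n(X_{(1)} - \mu)$ approximately exponential with mean $\mu/\alpha$, and the two approximately independent.

```python
# The normal and exponential factors of the Pareto likelihood, simulated
# at the base parameters (alpha, mu).
import numpy as np

rng = np.random.default_rng(7)
alpha, mu, n, reps = 3.0, 1.0, 500, 20_000

u = rng.random((reps, n))
x = mu * u ** (-1.0 / alpha)       # Pareto(alpha, mu) via the probability transform

delta = -(np.log(x / mu) - 1 / alpha).sum(axis=1) / np.sqrt(n)
z = n * (x.min(axis=1) - mu)

print(delta.var(), 1 / alpha ** 2)       # ~ 1/alpha^2
print(z.mean(), mu / alpha)              # ~ mu/alpha
print(np.corrcoef(delta, z)[0, 1])       # ~ 0: asymptotically independent
```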
Asymptotic mixed normality

A sequence of experiments $(P_{n,\theta} : \theta \in \Theta)$ indexed by an open subset of $\mathbb{R}^k$ is called locally asymptotically mixed normal at $\theta$ if there exist matrices $\gamma_{n,\theta} \to 0$ such that

$$\log \frac{dP_{n,\theta + \gamma_{n,\theta} h_n}}{dP_{n,\theta}} = h^T \Delta_{n,\theta} - \frac{1}{2} h^T J_{n,\theta} h + o_{P_{n,\theta}}(1)$$
Asymptotic mixed normality

for every converging sequence $h_n \to h$, and random vectors $\Delta_{n,\theta}$ and random matrices $J_{n,\theta}$ such that $(\Delta_{n,\theta}, J_{n,\theta}) \xrightarrow{\theta} (\Delta_\theta, J_\theta)$, for a random vector and matrix such that the conditional distribution of $\Delta_\theta$ given $J_\theta = J$ is normal $\mathcal{N}(0, J)$.
Locally asymptotically mixed normal is often abbreviated to LAMN. Local asymptotic normality is the special case in which the matrix $J_\theta$ is deterministic.
Theorem 14

Assume that the sequence of experiments $(P_{n,\theta} : \theta \in \Theta)$ is locally asymptotically mixed normal at $\theta$. Then the sequence of experiments $(P_{n,\theta + \gamma_n h} : h \in \mathbb{R}^k)$ converges to the experiment consisting of observing a pair $(\Delta, J)$ such that $J$ is marginally distributed as $J_\theta$ for every $h$, and the conditional distribution of $\Delta$ given $J$ is normal $\mathcal{N}(Jh, J)$.
THANK YOU!