EFFICIENT ESTIMATOR AND LIMIT OF EXPERIMENT
By: Eric Igabe, 14.02.2015

Outline
• Introduction
• Efficient estimator
• Local asymptotic normality
• Convolution theorem
• Likelihood ratio process
• Asymptotic representation theorem
• Asymptotic normality
• Uniform distribution
• Pareto distribution
• Asymptotic mixed normality

Introduction
Mathematical formulation of asymptotic efficiency in estimation theory, from the point of view of the concentration of estimators around the true parameter value. We need some ingredients:
• Fisher information
• Concentration probabilities
• Regular estimators
• Local asymptotic minimaxity

Cramér–Rao inequality (Theorem)
Let $X = (X_1, \dots, X_n)^T$, where $X_1, \dots, X_n$ are i.i.d. with law $\mathcal L(\theta) \in \{P_{\theta,n} : \theta \in \Theta\}$, $\Theta \subset \mathbb R$. Assume that $P_{\theta,n}$ has a density $f_\theta$ that is continuously differentiable in $\theta$ for almost all $x$. Let $T_n$ be an estimator with bias $b_n(\theta)$ whose derivative $b_n'(\theta)$ with respect to $\theta$ exists. Then
$$\mathrm{MSE}_\theta(T_n) \ \ge\ \frac{\bigl(b_n'(\theta) + 1\bigr)^2}{I(\theta)}.$$

Efficient estimator
Let $\{P_\theta : \theta \in \Theta\}$ be a parametric model and let $X = (X_1, \dots, X_n)$ be data sampled from this model. Let $T_n = T_n(X)$ be an estimator of the parameter $\theta$. If $T_n$ is unbiased, i.e. $\mathbb E_\theta[T_n(X)] = \theta$, then the Cramér–Rao inequality states that the variance of this estimator is bounded from below:
$$\mathrm{Var}_\theta(T_n) \ \ge\ \frac{1}{I(\theta)},$$
where $I(\theta)$ is the Fisher information of the model at the point $\theta$.
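As a concrete illustration (a minimal simulation sketch added here, not part of the original slides; the Bernoulli model and all numerical values are assumptions chosen for the demo), the sample mean of i.i.d. Bernoulli($\theta$) observations is unbiased, and its variance $\theta(1-\theta)/n$ exactly attains the Cramér–Rao bound, since $I(\theta) = n/(\theta(1-\theta))$ for a sample of size $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 0.3, 100, 200_000

# Fisher information of a Bernoulli(theta) sample of size n: I(theta) = n / (theta (1 - theta))
fisher = n / (theta * (1.0 - theta))
crlb = 1.0 / fisher                            # Cramer-Rao lower bound for unbiased estimators

# Monte Carlo variance of the sample mean, an unbiased estimator of theta
means = rng.binomial(n, theta, size=reps) / n
print(f"CRLB            : {crlb:.5f}")         # 0.00210
print(f"Var(sample mean): {means.var():.5f}")  # approximately 0.00210: the bound is attained
```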
Note
Under certain restrictions, the normal distribution with mean zero and covariance the inverse Fisher information is an asymptotic lower bound for estimating $\psi(\theta) = \theta$ in smooth parametric models. The statement
$$\sqrt n\,\bigl(T_n - \psi(\theta)\bigr) \rightsquigarrow_\theta \mathcal N\bigl(\mu(\theta), \sigma^2(\theta)\bigr)$$
implies that $T_n$ is approximately normally distributed, with mean and variance given by
$$\psi(\theta) + \frac{\mu(\theta)}{\sqrt n} \qquad\text{and}\qquad \frac{\sigma^2(\theta)}{n}.$$

Note
The concentration of a normal limit distribution $L_\theta$ cannot be measured by mean and variance alone. Instead, we can employ a variety of concentration measures, such as:
• $\int x^2\,dL_\theta(x)$ (second moment)
• $\int |x|\,dL_\theta(x)$ (first absolute moment)
• $\int \mathbb 1\{|x| > a\}\,dL_\theta(x)$ (tail probability)
• $\int (|x| \wedge a)\,dL_\theta(x)$
A limit distribution is "good" if quantities of this type are "small".

Loss functions
A loss function is a nonnegative function $\ell(t, \theta)$ that increases in the distance between the estimated value $t$ and $\theta$; its integral
$$R(t, \theta) = \mathbb E[\ell(t, \theta)] = \int \ell\,dL_\theta$$
is called the asymptotic risk of the estimator. Measuring concentration by means of loss functions applies to one- and higher-dimensional parameters alike.

Hodges' estimator
Suppose $T_n$ is a "common" estimator of some parameter $\theta$: it is consistent and converges at the $\sqrt n$-rate to some asymptotic distribution $L_\theta$, usually $\mathcal N(0, \sigma^2)$, which may depend on $\theta$:
$$\sqrt n\,(T_n - \theta) \rightsquigarrow_\theta L_\theta.$$
Then Hodges' estimator is
$$S_n = \begin{cases} T_n & \text{if } |T_n| \ge n^{-1/4}, \\ 0 & \text{if } |T_n| < n^{-1/4}. \end{cases}$$

Hodges' estimator
It is not difficult to see that $S_n$ is consistent for $\theta$, and its asymptotic distribution is
$$n^\alpha (S_n - \theta) \rightsquigarrow_\theta 0 \text{ when } \theta = 0,\ \text{for every } \alpha \in \mathbb R; \qquad \sqrt n\,(S_n - \theta) \rightsquigarrow_\theta L_\theta \text{ when } \theta \ne 0.$$

Quadratic risk function of Hodges' estimator
[Figure: quadratic risk curves of Hodges' estimator for three sample sizes.]
The graph is based on the mean of samples of size $n = 5$ (blue curve), $n = 50$ (purple curve), and $n = 500$ (olive curve) from the $\mathcal N(0,1)$-distribution. As $n \to \infty$, the locations and widths of the peaks converge to zero, but their heights tend to infinity. The estimator buys its better asymptotic behavior at $\theta = 0$ at the price of erratic behavior close to zero.
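The risk curves just described are easy to reproduce by simulation. The sketch below is an added illustration (it assumes, as on the slide, that $T_n$ is the sample mean of $\mathcal N(\theta,1)$ data) and computes the normalized quadratic risk $n\,\mathbb E_\theta(S_n - \theta)^2$ on a grid of $\theta$ values:

```python
import numpy as np

rng = np.random.default_rng(1)
reps = 4_000
thetas = np.linspace(-1.0, 1.0, 81)

for n in (5, 50, 500):                                       # blue, purple, olive curves
    risk = np.empty_like(thetas)
    for j, theta in enumerate(thetas):
        t_n = rng.normal(theta, 1.0, (reps, n)).mean(axis=1)  # common estimator: sample mean
        s_n = np.where(np.abs(t_n) >= n ** -0.25, t_n, 0.0)   # Hodges' truncation at n^(-1/4)
        risk[j] = n * np.mean((s_n - theta) ** 2)             # normalized quadratic risk
    # risk at theta = 0 shrinks with n, while the peaks flanking 0 grow without bound
    print(n, round(risk[len(thetas) // 2], 3), round(risk.max(), 2))
```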
Relative efficiency
Let $T_{n_{\vartheta,1}}$ and $T_{n_{\vartheta,2}}$ be estimator sequences for the parameter $\theta$, based on sample sizes $n_{\vartheta,1}$ and $n_{\vartheta,2}$ respectively, such that $\mathrm{MSE}(T_{n_{\vartheta,1}}) = \mathrm{MSE}(T_{n_{\vartheta,2}})$. If
$$\alpha = \lim_{\vartheta \to \infty} \frac{n_{\vartheta,2}}{n_{\vartheta,1}}$$
exists, then it is called the relative efficiency of the estimators. In general it depends on the parameter $\theta$.

Relative efficiency
As the time index $\vartheta \to \infty$, the statement $\sqrt{n_\vartheta}\,(T_{n_\vartheta} - \psi(\theta)) \rightsquigarrow_\theta \mathcal N(0, \sigma^2(\theta))$ as $n_\vartheta \to \infty$ can be written
$$\frac{\sqrt{n_\vartheta}}{\sigma(\theta)}\,\bigl(T_{n_\vartheta} - \psi(\theta)\bigr) \rightsquigarrow_\theta \mathcal N(0, 1).$$
Thus the relative efficiency of two estimator sequences with asymptotic variances $\sigma_i^2(\theta)$ is just
$$\alpha = \lim_{\vartheta \to \infty} \frac{n_{\vartheta,2}}{n_{\vartheta,1}} = \frac{\sigma_2^2(\theta)}{\sigma_1^2(\theta)}.$$
If $\alpha > 1$, the second estimator sequence needs proportionally that many more observations than the first to achieve the same asymptotic precision. (For instance, for the sample median relative to the sample mean in a normal model, $\sigma_2^2/\sigma_1^2 = \pi/2 \approx 1.57$.)

Local asymptotic normality
Suppose $X = (X_1, \dots, X_n)$ is a sample from $\mathcal L(\theta) \in \{P_\theta : \theta \in \Theta\}$ on some measurable space $(\mathcal X, \mathcal A)$. The full observation is a single observation from the product $P_\theta^n$ of $n$ copies of $P_\theta$. The statistical model is completely described as the collection of probability measures $\{P_\theta^n : \theta \in \Theta \subset \mathbb R^k\}$ on the sample space $(\mathcal X^n, \mathcal A^n)$.

Local asymptotic normality
The parametrization is centered around a fixed parameter $\theta_0$; we therefore define the local parameter
$$h = \sqrt n\,(\theta - \theta_0)$$
and rewrite $P_\theta^n$ as $P^n_{\theta_0 + h/\sqrt n}$. Thus we obtain an experiment with parameter $h$.

Definition
A sequence of parametric statistical models $\{P_{\theta,n} : \theta \in \Theta\}$, $\Theta \subset \mathbb R^k$, is said to be locally asymptotically normal (LAN) at $\theta$ if
$$\log \prod_{i=1}^n \frac{p_{\theta + h/\sqrt n}}{p_\theta}(X_i) = h\,\Delta_{n,\theta} - \tfrac12\,I_\theta\,h^2 + o_{P_\theta}(1),$$
where $\Delta_{n,\theta} = \frac{1}{\sqrt n} \sum_{i=1}^n \dot\ell_\theta(X_i)$ is asymptotically normal with mean zero and variance $I_\theta$, by the central limit theorem. The second term in the expansion is asymptotically equivalent to $-\tfrac12 h^2 I_\theta$ by the law of large numbers.
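For the $\mathcal N(\theta, 1)$ location model the LAN expansion can be checked directly, since there $I_\theta = 1$, the score is $\dot\ell_\theta(x) = x - \theta$, and the remainder term vanishes identically. The following sketch (an added illustration, not from the slides) compares the exact log-likelihood ratio with $h\,\Delta_{n,\theta} - \tfrac12 h^2 I_\theta$:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, h, n = 0.5, 1.7, 400

x = rng.normal(theta, 1.0, n)                 # sample from P_theta = N(theta, 1)
theta_n = theta + h / np.sqrt(n)              # local alternative theta + h / sqrt(n)

# exact log likelihood ratio  log prod_i p_{theta_n}(X_i) / p_theta(X_i)
log_lr = np.sum(-0.5 * (x - theta_n) ** 2 + 0.5 * (x - theta) ** 2)

# LAN expansion with score x - theta and Fisher information I_theta = 1
delta_n = np.sum(x - theta) / np.sqrt(n)
expansion = h * delta_n - 0.5 * h ** 2

print(log_lr, expansion)   # equal: for this model the o_P(1) remainder is exactly zero
```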
Note
For large $n$, the experiments $(P^n_{\theta_0 + h/\sqrt n} : h \in \mathbb R^k)$ and $(\mathcal N(h, I_{\theta_0}^{-1}) : h \in \mathbb R^k)$ are similar in statistical properties, whenever the original experiments $\theta \mapsto P_\theta$ are smooth in the parameter.

Assumption
Let
$$\int \Bigl(\sqrt{p_{\theta+h}} - \sqrt{p_\theta} - \tfrac12\,h^T \dot\ell_\theta \sqrt{p_\theta}\Bigr)^2 d\mu = o\bigl(\|h\|^2\bigr), \quad h \to 0. \qquad (A_1)$$
If this condition is satisfied, then the model $(P_\theta : \theta \in \Theta)$ is differentiable in quadratic mean at $\theta$. Usually $\tfrac12\,h^T \dot\ell_\theta(x) \sqrt{p_\theta(x)}$ is the derivative of the map $h \mapsto \sqrt{p_{\theta+h}(x)}$ at $h = 0$ for (almost) every $x$. In this case the score function is
$$\dot\ell_\theta(x) = 2\,\frac{\frac{\partial}{\partial\theta}\sqrt{p_\theta(x)}}{\sqrt{p_\theta(x)}} = \frac{\partial}{\partial\theta} \log p_\theta(x).$$

Theorem 1
Assume that the experiment $(P_\theta : \theta \in \Theta)$ is differentiable in quadratic mean at the point $\theta$, and let $T_n$ be statistics in the experiments $(P^n_{\theta + h/\sqrt n} : h \in \mathbb R^k)$ such that the sequence $T_n$ converges in distribution under every $h$. Then there exists a randomized statistic $T$ in the experiment $(\mathcal N(h, I_\theta^{-1}) : h \in \mathbb R^k)$ such that $T_n \rightsquigarrow_h T$ for every $h$.

Lower bound for experiments
Consider parameters $\theta + h/\sqrt n$ for $\theta$ fixed and $h$ ranging over $\mathbb R^k$, and suppose that, for certain limit distributions $L_{\theta,h}$,
$$\sqrt n\,\Bigl(T_n - \psi\bigl(\theta + \tfrac{h}{\sqrt n}\bigr)\Bigr) \rightsquigarrow_{\theta + h/\sqrt n} L_{\theta,h} \quad \text{for every } h. \qquad (*)$$
Then $T_n$ can be considered a good estimator of $\psi(\theta)$ if the limit distributions $L_{\theta,h}$ are maximally concentrated near zero. If they are maximally concentrated for every $h$ and some fixed $\theta$, then $T_n$ can be considered locally optimal at $\theta$.

Lower bound for experiments
Suppose that the observations are a sample of size $n$ from a distribution $P_\theta$. If $P_\theta$ depends smoothly on the parameter, then
$$\bigl(P^n_{\theta + h/\sqrt n} : h \in \mathbb R^k\bigr) \rightsquigarrow \bigl(\mathcal N(h, I_\theta^{-1}) : h \in \mathbb R^k\bigr).$$

Theorem 2
Assume that $(A_1)$ holds at the point $\theta$, let $\psi$ be differentiable at $\theta$, and let $T_n$ be estimators in the experiments $(P^n_{\theta + h/\sqrt n} : h \in \mathbb R^k)$ such that $(*)$ holds for every $h$. Then there exists a randomized statistic $T$ in the experiment $(\mathcal N(h, I_\theta^{-1}) : h \in \mathbb R^k)$ such that $T - \dot\psi_\theta h$ has distribution $L_{\theta,h}$ for every $h$.

Idea of the proof
Apply Theorem 1 to $S_n = \sqrt n\,(T_n - \psi(\theta))$. In view of the definition of $L_{\theta,h}$ and the differentiability of $\psi$, the sequence
$$S_n = \sqrt n\,\Bigl(T_n - \psi\bigl(\theta + \tfrac{h}{\sqrt n}\bigr)\Bigr) + \sqrt n\,\Bigl(\psi\bigl(\theta + \tfrac{h}{\sqrt n}\bigr) - \psi(\theta)\Bigr)$$
converges in distribution under $h$ to $L_{\theta,h} * \delta_{\dot\psi_\theta h}$, where $*\,\delta_h$ denotes a translation by $h$. According to Theorem 1, there exists a randomized statistic $T$ in the normal limit experiment such that $T$ has distribution $L_{\theta,h} * \delta_{\dot\psi_\theta h}$ for every $h$. This $T$ satisfies the requirements. ∎

Gaussian models
Consider a single observation $X$ from an $\mathcal N(h, \Sigma)$ distribution, where $\Sigma$ is a nonsingular covariance matrix; it is required to estimate $Ah$ for a given matrix $A$. A randomized estimator $T$ is called equivariant-in-law for estimating $Ah$ if the distribution of $T - Ah$ under $h$ does not depend on $h$.

Proposition 3
The null distribution $L$ of any randomized equivariant-in-law estimator of $Ah$ can be decomposed as $L = \mathcal N(0, A\Sigma A^T) * M$ for some probability measure $M$. The only randomized equivariant-in-law estimator for which $M$ is degenerate at $0$ is $AX$.

Definition
Let $\ell$ be any loss function with values in $[0, \infty)$. It is called bowl-shaped if the sets $\{x : \ell(x) \le c\}$ are convex and symmetric about the origin; it is called subconvex if, moreover, these sets are closed.

Lemma (Anderson's lemma)
For any bowl-shaped loss function $\ell$ on $\mathbb R^k$, every probability measure $M$ on $\mathbb R^k$, and every covariance matrix $\Sigma$,
$$\int \ell\,d\mathcal N(0, \Sigma) \ \le\ \int \ell\,d\bigl[\mathcal N(0, \Sigma) * M\bigr].$$
Next consider the minimax criterion. According to this criterion the best estimator, relative to a given loss function, minimizes the maximum risk $\sup_h \mathbb E_h\,\ell(T - Ah)$ over all randomized estimators $T$. For every bowl-shaped loss function $\ell$ this leads again to the estimator $AX$.
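Anderson's lemma can be seen in a toy Monte Carlo experiment (an added sketch; the choice of $M$ and of the loss below is arbitrary): convolving $\mathcal N(0, \Sigma)$ with any noise distribution $M$ can only increase the risk under a bowl-shaped loss.

```python
import numpy as np

rng = np.random.default_rng(3)
reps = 1_000_000

z = rng.normal(0.0, 1.0, reps)               # Z ~ N(0, Sigma), here Sigma = 1
w = rng.choice([-2.0, 0.0, 2.0], reps)       # W ~ M, an arbitrary probability measure
loss = lambda x: np.minimum(x ** 2, 4.0)     # bowl-shaped loss: x^2 truncated at a = 4

print(loss(z).mean())        # integral of loss w.r.t. N(0, Sigma)
print(loss(z + w).mean())    # integral w.r.t. N(0, Sigma) * M: larger, per Anderson's lemma
```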
Proposition 4
For any bowl-shaped loss function $\ell$, the maximum risk of any randomized estimator $T$ of $Ah$ is bounded below by $\mathbb E_0\,\ell(AX)$. Consequently, $AX$ is a minimax estimator of $Ah$. If $Ah$ is real and $\mathbb E_0\,(AX)^2 \ell(AX) < \infty$, then $AX$ is the only minimax estimator of $Ah$, up to changes on sets of probability zero.

Convolution theorem
An estimator sequence $T_n$ is called regular at $\theta$ for estimating a parameter $\psi(\theta)$ if, for every $h$,
$$\sqrt n\,\Bigl(T_n - \psi\bigl(\theta + \tfrac{h}{\sqrt n}\bigr)\Bigr) \rightsquigarrow_{\theta + h/\sqrt n} L_\theta.$$
The probability measure $L_\theta$ may be arbitrary, but it should be the same for every $h$. In terms of the limit distributions $L_{\theta,h}$ in $(*)$, regularity means exactly that all $L_{\theta,h}$ are equal, for the given $\theta$. According to Theorem 1, every estimator sequence is matched by an estimator $T$ in the limit experiment $(\mathcal N(h, I_\theta^{-1}) : h \in \mathbb R^k)$.

Convolution theorem
A regular estimator sequence is matched by a $T$ with the property
$$T - \dot\psi_\theta h \sim L_\theta \quad \text{for every } h. \qquad (P_1)$$
Thus it is matched by an equivariant-in-law estimator of $\dot\psi_\theta h$.

Theorem 5 (Convolution theorem)
Assume that $(A_1)$ holds at the point $\theta$. Let $\psi$ be differentiable at $\theta$ and let $T_n$ be an at-$\theta$-regular estimator sequence in the experiments $(P_\theta^n : \theta \in \Theta)$ with limit distribution $L_\theta$. Then there exists a probability measure $M_\theta$ such that
$$L_\theta = \mathcal N\bigl(0,\ \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T\bigr) * M_\theta.$$
In particular, if $L_\theta$ has covariance matrix $\Sigma_\theta$, then the matrix $\Sigma_\theta - \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T$ is nonnegative definite.
Idea of proof: apply Theorem 2 to conclude that $L_\theta$ is the distribution of an equivariant-in-law estimator $T$ in the limit experiment satisfying $(P_1)$; next apply Proposition 3. ∎

Almost-everywhere convolution theorem
Hodges' example shows that there is no hope for a nontrivial lower bound on the limit distribution of a standardized estimator sequence $\sqrt n\,(T_n - \psi(\theta))$ for every $\theta$: it is always possible to improve on an estimator sequence at selected parameters.

Theorem 6
Assume that $(A_1)$ holds at every $\theta$. Let $\psi$ be differentiable at every $\theta$ and let $T_n$ be an estimator sequence in the experiments $(P_\theta^n : \theta \in \Theta)$ such that $\sqrt n\,(T_n - \psi(\theta))$ converges to a limit distribution $L_\theta$ under every $\theta$. Then there exist probability distributions $M_\theta$ such that, for Lebesgue-almost every $\theta$,
$$L_\theta = \mathcal N\bigl(0,\ \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T\bigr) * M_\theta.$$
In particular, if $L_\theta$ has covariance matrix $\Sigma_\theta$, then the matrix $\Sigma_\theta - \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T$ is nonnegative definite for Lebesgue-almost every $\theta$.

Lemma 7
Let $T_n$ be estimators in experiments $(P_{n,\theta} : \theta \in \Theta)$ indexed by a measurable subset $\Theta$ of $\mathbb R^k$. Assume that the map $\theta \mapsto P_{n,\theta}(A)$ is measurable for every measurable set $A$ and every $n$, and that the map $\theta \mapsto \psi(\theta)$ is measurable. Suppose that there exist distributions $L_\theta$ such that, for Lebesgue-almost every $\theta$,
$$r_n\,\bigl(T_n - \psi(\theta)\bigr) \rightsquigarrow_\theta L_\theta.$$
Then for every $\gamma_n \to 0$ there exists a subsequence of $n$ such that, for Lebesgue-almost every $(\theta, h)$, along the subsequence,
$$r_n\,\bigl(T_n - \psi(\theta + \gamma_n h)\bigr) \rightsquigarrow_{\theta + \gamma_n h} L_\theta.$$

Local asymptotic minimax theorem
The normal $\mathcal N(0, \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T)$ distribution is also the best possible limit when judged by the minimax criterion, which gives a lower bound for the maximum risk over a small neighborhood of $\theta$. In fact, it bounds the expression
$$\lim_{\delta \downarrow 0}\ \liminf_{n \to \infty}\ \sup_{\|\theta' - \theta\| < \delta} \mathbb E_{\theta'}\,\ell\Bigl(\sqrt n\,\bigl(T_n - \psi(\theta')\bigr)\Bigr),$$
the asymptotic maximum risk over an arbitrarily small neighborhood of $\theta$.
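As a worked instance (added here for illustration; it is not on the slides), take the $\mathcal N(\theta, 1)$ location model with $\psi(\theta) = \theta$, so that $\dot\psi_\theta = 1$ and $I_\theta = 1$. The sample mean is regular and attains both the convolution bound of Theorem 5 (with $M_\theta = \delta_0$) and the local minimax bound:

```latex
% Under \theta + h/\sqrt n the observations are N(\theta + h/\sqrt n, 1), hence for every n and h
\sqrt{n}\Bigl(\bar X_n - \bigl(\theta + \tfrac{h}{\sqrt n}\bigr)\Bigr)
  \sim \mathcal N(0, 1) \quad \text{exactly, so } L_{\theta,h} = \mathcal N(0,1) \text{ for all } h.
% The sequence is therefore regular, and
L_\theta = \mathcal N(0, 1)
         = \mathcal N\bigl(0,\ \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T\bigr) * \delta_0,
% i.e. the convolution factor M_\theta is degenerate at 0 and the lower bound is attained.
```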
Theorem 8
Assume that $(A_1)$ holds at $\theta$ with nonsingular Fisher information matrix $I_\theta$. Let $\psi$ be differentiable at $\theta$, and let $T_n$ be any estimator sequence in the experiments $(P_\theta^n : \theta \in \mathbb R^k)$. Then for any bowl-shaped loss function $\ell$,
$$\sup_I\ \liminf_{n \to \infty}\ \sup_{h \in I}\ \mathbb E_{\theta + h/\sqrt n}\,\ell\Bigl(\sqrt n\,\Bigl(T_n - \psi\bigl(\theta + \tfrac{h}{\sqrt n}\bigr)\Bigr)\Bigr) \ \ge\ \int \ell\,d\mathcal N\bigl(0,\ \dot\psi_\theta I_\theta^{-1} \dot\psi_\theta^T\bigr).$$
Here the first supremum is taken over all finite subsets $I$ of $\mathbb R^k$.

Likelihood ratio process (LRP)
For a fixed parameter $h_0 \in H$, the likelihood ratio process (LRP) with base $h_0$ is formed as
$$\Bigl(\frac{dP_h}{dP_{h_0}}(X)\Bigr)_{h \in H}.$$
Each LRP is a (typically infinite-dimensional) vector of random variables $\frac{dP_h}{dP_{h_0}}(X)$.

Definition
A sequence $\mathcal E_n = (\mathcal X_n, \mathcal A_n, P_{n,h} : h \in H)$ of experiments converges to a limit experiment $\mathcal E = (\mathcal X, \mathcal A, P_h : h \in H)$ if, for every finite subset $I \subset H$ and every $h_0 \in H$,
$$\Bigl(\frac{dP_{n,h}}{dP_{n,h_0}}\Bigr)_{h \in I} \rightsquigarrow_{h_0} \Bigl(\frac{dP_h}{dP_{h_0}}\Bigr)_{h \in I}.$$

Example (equivalence by sufficiency)
Let $S : \mathcal X \to \Upsilon$ be a statistic in a statistical experiment $(\mathcal X, \mathcal A, P_h : h \in H)$ with values in the measurable space $(\Upsilon, \mathcal B)$. The experiment of image laws $(\Upsilon, \mathcal B, P_h \circ S^{-1} : h \in H)$ corresponds to observing $S$. If $S$ is a sufficient statistic, then this experiment is equivalent to the original experiment $(\mathcal X, \mathcal A, P_h : h \in H)$.

Asymptotic representation theorem
A randomized statistic $T$ in an experiment $(\mathcal X, \mathcal A, P_h : h \in H)$ is a measurable map $T : \mathcal X \times [0,1] \to \mathbb R^k$ on the sample space augmented with an independent uniform variable. A randomized test, or test function, takes its values in $[0,1]$; the interpretation is that if $x$ is observed, then the null hypothesis is rejected with probability $T(x)$. The power function of the test $T$ is the function $h \mapsto \mathbb E_h[T(X)]$.

Theorem 9
Let $\mathcal E_n = (\mathcal X_n, \mathcal A_n, P_{n,h} : h \in H)$ be a sequence of experiments that converges to a dominated experiment $\mathcal E = (\mathcal X, \mathcal A, P_h : h \in H)$, and let $T_n$ be a sequence of statistics in $\mathcal E_n$ that converges in distribution under every $h$. Then there exists a randomized statistic $T$ in $\mathcal E$ such that $T_n \rightsquigarrow_h T$ for every $h$.

Asymptotic normality
A sequence of statistical models $(P_{n,\theta} : \theta \in \Theta)$, indexed by an open subset $\Theta \subset \mathbb R^k$, is defined to be locally asymptotically normal at $\theta$ if the log-likelihood ratios
$$\log \frac{dP_{n,\theta + r_n^{-1} h_n}}{dP_{n,\theta}}$$
allow a certain quadratic expansion. This is shown to be valid in the case that $P_{n,\theta}$ is the distribution of a sample of size $n$ from a smooth parametric model. Such experiments converge to simple normal limit experiments when reparametrized in terms of the local parameter $h$.

Theorem 10
Let $\mathcal E_n = (P_{n,h} : h \in H)$ be a sequence of experiments indexed by a subset $H$ of $\mathbb R^d$ such that
$$\log \frac{dP_{n,h}}{dP_{n,0}} = h^T \Delta_n - \tfrac12\,h^T J h + o_{P_{n,0}}(1)$$
for a sequence of statistics $\Delta_n$ that converges weakly under $h = 0$ to an $\mathcal N(0, J)$-distribution. Then the sequence $\mathcal E_n$ converges to the experiment $(\mathcal N(Jh, J) : h \in H)$.
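To see why $(\mathcal N(Jh, J) : h \in H)$ is the correct limit, note (a short verification added here) that its likelihood ratio process has exactly the quadratic form of the expansion in Theorem 10, with $\Delta = X$ distributed as $\mathcal N(0, J)$ under $h = 0$:

```latex
\log \frac{dN(Jh, J)}{dN(0, J)}(x)
  = -\tfrac12\,(x - Jh)^T J^{-1} (x - Jh) + \tfrac12\,x^T J^{-1} x
  = h^T x - \tfrac12\,h^T J h .
```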
Corollary 11
Then the sequence of experiments $(P_{n,\theta + r_n^{-1} h} : h \in \mathbb R^d)$ converges to the experiment $(\mathcal N(h, I_\theta^{-1}) : h \in \mathbb R^d)$.

Uniform distribution
The family of uniform distributions on $[0, \theta]$ is nowhere differentiable in quadratic mean. The reason is that the support of the uniform distribution depends too much on the parameter; differentiability in quadratic mean $(A_1)$ does not require that all densities $p_\theta$ have the same support.

Theorem 12
Let $P_\theta^n$ be the distribution of a random sample of size $n$ from the uniform distribution on $[0, \theta]$. Then the sequence of experiments $(P^n_{\theta - h/n} : h \in \mathbb R)$ converges, for each fixed $\theta > 0$, to the experiment consisting of one observation from the shifted exponential density
$$z \mapsto \frac{1}{\theta}\,e^{-(z - h)/\theta}\,\mathbb 1\{z > h\}$$
(define $P_\theta$ arbitrarily for $\theta < 0$). Proof on the board.

Corollary 13
Let $T_n$ be estimators based on a sample $X_1, \dots, X_n$ from the uniform distribution on $[0, \theta]$ such that the sequence $n(T_n - \theta)$ converges under $\theta$ in distribution to a limit $L_\theta$, for every $\theta$. Then for Lebesgue-almost every $\theta$,
$$\int |x|\,dL_\theta(x) \ge \mathbb E\,|Z - \mathrm{med}(Z)| \qquad\text{and}\qquad \int x^2\,dL_\theta(x) \ge \mathbb E\,[Z - \mathbb E Z]^2,$$
for a random variable $Z$ exponentially distributed with mean $\theta$.

Pareto distribution
The Pareto distributions are a two-parameter family of distributions on the real line, with parameters $\alpha > 0$ and $\mu > 0$ and density
$$x \mapsto \frac{\alpha \mu^\alpha}{x^{\alpha+1}}\,\mathbb 1\{x > \mu\}.$$
The likelihood ratio for a sample of size $n$ from the Pareto distributions with parameters $(\alpha + g/\sqrt n,\ \mu + h/n)$ and $(\alpha + g_0/\sqrt n,\ \mu + h_0/n)$, respectively, is equal to
$$\frac{\bigl(\alpha + \tfrac{g}{\sqrt n}\bigr)^n \bigl(\mu + \tfrac{h}{n}\bigr)^{n\alpha + \sqrt n\,g}}{\bigl(\alpha + \tfrac{g_0}{\sqrt n}\bigr)^n \bigl(\mu + \tfrac{h_0}{n}\bigr)^{n\alpha + \sqrt n\,g_0}}\,\Bigl(\prod_{i=1}^n X_i\Bigr)^{(g_0 - g)/\sqrt n}\,\mathbb 1\bigl\{X_{(1)} > \mu + \tfrac{h}{n}\bigr\}$$

Pareto distribution
$$= \exp\Bigl((g - g_0)\,\Delta_n - \tfrac12\,\frac{g^2 - g_0^2}{\alpha^2} + o(1)\Bigr)\,\exp\Bigl(\frac{(h - h_0)\,\alpha}{\mu} + o(1)\Bigr)\,\mathbb 1\{Z_n > h\}.$$
Here, under the parameters $(\alpha + g_0/\sqrt n,\ \mu + h_0/n)$, the sequence
$$\Delta_n = -\frac{1}{\sqrt n} \sum_{i=1}^n \Bigl(\log\frac{X_i}{\mu} - \frac{1}{\alpha}\Bigr)$$
converges weakly to $\mathcal N\bigl(\tfrac{g_0}{\alpha^2}, \tfrac{1}{\alpha^2}\bigr)$.

Pareto distribution
The sequence $Z_n = n(X_{(1)} - \mu)$ converges in distribution to $h_0$ plus an exponential distribution with mean $\mu/\alpha$ and variance $(\mu/\alpha)^2$. The two sequences are asymptotically independent. Thus the likelihood is a product of a locally asymptotically normal and a "locally asymptotically exponential" factor. The local limit experiment consists of observing a pair $(\Delta, Z)$ of independent variables $\Delta$ and $Z$, with an $\mathcal N(g, \alpha^2)$ distribution and a shifted exponential $\exp(\alpha/\mu) + h$ distribution, respectively.
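The exponential limit of the rescaled minimum is easy to confirm by simulation. The sketch below is an added illustration with arbitrary parameter values; it uses the exact fact that the minimum of $n$ Pareto($\alpha$, $\mu$) variables is itself Pareto($n\alpha$, $\mu$):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, mu, n, reps = 2.0, 3.0, 10_000, 200_000

# X_(1), the minimum of n Pareto(alpha, mu) draws, is Pareto(n*alpha, mu); sample via inverse CDF
x_min = mu * rng.uniform(size=reps) ** (-1.0 / (n * alpha))
z_n = n * (x_min - mu)                 # rescaled minimum Z_n = n (X_(1) - mu)

print(z_n.mean(), mu / alpha)          # both close to mu / alpha = 1.5
print(z_n.var(), (mu / alpha) ** 2)    # exponential limit: variance = mean^2 = 2.25
```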
Asymptotic mixed normality
A sequence of experiments $(P_{n,\theta} : \theta \in \Theta)$, indexed by an open subset of $\mathbb R^k$, is called locally asymptotically mixed normal at $\theta$ if there exist matrices $\gamma_{n,\theta} \to 0$ such that
$$\log \frac{dP_{n,\theta + \gamma_{n,\theta} h_n}}{dP_{n,\theta}} = h^T \Delta_{n,\theta} - \tfrac12\,h^T J_{n,\theta}\,h + o_{P_{n,\theta}}(1)$$
for every converging sequence $h_n \to h$, with random vectors $\Delta_{n,\theta}$ and random matrices $J_{n,\theta}$ such that $(\Delta_{n,\theta}, J_{n,\theta}) \rightsquigarrow_\theta (\Delta_\theta, J_\theta)$ for a random vector such that the conditional distribution of $\Delta_\theta$ given $J_\theta = J$ is normal $\mathcal N(0, J)$. Locally asymptotically mixed normal is often abbreviated to LAMN. Local asymptotic normality is the special case in which the matrix $J_\theta$ is deterministic.

Theorem 14
Assume that the sequence of experiments $(P_{n,\theta} : \theta \in \Theta)$ is locally asymptotically mixed normal at $\theta$. Then the sequence of experiments $(P_{n,\theta + \gamma_n h} : h \in \mathbb R^k)$ converges to the experiment consisting of observing a pair $(\Delta, J)$ such that $J$ is marginally distributed as $J_\theta$ for every $h$, and the conditional distribution of $\Delta$ given $J$ is normal $\mathcal N(Jh, J)$.

THANK YOU!