The Delta Method

D. Patterson, Dept. of Mathematical Sciences, U. of Montana

Theorem (Delta Method). Suppose that $X_n$ is a sequence of random variables such that

$$\frac{X_n - \mu}{\sigma_n} \xrightarrow{d} N(0, 1),$$

where $\mu$ is a constant and $\sigma_n$ is a sequence of constants such that $\sigma_n \to 0$. Let $g$ be a real-valued function differentiable at $\mu$ with $g'(\mu) \neq 0$. Then

$$\frac{g(X_n) - g(\mu)}{g'(\mu)\,\sigma_n} \xrightarrow{d} N(0, 1).$$

This result is most often applied to the sequence $\bar{X}_n$ of sample means of an i.i.d. sequence of random variables with mean $\mu$ and variance $\sigma^2$. In that case, the Central Limit Theorem gives

$$\frac{n^{1/2}(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} N(0, 1).$$

The Delta Method can be applied to any function $g$ of $\bar{X}_n$ satisfying the conditions, noting that $\sigma_n = \sigma/n^{1/2} \to 0$, to give the result

$$\frac{n^{1/2}\,[g(\bar{X}_n) - g(\mu)]}{g'(\mu)\,\sigma} \xrightarrow{d} N(0, 1).$$

In other words, for large $n$,

$$g(\bar{X}_n) \approx N\!\left(g(\mu),\ [g'(\mu)]^2 \sigma^2 / n\right).$$

Example: Suppose $X_1, X_2, \ldots$ is a sequence of i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. What does the delta method tell us about the asymptotic distribution of $\bar{X}_n^2$? Since $g(x) = x^2$ and $g'(x) = 2x$, we have, by the delta method (provided $\mu \neq 0$, so that $g'(\mu) \neq 0$),

$$\frac{n^{1/2}(\bar{X}_n^2 - \mu^2)}{2\mu\sigma} \xrightarrow{d} N(0, 1).$$

Hence, for large $n$, $\bar{X}_n^2$ is approximately normal with mean $\mu^2$ and variance $4\mu^2\sigma^2/n$.

The proof of the delta method result is based on a Taylor series expansion of $g(x)$ around $\mu$:

$$g(x) = g(\mu) + g'(\mu)(x - \mu) + g''(\mu)\frac{(x - \mu)^2}{2!} + \cdots,$$

where we drop the second-order and higher-order terms to give the approximation

$$g(x) \approx g(\mu) + g'(\mu)(x - \mu).$$

By the assumptions of the delta method theorem, $X_n$ will be close to $\mu$ with high probability for large $n$, so the first-order Taylor series approximation for $g(X_n)$ should be good for large $n$. Since $g(X_n)$ can be approximated by a linear function of $X_n$, if $X_n$ is approximately normal with mean $\mu$ and variance $\sigma_n^2$, then $g(X_n)$ will be approximately normal with mean $g(\mu)$ and variance $[g'(\mu)]^2 \sigma_n^2$.
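The squared-sample-mean example above can be checked with a small simulation. The sketch below is only an illustration under assumptions that are not part of the notes: the data are taken to be Uniform(1, 3), so that the mean is 2 and the variance is 1/3, and the sample size, number of replications, seed, and the helper name `simulate_squared_means` are all arbitrary choices. It compares the empirical mean and variance of the squared sample mean with the delta-method values, namely the squared mean and 4 times the squared mean times the variance divided by n.

```python
import random
import statistics


def simulate_squared_means(n, reps, low=1.0, high=3.0, seed=0):
    """Return `reps` simulated values of the squared sample mean.

    Each value is the square of the mean of n draws from Uniform(low, high).
    The uniform distribution is an illustrative assumption, not from the notes.
    """
    rng = random.Random(seed)
    squared_means = []
    for _ in range(reps):
        xbar = sum(rng.uniform(low, high) for _ in range(n)) / n
        squared_means.append(xbar ** 2)
    return squared_means


n, reps = 200, 4000
mu = 2.0                            # mean of Uniform(1, 3)
sigma2 = (3.0 - 1.0) ** 2 / 12.0    # variance of Uniform(1, 3)

values = simulate_squared_means(n, reps)
empirical_mean = statistics.mean(values)
empirical_var = statistics.variance(values)

# Delta-method approximations for g(x) = x^2, using g'(mu) = 2 * mu.
delta_mean = mu ** 2
delta_var = (2 * mu) ** 2 * sigma2 / n

print(f"mean:     simulated {empirical_mean:.4f}, delta method {delta_mean:.4f}")
print(f"variance: simulated {empirical_var:.5f}, delta method {delta_var:.5f}")
```

With a moderately large n the two pairs of numbers should agree closely. Shrinking n, or moving the mean toward 0 where the derivative of g vanishes, is a simple way to see the approximation degrade.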
Note: The variance approximation in the delta method is sometimes used by itself to approximate the mean and variance of a function $g(X)$ of a random variable $X$. It can be used when the distribution of $X$ is unknown but its mean and variance are known. For example, suppose $X$ is a positive random variable with mean $\mu$ and variance $\sigma^2$. Let $Y = g(X) = \log(X)$. Since $g'(\mu) = 1/\mu$, we have $E(Y) \approx \log(\mu)$ and $\mathrm{Var}(Y) \approx [g'(\mu)]^2 \sigma^2 = \sigma^2/\mu^2$. The accuracy of this approximation depends on the assumption that $X$ is "close" to $\mu$ with high probability.

The Delta Method for Functions of Two Random Variables

Two-Variable Taylor Series Expansion: Suppose now we have a sequence of pairs of random variables $(X_n, Y_n)$ whose joint distribution is asymptotically bivariate normal, and consider a univariate random variable $W = g(X, Y)$, where $g$ is a real-valued function differentiable at $(\mu_x, \mu_y)$. A Taylor series expansion of $g(x, y)$ about the values $(\mu_x, \mu_y)$ is given by:

$$g(x, y) = g(\mu_x, \mu_y) + \left.\frac{\partial g(x, y)}{\partial x}\right|_{(\mu_x, \mu_y)} (x - \mu_x) + \left.\frac{\partial g(x, y)}{\partial y}\right|_{(\mu_x, \mu_y)} (y - \mu_y) + \text{2nd and higher order terms}.$$
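To see what dropping the second and higher order terms costs, the first-order part of the two-variable expansion can be evaluated numerically. The sketch below assumes a hypothetical function g(x, y) = x / y and an expansion point of (2, 4). Neither comes from the notes, and the names `linearized_g` and `linearization_error` are illustrative only. It measures the gap between g and its linear approximation at two distances from the expansion point.

```python
def g(x, y):
    """Hypothetical example function (not from the notes): the ratio x / y."""
    return x / y


def linearized_g(x, y, mu_x, mu_y):
    """First-order Taylor approximation of g(x, y) = x / y about (mu_x, mu_y)."""
    dg_dx = 1.0 / mu_y             # partial derivative with respect to x at the point
    dg_dy = -mu_x / mu_y ** 2      # partial derivative with respect to y at the point
    return g(mu_x, mu_y) + dg_dx * (x - mu_x) + dg_dy * (y - mu_y)


mu_x, mu_y = 2.0, 4.0  # assumed expansion point


def linearization_error(h):
    """Absolute error of the linear approximation at (mu_x + h, mu_y - h)."""
    x, y = mu_x + h, mu_y - h
    return abs(g(x, y) - linearized_g(x, y, mu_x, mu_y))


err_large = linearization_error(0.1)
err_small = linearization_error(0.05)

print(f"error at h = 0.10: {err_large:.6f}")
print(f"error at h = 0.05: {err_small:.6f}")
print(f"ratio: {err_large / err_small:.2f}")
```

Halving the distance from the expansion point should cut the error by roughly a factor of four, which is the behaviour expected when the leading neglected term is of second order.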