Statistics 626
Topic 12: Statistical Properties of Descriptive Statistics
Copyright © 1999 by H.J. Newton

In this section we study the statistical properties (bias, variance, distribution, p-values, confidence intervals) of X̄, R̂, ρ̂, and f̂.

12.1 Review of Basic Statistics

Suppose we have an estimator θ̂, calculated from a data set of length n, of a parameter θ (for example, X̄ₙ, R̂(v), ρ̂(v), θ̂(v), and f̂(ω) are estimators, calculated from a data set, of µ, R(v), ρ(v), θ(v), and f(ω)). Then we can write down some properties we would like the estimator to have.

Properties of Estimators

1. θ̂ is an unbiased estimator of θ if E(θ̂) = θ; that is, the average value of θ̂ over all realizations of length n is θ. θ̂ is asymptotically unbiased if lim_{n→∞} E(θ̂) = θ.

2. θ̂ is efficient if it has smaller variance than any other unbiased estimator of θ; that is, its average squared distance from θ over all realizations of length n is smaller than that of any other unbiased estimator. θ̂ is asymptotically efficient if it is asymptotically unbiased and, for large n, its variance is smaller than that of any other asymptotically unbiased estimator.

3. θ̂ is consistent if it is asymptotically unbiased and its variance converges to zero as n → ∞; that is, from a single very long realization, θ̂ has very high probability of being very close to θ.

Distributions, Confidence Intervals, Tests of Hypotheses

1. A random variable (such as a test statistic) X is said to be N(µ, σ²) if

    P(X ≤ x) = ∫_{−∞}^{x} φ(y) dy,

where φ is the famous normal probability density function (pdf), that is, the bell shaped curve

    φ(y) = (1/(σ√(2π))) e^{−(y−µ)²/(2σ²)}.

2. A random variable X is said to have the χ² distribution with v degrees of freedom (and we write X ∼ χ²_v) if

    P(X ≤ x) = ∫₀^x f_{χ²}(y) dy,

where f_{χ²} is the χ² pdf

    f_{χ²}(x) = x^{(v/2)−1} e^{−x/2} / (Γ(v/2) 2^{v/2}),   x > 0,

and Γ is the gamma function.
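As a concrete illustration of properties 1 and 3, the following sketch (not from the notes; the estimator and all numbers are chosen for illustration) checks by simulation that the divisor-n sample variance is biased, with average value ((n − 1)/n)σ², but asymptotically unbiased:

```python
import random

# Illustration of properties 1 and 3 (a sketch, not from the notes): the
# divisor-n sample variance is a *biased* estimator of sigma^2, since its
# average over realizations of length n is ((n-1)/n)*sigma^2, but it is
# asymptotically unbiased because (n-1)/n -> 1.
random.seed(0)

def var_hat(xs):
    """Divisor-n sample variance: (1/n) * sum of (x_i - xbar)^2."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / n

def mean_over_realizations(n, reps=10000, sigma=1.0):
    """Approximate E(theta_hat) by averaging over many realizations of length n."""
    return sum(var_hat([random.gauss(0.0, sigma) for _ in range(n)])
               for _ in range(reps)) / reps

for n in (5, 50):
    # With sigma = 1, the average should be near (n-1)/n, not near 1.
    print(n, round(mean_over_realizations(n), 3), (n - 1) / n)
```

The same experiment with larger and larger n would show the average drifting toward σ², which is exactly what asymptotic unbiasedness means.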
It is easy to show that if X ∼ χ²_v, then E(X) = v and Var(X) = 2v.

3. A random variable X is said to have the exponential distribution with mean µ (and we write X ∼ Exp(µ)) if

    P(X ≤ x) = ∫₀^x f_Exp(y) dy,

where f_Exp is the exponential pdf

    f_Exp(y) = (1/µ) e^{−y/µ},   y > 0.

It can be shown that if X ∼ χ²₂, then X/2 ∼ Exp(1).

4. We say a random variable Xₙ is asymptotically normal with mean a and variance bₙ (and write Xₙ ∼ AN(a, bₙ)) if for large n the distribution of

    Zₙ = (Xₙ − a)/√bₙ

is very close to N(0, 1). In such a case, a 100(1 − α)% large sample confidence interval for a is given by

    Xₙ ± Z_{α/2} √bₙ,

where Z_γ is the value such that a N(0, 1) random variable has area γ under the Z curve to its right.

5. We reject a null hypothesis H₀ (at significance level α) based on the value t of a test statistic T if

    p-value = P_{H₀}(T is at least as extreme as t) < α,

where P_{H₀} denotes a probability computed assuming the null hypothesis is true, and what is meant by "extreme" depends on the particular H₀ being tested. For example, if we have a random sample of n observations from a N(µ, σ²) population and we have H₀: µ ≤ µ₀, the test statistic is

    Z = (X̄ − µ₀)/(σ/√n),

and extreme values of Z are any values greater than the value z we actually observe from our data set. Thus p-value = P(Z > z), and we reject H₀ if p-value < α.

12.2 Sample Mean, Covariance, Correlation, and Spectral Density

If X(1), …, X(n) is a realization from a covariance stationary time series X having mean µ, autocovariance function R, and spectral density function f, then under mild assumptions:
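The one-sided test in item 5 can be sketched in a few lines of code (a minimal illustration; the data values xbar = 10.4, µ₀ = 10, σ = 1.5, and n = 36 are made up for the example):

```python
import math

# A sketch of the one-sided Z test in item 5. The data values below
# (xbar, mu0, sigma, n) are made up purely for illustration.
def z_test_pvalue(xbar, mu0, sigma, n):
    """p-value = P(Z > z) for H0: mu <= mu0, with z = (xbar - mu0)/(sigma/sqrt(n))."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Upper tail of N(0,1) via the complementary error function:
    # P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p = z_test_pvalue(xbar=10.4, mu0=10.0, sigma=1.5, n=36)  # z = 1.6
print(round(p, 4))  # about 0.0548, so H0 is not rejected at alpha = 0.05
```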
1. For the sample mean X̄ₙ, we have

    X̄ₙ ∼ AN(µ, f(0)/n),

and so

    X̄ₙ ± Z_{α/2} √(f(0)/n)

is a 100(1 − α)% confidence interval for µ. Notice that for white noise this interval becomes the usual one from introductory statistics, since then f(ω) = σ². This also shows that if one uses the introductory statistics confidence interval, one gets the wrong interval (too wide or too narrow) unless the time series is white noise.

This gives rise to the idea of the equivalent number of uncorrelated observations. Suppose we have a sample of size n from a covariance stationary time series having variance R(0) and a sample of size N from a white noise series with the same variance R(0). Then the widths of the two confidence intervals are

    2 Z_{α/2} √(f(0)/n)   and   2 Z_{α/2} √(R(0)/N),

respectively, and for these widths to be equal we must have

    N = n R(0)/f(0).

Thus a non-white-noise series of length n carries information about µ equivalent to that of a white noise series of length n R(0)/f(0).

For the AR(1) X(t) + α X(t−1) = ε(t), for example,

    R(0) = σ²/(1 − α²),   f(0) = σ²/(1 + α)²,

which gives

    N = n (1 + α)/(1 − α),

and so whether there is more information about µ in an AR(1) than in a white noise series having the same variance depends upon the sign of α (positive means more information, since N > n; negative means less, since N < n). Since for this AR(1) ρ(1) = −α, we see that a high negative lag one correlation (α > 0) means the AR series has much information about µ, while a high positive value of ρ(1) means there is very little information about µ in the data set.

2. For the sample autocovariance R̂(v), we have, unless all the X's are zero, that the Toeplitz matrix Γ̂_M = Toepl(R̂(0), R̂(1), …
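The equivalent-numbers calculation for the AR(1) is easy to check numerically. The sketch below (assuming the parameterization X(t) + αX(t−1) = ε(t) used in these notes) computes N = n·R(0)/f(0) and confirms it matches n(1 + α)/(1 − α):

```python
# Numerical check of the equivalent number of uncorrelated observations,
# N = n * R(0)/f(0), for the AR(1) parameterization X(t) + alpha*X(t-1) = e(t)
# used in the notes (a sketch; sigma^2 = 1 is arbitrary since it cancels).
def equivalent_n(n, alpha, sigma2=1.0):
    r0 = sigma2 / (1.0 - alpha ** 2)      # R(0) = sigma^2 / (1 - alpha^2)
    f0 = sigma2 / (1.0 + alpha) ** 2      # f(0) = sigma^2 / (1 + alpha)^2
    return n * r0 / f0                    # algebra gives n * (1 + alpha)/(1 - alpha)

# alpha > 0, i.e. rho(1) = -alpha < 0: more information, N > n.
print(equivalent_n(100, 0.5))    # 300 (up to float rounding)
# alpha < 0, i.e. rho(1) > 0: less information, N < n.
print(equivalent_n(100, -0.5))   # about 33.3
```

Note that σ² cancels in the ratio R(0)/f(0), so only α and n matter, which is why the default sigma2 = 1 is harmless.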
, R̂(M − 1)) is positive definite (that is, for any vector l having M elements which are not all zero, lᵀ Γ̂_M l > 0), which means that it is invertible, so we can solve the prediction normal equations and the Yule-Walker equations.

3. R̂(v) is biased but asymptotically unbiased; that is,

    E(R̂(v)) − R(v) = −(|v|/n) R(v) → 0 as n → ∞.

4. R̂(v) is asymptotically normal,

    R̂(v) ∼ AN(R(v), Vₙ(v)),

where

    Vₙ(v) = (1/n) Σ_{r=−∞}^{∞} [R²(r) + R(r − v) R(r + v)]
          = (2/n) ∫₀¹ cos²(2πvω) f²(ω) dω.

5. ρ̂(v) = R̂(v)/R̂(0) is asymptotically normal,

    ρ̂(v) ∼ AN(ρ(v), Wₙ(v)),

where

    Wₙ(v) = (1/n) Σ_{r=−∞}^{∞} [ρ²(r) + ρ(r − v) ρ(r + v) − 4 ρ(v) ρ(r) ρ(r + v) + 2 ρ²(v) ρ²(r)]
          = (2/(n R²(0))) ∫₀¹ [cos(2πvω) − ρ(v)]² f²(ω) dω.

6. The values of the sample autocorrelation function at different lags are not themselves uncorrelated; that is,

    Cov(ρ̂(v₁), ρ̂(v₂)) ≈ (1/n) Σ_{r=−∞}^{∞} [ρ(r + v₁) ρ(r + v₂) + ρ(r − v₂) ρ(r + v₁)
                          − 2 ρ(v₁) ρ(r) ρ(r + v₂) − 2 ρ(v₂) ρ(r) ρ(r + v₁) + 2 ρ(v₁) ρ(v₂) ρ²(r)]
                       = (2/(n R²(0))) ∫₀¹ [cos(2πv₁ω) − ρ(v₁)] [cos(2πv₂ω) − ρ(v₂)] f²(ω) dω.

7. If X ∼ WN(σ²), then for large n the ρ̂(v)'s are approximately independent N(0, 1/n).

8. For the sample spectral density function f̂ we have

    E(f̂(ω)) = ∫₀¹ Fₙ(ω − τ) f(τ) dτ → f(ω),

where the function

    Fₙ(ω) = (1/n) (sin(πnω)/sin(πω))²

is called the Fejér kernel.

9. For any fixed M frequencies ω₁, …, ω_M in (0, .5), the random variables

    2f̂(ω₁)/f(ω₁), …, 2f̂(ω_M)/f(ω_M)

are asymptotically independent χ²₂. Thus

    Var(f̂(ω)) → f²(ω),   ω ∈ (0, .5),

which means that f̂(ω) is not a consistent estimator of f(ω); in fact, f̂(ω) gets no closer to f(ω) on the average as n gets big.

10. If X ∼ WN(σ²), then the f̂/σ²'s at the natural frequencies are approximately independent exponentially distributed with mean one.
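Item 7 is what the usual ±1.96/√n bands on a sample ACF plot are based on. A quick simulation sketch (the seed, series length, and number of lags are arbitrary choices, not from the notes) illustrates it:

```python
import random

# Sketch of item 7: for white noise the rho_hat(v)'s are approximately
# independent N(0, 1/n), which is what the usual +/- 1.96/sqrt(n) bands on a
# sample ACF plot are based on. Seed, n, and the number of lags are arbitrary.
random.seed(1)

def acf(xs, max_lag):
    """Sample autocorrelations rho_hat(v) = R_hat(v)/R_hat(0) for v = 1..max_lag."""
    n = len(xs)
    xbar = sum(xs) / n
    def r_hat(v):
        return sum((xs[t] - xbar) * (xs[t + v] - xbar) for t in range(n - v)) / n
    r0 = r_hat(0)
    return [r_hat(v) / r0 for v in range(1, max_lag + 1)]

n = 2000
wn = [random.gauss(0.0, 1.0) for _ in range(n)]
band = 1.96 / n ** 0.5
rhos = acf(wn, 20)
inside = sum(abs(r) <= band for r in rhos)
print(inside, "of 20 lags inside +/- 1.96/sqrt(n)")  # roughly 19 of 20 expected
```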
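Items 9 and 10 can also be seen in a small simulation. The sketch below assumes the sample spectral density is computed as f̂(ω) = (1/n)|Σₜ X(t) e^{−2πitω}|²; for white noise with σ² = 1, the values at the natural frequencies j/n should average about f(ω) = 1 but keep variance about 1 no matter how large n is, which is the inconsistency described in item 9:

```python
import cmath
import random

# Simulation sketch of items 9 and 10, assuming the sample spectral density is
# f_hat(omega) = (1/n) * |sum_t X(t) exp(-2*pi*i*t*omega)|^2. For X ~ WN(1),
# the values at the natural frequencies j/n should be roughly independent
# Exp(1): average about f(omega) = 1, but variance about 1 no matter how
# large n is, so f_hat is not consistent.
random.seed(2)

def f_hat(xs, omega):
    n = len(xs)
    s = sum(x * cmath.exp(-2j * cmath.pi * t * omega) for t, x in enumerate(xs))
    return abs(s) ** 2 / n

n = 512
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
vals = [f_hat(xs, j / n) for j in range(1, n // 2)]  # natural frequencies in (0, .5)
mean = sum(vals) / len(vals)
var = sum((v - mean) ** 2 for v in vals) / len(vals)
print(round(mean, 2), round(var, 2))  # both should be near 1; var does not shrink
```

Repeating this with larger n leaves the variance near 1 rather than driving it to 0, in contrast to consistent estimators such as X̄ₙ.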