Statistics 626
12 Statistical Properties of Descriptive Statistics
In this section we study the statistical properties (bias, variance, distribution, p-values, confidence intervals) of X̄, R̂, ρ̂, and f̂.
12.1 Review of Basic Statistics
If we have an estimator θ̂, calculated from a data set of length n, of a parameter θ (for example, X̄n, R̂(v), ρ̂(v), θ̂(v), and f̂(ω) are estimators, calculated from a data set, of µ, R(v), ρ(v), θ(v), and f(ω), respectively), then we can write down some properties we would like the estimator to have.
Properties of Estimators
1. θ̂ is an unbiased estimator of θ if E(θ̂) = θ, that is, the average value of θ̂ over all realizations of length n is θ. θ̂ is asymptotically unbiased if lim_{n→∞} E(θ̂) = θ.
2. θ̂ is efficient if it has smaller variance than any other unbiased estimator of θ, that is, its average squared distance from θ over all realizations of length n is smaller than that of any other unbiased estimator of θ. θ̂ is asymptotically efficient if it is asymptotically unbiased and, for large n, its variance is smaller than that of any other asymptotically unbiased estimator.
3. θ̂ is consistent if it is asymptotically unbiased and its variance converges to zero as n → ∞, that is, from a single very long realization, θ̂ has very high probability of being very close to θ.
Distributions, Confidence Intervals, Tests of Hypotheses
1. A random variable (such as a test statistic) X is said to be N(µ, σ²) if

P(X \le x) = \int_{-\infty}^{x} \phi(y)\,dy,

where φ is the famous normal probability density function (pdf), that is, the bell-shaped curve

\phi(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(y-\mu)^2/2\sigma^2}.
2. A random variable X is said to have the χ² distribution with v degrees of freedom (and we write X ∼ χ²_v) if

P(X \le x) = \int_{0}^{x} f_{\chi^2}(y)\,dy,

where f_{\chi^2} is the χ² pdf

f_{\chi^2}(x) = \frac{1}{\Gamma(v/2)\,2^{v/2}}\, x^{(v/2)-1} e^{-x/2}, \quad x > 0,

where Γ is the gamma function. It is easy to show that if X ∼ χ²_v, then E(X) = v and Var(X) = 2v.
3. A random variable X is said to have the exponential distribution with mean µ (and we write X ∼ Exp(µ)) if

P(X \le x) = \int_{0}^{x} f_{\mathrm{Exp}}(y)\,dy,

where f_{\mathrm{Exp}} is the exponential pdf

f_{\mathrm{Exp}}(y) = \frac{1}{\mu}\, e^{-y/\mu}.

It can be shown that if X ∼ χ²_2, then X/2 ∼ Exp(1).
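The χ²₂/Exp(1) connection is easy to check by simulation. A quick sketch (NumPy; the seed and sample size are arbitrary) halves χ²₂ draws and compares the empirical mean, variance, and CDF against the Exp(1) values:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.chisquare(df=2, size=100_000) / 2.0   # X/2 with X ~ chi-square(2)

# Exp(1) has mean 1, variance 1, and P(X <= t) = 1 - e^{-t}.
print("mean:", x.mean())                      # expect about 1
print("var: ", x.var())                       # expect about 1
for t in (0.5, 1.0, 2.0):
    print(f"P(X/2 <= {t}): {(x <= t).mean():.4f}  vs  1 - e^-t = {1 - np.exp(-t):.4f}")
```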
4. We say a random variable Xn is asymptotically normal with mean a and variance bn (and write Xn ∼ AN(a, bn)) if for large n, the distribution of

Z_n = \frac{X_n - a}{\sqrt{b_n}}

is very close to N(0, 1). In such a case, a 100(1 − α)% large-sample confidence interval for a is given by

X_n \pm z_{\alpha/2} \sqrt{b_n},

where z_γ is the value of a N(0, 1) random variable having area γ under the Z curve to its right.
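As a sketch of how such an interval is computed in practice (Python with SciPy; the helper name and the numerical inputs below are illustrative, not from the notes):

```python
import numpy as np
from scipy.stats import norm

def large_sample_ci(estimate, b_n, alpha=0.05):
    """100(1 - alpha)% interval: estimate +/- z_{alpha/2} * sqrt(b_n)."""
    z = norm.ppf(1 - alpha / 2)   # z_{alpha/2}: area alpha/2 to its right
    half = z * np.sqrt(b_n)
    return estimate - half, estimate + half

# e.g. an estimate of 3.2 with asymptotic variance b_n = 0.04 (made-up numbers)
print(large_sample_ci(3.2, 0.04))  # roughly (2.808, 3.592)
```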
5. We reject a null hypothesis H₀ (at significance level α) based on the value t of a test statistic T if

p-value = P_{H₀}(T is at least as extreme as t) < α,
where P_{H₀} denotes finding the probability assuming the null hypothesis is true, and what is meant by “extreme” depends on the particular H₀ being tested. For example, if we have a random sample of n observations from a N(µ, σ²) population and we have H₀: µ ≤ µ₀, the test statistic is

Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}},

and extreme values of Z are any values greater than the value z we actually observe from our data set. Thus

p-value = P(Z > z),
and we reject H₀ if p-value < α.
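A minimal sketch of this one-sided Z test (Python with SciPy; the function name and the simulated data are illustrative only):

```python
import numpy as np
from scipy.stats import norm

def z_test_pvalue(x, mu0, sigma):
    """One-sided p-value for H0: mu <= mu0, large values being extreme."""
    x = np.asarray(x, dtype=float)
    z = (x.mean() - mu0) / (sigma / np.sqrt(x.size))
    return z, norm.sf(z)               # sf(z) = P(Z > z) for Z ~ N(0, 1)

rng = np.random.default_rng(2)
data = rng.normal(1.3, 2.0, size=50)   # simulated sample whose true mean is 1.3
z, p = z_test_pvalue(data, mu0=1.0, sigma=2.0)
print(f"z = {z:.3f}, p-value = {p:.4f}")  # reject H0 at level alpha if p < alpha
```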
12.2 Sample Mean, Covariance, Correlation, and Spectral Density
If X(1), . . . , X(n) is a realization from a covariance stationary time series X having mean µ, autocovariance function R, and spectral density function f, then under mild assumptions
1. For the sample mean X̄n, we have

\bar{X} \sim AN\left(\mu, \frac{f(0)}{n}\right),

and so

\bar{X} \pm z_{\alpha/2} \sqrt{f(0)/n}
is a 100(1 − α)% confidence interval for µ. Notice that for white noise this interval becomes the usual one from introductory statistics, since then f(ω) = σ². This also shows that if one uses the introductory statistics confidence interval, one gets the wrong interval (too wide or too narrow) unless the time series is white noise. This gives rise to the idea of equivalent numbers of uncorrelated observations. Suppose we have a sample of size n from a covariance stationary time series having variance R(0) and a sample of size N from a WN series with the same variance R(0). Then the widths of the confidence intervals are

2 z_{\alpha/2} \sqrt{f(0)/n} \quad \text{and} \quad 2 z_{\alpha/2} \sqrt{R(0)/N},

respectively, and to have these widths be the same, we need

N = n\, \frac{R(0)}{f(0)}.
Thus a non-white noise series of length n has information equivalent to a white noise series of length nR(0)/f(0). For an AR(1), for example,

R(0) = \frac{\sigma^2}{1 - \alpha^2}, \qquad f(0) = \frac{\sigma^2}{(1 + \alpha)^2},

which gives

N = n\, \frac{1 + \alpha}{1 - \alpha},
and so whether there is more information about µ in an AR(1) than
in a W N series having the same variance depends upon the sign of
Topic 12: Statistical Properties of Descriptive Statistics
c 1999 by H.J. Newton
Copyright %
Slide 5
'
Statistics 626
α (positive means more information since N < n, negative means
less since N > n). Since for an AR(1), ρ(1) = −α, we see that
high negative lag one correlation (α > 0) means the AR series has
much information while a high positive value of ρ(1) means there is
very little information about µ in a data set.
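A sketch of the equivalent-length calculation (plain Python; the function name is my own, and the parametrization X(t) + αX(t−1) = ε(t) is chosen to match the notes' convention ρ(1) = −α):

```python
def equivalent_wn_length(n, alpha):
    """Equivalent white-noise length N = n * R(0)/f(0) for the AR(1)
    X(t) + alpha*X(t-1) = eps(t), where R(0) = sigma^2/(1 - alpha^2)
    and f(0) = sigma^2/(1 + alpha)^2, so N = n*(1 + alpha)/(1 - alpha)."""
    return n * (1 + alpha) / (1 - alpha)

n = 100
for alpha in (-0.5, 0.0, 0.5):
    N = equivalent_wn_length(n, alpha)
    verdict = "more than" if N > n else ("less than" if N < n else "the same as")
    print(f"alpha={alpha:+.1f}  rho(1)={-alpha:+.1f}  N={N:.0f}  "
          f"(information {verdict} a WN series of length {n})")
```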
2. For the sample autocovariance R̂(v) we have, unless all the X's are zero, that

\hat{\Gamma}_M = \mathrm{Toepl}(\hat{R}(0), \hat{R}(1), \ldots, \hat{R}(M-1))

is positive definite (that is, for any vector l having M elements that are not all zero, l^T \hat{\Gamma}_M l > 0), which means that it is invertible, so we can solve the prediction normal equations and the Yule-Walker equations, as in the sketch below.
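With sample autocovariances in hand, the Yule-Walker equations Γ̂_p a = (R̂(1), . . . , R̂(p))ᵀ can be solved directly once the Toeplitz matrix is formed. A sketch (NumPy/SciPy; the function name and the autocovariance values are made up for illustration):

```python
import numpy as np
from scipy.linalg import toeplitz

def yule_walker(r_hat, p):
    """Solve Gamma_p a = (R(1), ..., R(p)) for AR(p) coefficients, where
    Gamma_p = Toepl(R(0), ..., R(p-1)) is positive definite, hence
    invertible, unless all the observations are zero."""
    gamma = toeplitz(r_hat[:p])              # p x p symmetric Toeplitz matrix
    return np.linalg.solve(gamma, r_hat[1:p + 1])

r_hat = np.array([2.0, 1.2, 0.5, 0.1])       # made-up sample autocovariances
print(yule_walker(r_hat, p=2))               # fitted AR(2) coefficients
```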
3. R̂(v) is biased but asymptotically unbiased, that is,

E(\hat{R}(v) - R(v)) = -\frac{|v|}{n}\, R(v) \to 0, \quad n \to \infty.
4. R̂(v) is asymptotically normal,

\hat{R}(v) \sim AN(R(v), V_n(v)),

where

V_n(v) = \frac{1}{n} \sum_{r=-\infty}^{\infty} \left[ R^2(r) + R(r-v)R(r+v) \right]
       = \frac{2}{n} \int_0^1 \cos^2(2\pi v\omega)\, f^2(\omega)\, d\omega.
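A sketch of computing R̂(v) with the 1/n divisor used in these notes, and of checking its downward bias by simulation (NumPy; the MA(1) process, seed, and sizes are my own choices; part of the visible bias comes from centering at X̄ rather than at the true mean):

```python
import numpy as np

def sample_autocov(x, v):
    """R-hat(v) with the 1/n divisor (the biased form used in the notes)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    v = abs(v)
    return np.dot(d[:n - v], d[v:]) / n

# MA(1) process X(t) = eps(t) + 0.5*eps(t-1), for which R(1) = 0.5.
rng = np.random.default_rng(3)
n, theta, reps = 50, 0.5, 20_000
est = np.empty(reps)
for i in range(reps):
    e = rng.normal(0.0, 1.0, n + 1)
    x = e[1:] + theta * e[:-1]
    est[i] = sample_autocov(x, 1)

print("average R-hat(1):", est.mean(), "  true R(1):", theta)  # slightly below
```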
5. ρ̂(v) = R̂(v)/R̂(0) is asymptotically normal,

\hat{\rho}(v) \sim AN(\rho(v), W_n(v)),

where

W_n(v) = \frac{1}{n} \sum_{r=-\infty}^{\infty} \left[ \rho^2(r) + \rho(r-v)\rho(r+v) - 4\rho(v)\rho(r)\rho(r+v) + 2\rho^2(v)\rho^2(r) \right]
       = \frac{2}{nR^2(0)} \int_0^1 \left[ \cos 2\pi v\omega - \rho(v) \right]^2 f^2(\omega)\, d\omega.
6. The values of the sample autocorrelation function at different lags are not themselves uncorrelated, that is,

\mathrm{Cov}(\hat{\rho}(v_1), \hat{\rho}(v_2)) \approx \frac{1}{n} \sum_{r=-\infty}^{\infty} \Big[ \rho(r+v_2)\rho(r+v_1) + \rho(r-v_2)\rho(r+v_1) - 2\rho(v_1)\rho(r)\rho(r+v_2) - 2\rho(v_2)\rho(r)\rho(r+v_1) + 2\rho(v_2)\rho(v_1)\rho^2(r) \Big]
= \frac{2}{nR^2(0)} \int_0^1 \left[ \cos 2\pi v_1\omega - \rho(v_1) \right] \left[ \cos 2\pi v_2\omega - \rho(v_2) \right] f^2(\omega)\, d\omega.

7. If X ∼ WN(σ²), then for large n, the ρ̂(v)'s are independent N(0, 1/n) (see the sketch below).
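Item 7 is the basis for the usual ±1.96/√n reference band drawn on a sample ACF plot. A simulation sketch (NumPy; the seed, series length, and number of lags are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
x = rng.normal(0.0, 1.0, n)       # a WN(1) realization
d = x - x.mean()
r0 = np.dot(d, d) / n             # R-hat(0)

# rho-hat(v) for v = 1..20; for WN these are approximately independent
# N(0, 1/n), so about 95% should fall inside +/- 1.96/sqrt(n).
rho_hat = np.array([np.dot(d[:n - v], d[v:]) / n / r0 for v in range(1, 21)])
band = 1.96 / np.sqrt(n)
print("band: +/-", round(band, 3))
print("fraction of lags inside band:", np.mean(np.abs(rho_hat) <= band))
```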
8. For the sample spectral density function f̂ we have

E(\hat{f}(\omega)) = \int_0^1 F_n(\omega - \tau)\, f(\tau)\, d\tau \to f(\omega),

where the function

F_n(\omega) = \frac{1}{n} \left( \frac{\sin \pi n\omega}{\sin \pi\omega} \right)^2

is called the Fejér kernel.
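A sketch of the Fejér kernel itself (NumPy; the function name and grid are my own). The kernel integrates to one over (0, 1) and concentrates its mass near ω = 0 as n grows, which is why E(f̂(ω)) → f(ω):

```python
import numpy as np

def fejer_kernel(omega, n):
    """F_n(omega) = (1/n) * (sin(pi*n*omega) / sin(pi*omega))^2, using the
    limiting value n where sin(pi*omega) vanishes."""
    omega = np.asarray(omega, dtype=float)
    den = np.sin(np.pi * omega)
    out = np.full_like(omega, float(n))      # limiting value at omega = 0 (mod 1)
    ok = ~np.isclose(den, 0.0)
    out[ok] = (np.sin(np.pi * n * omega[ok]) / den[ok]) ** 2 / n
    return out

grid = np.linspace(0.0, 1.0, 2001, endpoint=False)
for n in (8, 64):
    print(n, fejer_kernel(grid, n).mean())   # Riemann sum of the integral: about 1
```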
9. For any fixed M frequencies ω1, . . . , ωM in (0, .5), the random variables

\frac{2\hat{f}(\omega_1)}{f(\omega_1)}, \ldots, \frac{2\hat{f}(\omega_M)}{f(\omega_M)}

are asymptotically independent χ²₂. Thus

\mathrm{Var}(\hat{f}(\omega)) \to f^2(\omega), \quad \omega \in (0, .5),

which means that f̂(ω) is not a consistent estimator of f(ω); in fact, f̂(ω) gets no closer to f(ω) on average as n gets big.
10. If X ∼ WN(σ²), then the f̂/σ²'s at the natural frequencies are approximately independent, exponentially distributed with mean one.
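A closing simulation check of item 10 (NumPy; Gaussian white noise, with f̂ computed as the periodogram |FFT|²/n at the natural frequencies j/n, which is an assumption on my part about the course's definition of f̂):

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma2 = 512, 1.0
x = rng.normal(0.0, np.sqrt(sigma2), n)

# Periodogram |FFT|^2 / n; bin j corresponds to natural frequency j/n.
I = np.abs(np.fft.fft(x)) ** 2 / n
j = np.arange(1, n // 2)          # frequencies j/n in (0, .5)
ratio = I[j] / sigma2             # should behave like independent Exp(1) draws

print("mean:", ratio.mean())      # Exp(1) has mean 1
print("var: ", ratio.var())       # Exp(1) has variance 1
```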
Copyright © 1999 by H.J. Newton