Math/Stat 370: Engineering Statistics,
Washington State University
Haijun Li
[email protected]
Department of Mathematics
Washington State University
Week 6
Relation Between Populations and Samples
[Figure 6-3: Relationship between a population and a sample. The population has mean µ and standard deviation σ; the sample (x1, x2, x3, …, xn) has sample average x̄ and sample standard deviation s, and is summarized by a histogram.]
Point Estimation
Point Estimator θ̂: A statistic that can be used to estimate
the unknown parameter θ.
Point estimate: A single value of θ̂.
Bias: E(θ̂) − θ.
Unbiased Estimator: E(θ̂) = θ.
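Example:
1. The sample mean
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
is unbiased for mean µ:
\[ E(\bar{X}) = E\left(\frac{\sum_{i=1}^{n} X_i}{n}\right) = \frac{\sum_{i=1}^{n} E(X_i)}{n} = \frac{n\mu}{n} = \mu. \]
2. The sample variance
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} \]
is unbiased for variance σ².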
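The unbiasedness claims can be checked numerically. The sketch below is only an illustration under assumed settings that are not part of the slides (a normal population with µ = 50 and σ = 2, samples of size n = 5, a fixed random seed). It averages the sample mean, the variance with divisor n − 1, and the variance with divisor n over many simulated samples.

```python
import random

def average_estimates(n=5, trials=20000, mu=50.0, sigma=2.0, seed=0):
    """Average three estimators over many simulated samples.

    Returns the averages of the sample mean, S^2 (divisor n - 1),
    and the divisor-n variance. All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    mean_sum = s2_sum = biased_sum = 0.0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(xs) / n
        ss = sum((x - m) ** 2 for x in xs)  # sum of squared deviations
        mean_sum += m
        s2_sum += ss / (n - 1)   # unbiased for sigma^2
        biased_sum += ss / n     # biased: expectation is (n - 1)/n * sigma^2
    return mean_sum / trials, s2_sum / trials, biased_sum / trials

avg_mean, avg_s2, avg_biased = average_estimates()
print(avg_mean, avg_s2, avg_biased)
```

With these settings the first two averages should land close to µ = 50 and σ² = 4, while the divisor-n version should settle near (n − 1)σ²/n = 3.2, which is why S² uses n − 1.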
Minimum Variance Unbiased Estimator
Mean Square Error (MSE): MSE(θ̂) = E(θ̂ − θ)².
MSE(θ̂) = V(θ̂) + (bias)².
For unbiased estimator θ̂: MSE(θ̂) = V(θ̂).
Minimum Variance Unbiased Estimator: The unbiased estimator with the smallest variance.
[Figure 7-1: The sampling distributions of two unbiased estimators Θ̂1 and Θ̂2, both centered at θ.]
Examples
Let X1, ..., Xn be i.i.d. with mean µ and variance σ².
Example 1: X̄ is the minimum variance unbiased estimator for mean µ.
Example 2: Compare the following two estimators
\[ \bar{X} = \frac{X_1 + X_2 + X_3}{3}, \qquad \tilde{X} = \frac{3X_1 - X_2}{2}. \]
Solution: X̄ is unbiased, and
\[ E(\tilde{X}) = \frac{3E(X_1) - E(X_2)}{2} = \frac{2\mu}{2} = \mu, \]
and X̃ is also unbiased. Compare their variances:
\[ V(\bar{X}) = \frac{\sigma^2}{3}, \qquad V(\tilde{X}) = \frac{3^2 V(X_1) + (-1)^2 V(X_2)}{2^2} = \frac{10\sigma^2}{4}. \]
Thus, X̄ is better.
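For a linear estimator Σ cᵢXᵢ of i.i.d. observations, the mean is (Σ cᵢ)µ and the variance is (Σ cᵢ²)σ². The sketch below applies that rule to the two estimators of Example 2; the helper name linear_estimator_moments is hypothetical and used only for illustration.

```python
from fractions import Fraction

def linear_estimator_moments(coeffs):
    """For sum(c_i * X_i) with i.i.d. X_i, return
    (multiple of mu in the mean, multiple of sigma^2 in the variance)."""
    mean_factor = sum(coeffs)
    var_factor = sum(c * c for c in coeffs)
    return mean_factor, var_factor

xbar = [Fraction(1, 3)] * 3                 # (X1 + X2 + X3) / 3
xtilde = [Fraction(3, 2), Fraction(-1, 2)]  # (3*X1 - X2) / 2

print(linear_estimator_moments(xbar))    # (Fraction(1, 1), Fraction(1, 3))
print(linear_estimator_moments(xtilde))  # (Fraction(1, 1), Fraction(5, 2))
```

Both mean factors equal 1, so both estimators are unbiased, and the variance factors 1/3 and 5/2 (that is, 10/4) match the solution above.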
Relative Efficiency
Let θ̂1 and θ̂2 be two estimators of θ (unbiased or not).
If MSE(θ̂1) < MSE(θ̂2), then θ̂1 is more efficient than θ̂2.
Example: Let θ̂1, θ̂2, and θ̂3 be three estimators of θ with
E(θ̂1) = E(θ̂2) = θ, E(θ̂3) ≠ θ.
V (θ̂1 ) = 16, V (θ̂2 ) = 11, MSE(θ̂3 ) = 6.
Compare these estimators.
Solution: Compare their MSEs:
MSE(θ̂1 ) = 16,
MSE(θ̂2 ) = 11,
MSE(θ̂3 ) = 6.
Thus, θ̂3 has the smallest mean square error, even though it is a
biased estimator.
Hypothesis Testing
Statistical hypothesis: A statement about the parameters of
one or more populations.
Null Hypothesis H0 : The hypothesis we wish to test.
Alternative Hypothesis H1 : One sided or two sided.
Statistical Test
1. Form the null and alternative hypotheses.
2. Take a sample from the population.
3. Compute the test statistic from the sample.
4. Based on the value of the test statistic, make a decision about the null hypothesis (reject or not).
Statistical Tests
Type I error: Reject H0 when it is true.
Type II error: Fail to reject H0 when it is false.
Significance level (size) of the test: α = P(type I error).
Type II error probability: β = P(type II error).
Power of the test: 1 − β = P(rejecting H0 when it is false).
Table 9-1 Decisions in Hypothesis Testing
Decision             H0 Is True      H0 Is False
Fail to reject H0    no error        type II error
Reject H0            type I error    no error
A Motivating Example:
The burning rate of solid propellant used to power aircrew
escape systems is a random variable with normal distribution
with SD of 2.5 cm/s. A sample of n = 10 items results in
x̄ = 51.7 cm/s. Construct a size α = 0.05 test for
H0 : µ = 50
H1 : µ ≠ 50.
If x̄ is “close” to 50, we should not reject H0.
If x̄ is “far” from 50, say x̄ < a or x̄ > b for some a < 50 < b, we
should reject H0.
Type I: α = P(x̄ < a or x̄ > b, when H0 is true) = 0.05.
Type II: β = P(a ≤ x̄ ≤ b, when H1 is true).
Goal: Find the boundaries a and b based on α.
Boundaries = Percentage points
Rejection (or Critical) region: z0 > zα/2 or z0 < −zα/2.
zα/2 is the upper 100α/2 percentage point of N(0,1), i.e., P(Z > zα/2) = α/2.
1. If α = 0.05, then z0.05/2 = z0.025 = 1.96.
2. If α = 0.1, then z0.1/2 = z0.05 = 1.65.
3. If α = 0.01, then z0.01/2 = z0.005 = 2.58.
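The percentage points can also be computed rather than read from a table. A minimal sketch using Python's standard-library statistics.NormalDist follows; the helper name z_upper is an illustrative choice, not notation from the slides.

```python
from statistics import NormalDist

def z_upper(p):
    """Upper percentage point z_p of N(0,1), defined by P(Z > z_p) = p."""
    return NormalDist().inv_cdf(1 - p)

for alpha in (0.05, 0.10, 0.01):
    # Two-sided test: put alpha/2 in each tail.
    print(alpha, round(z_upper(alpha / 2), 3))
```

To three decimals the values are 1.960, 1.645, and 2.576; the slide rounds the last two to 1.65 and 2.58.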
[Figure (a): N(0,1) density of Z0 for the two-sided test, with the acceptance region between −zα/2 and zα/2 and a critical region of area α/2 in each tail.]
Inference on Mean (known σ²)
Hypotheses: H0 : µ = µ0 VS H1 : µ ≠ µ0.
Test the hypotheses at significance level α.
Take a sample X1, X2, ..., Xn.
Test statistic
\[ Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]
When H0 is true, Z0 has (approximately) standard normal distribution.
Rejection (or Critical) region: z0 > zα/2 or z0 < −zα/2.
[Figure: N(0,1) density of Z0 with critical regions of area α/2 beyond ±zα/2 and the acceptance region between them.]
Example:
The burning rate of solid propellant used to power aircrew
escape systems is a random variable with normal distribution
with SD of 2 cm/s. A sample of n = 25 items results in x̄ = 51.3
cm/s. Construct a size α = 0.01 test for
H0 : µ = 50
H1 : µ ≠ 50.
Solution: Since α/2 = 0.005, z0.005 = 2.58. Since
\[ z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{1.3}{2/5} = 3.25 > 2.58, \]
we reject H0 at level α = 0.01.
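The same calculation can be sketched in code. The function below is a hypothetical helper (its name and argument names are not from the slides) that computes the test statistic and the two-sided critical value for a known σ.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(xbar, mu0, sigma, n, alpha):
    """Two-sided z-test for a mean with known sigma.

    Returns (z0, critical value z_{alpha/2}, reject H0?).
    """
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z0, z_crit, abs(z0) > z_crit

z0, z_crit, reject = two_sided_z_test(xbar=51.3, mu0=50, sigma=2, n=25, alpha=0.01)
print(round(z0, 2), round(z_crit, 3), reject)
```

For the propellant data this gives z0 = 3.25 against a critical value of about 2.576, so H0 is rejected, in agreement with the solution.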
One-Sided Inference Based on Data
X1, X2, ..., Xn
Test statistic
\[ Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]
Z0 has (approximately) standard normal distribution.
zα is the 100α percentage point.
1. Hypotheses: H0 : µ = µ0 VS H1 : µ > µ0 (upper).
Rejection region: z0 > zα.
2. Hypotheses: H0 : µ = µ0 VS H1 : µ < µ0 (lower).
Rejection region: z0 < −zα.
[Figure: N(0,1) densities of Z0 for (b) the upper-tailed test, with a critical region of area α to the right of zα, and (c) the lower-tailed test, with a critical region of area α to the left of −zα.]
p-Value
p-value p: The smallest level of significance that leads to
rejection of H0 based on the data.
If α ≥ p, we reject H0 at level α.
For the normal distribution,
\[ p = \begin{cases} 2(1 - \Phi(|z_0|)) & \text{for two-tailed test} \\ 1 - \Phi(z_0) & \text{for upper-tailed test} \\ \Phi(z_0) & \text{for lower-tailed test} \end{cases} \]
Example: For the burning rate problem, z0 = 3.25, and
p = 2(1 − Φ(3.25)) = 0.0012. Since α = 0.01 > p, we reject
H0 at level 0.01.
But if α = 0.001, then we fail to reject H0 at level 0.001
because α = 0.001 < 0.0012 = p.
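The three p-value formulas translate directly into code. This is a minimal sketch; the function name p_value and the tail labels "two", "upper", and "lower" are illustrative choices rather than anything defined in the slides.

```python
from statistics import NormalDist

def p_value(z0, tail="two"):
    """p-value of a z-test from the observed statistic z0."""
    phi = NormalDist().cdf  # standard normal CDF, written Phi in the slides
    if tail == "two":
        return 2 * (1 - phi(abs(z0)))
    if tail == "upper":
        return 1 - phi(z0)
    if tail == "lower":
        return phi(z0)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

p = p_value(3.25)
print(round(p, 4))
```

For z0 = 3.25 the two-tailed p-value is about 0.0012, which lies below α = 0.01 but above α = 0.001, matching the two conclusions in the example.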
Type I error
Consider the two-sided hypotheses:
H0 : µ0 = 50 VS H1 : µ1 = 52.
n = 10, σ = 2.5 and so σ/√n = 0.79.
Suppose that we reject H0 if X̄ ≤ 48.5 or X̄ ≥ 51.5 (rejection region).
\[ \alpha = P(\bar{X} \le 48.5 \mid \mu = 50) + P(\bar{X} \ge 51.5 \mid \mu = 50) = P\left(Z \le \frac{48.5 - 50}{0.79}\right) + P\left(Z \ge \frac{51.5 - 50}{0.79}\right) = P(Z \le -1.90) + P(Z \ge 1.90) = 0.028717 + 0.028717 = 0.057434. \]
That is, about 5.7% of all random samples would lead to rejection of H0 : µ = 50 cm/s when the true mean burning rate is really 50 cm/s.
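As a cross-check, α can be computed from the sampling distribution of X̄ without rounding the standardized value to −1.90. The sketch below assumes, as the slide does, that X̄ is normal with mean 50 and standard deviation σ/√n; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, mu0 = 2.5, 10, 50
# Sampling distribution of the sample mean when H0 is true.
xbar_dist = NormalDist(mu=mu0, sigma=sigma / sqrt(n))

# Type I error: probability of landing in the rejection region under H0.
alpha = xbar_dist.cdf(48.5) + (1 - xbar_dist.cdf(51.5))
print(round(alpha, 4))
```

The unrounded result is about 0.0578; the small difference from 0.057434 comes only from rounding z to 1.90 in the hand calculation.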
[Figure: sampling distribution of X̄ centered at µ = 50, with critical regions of area α/2 = 0.0287 below 48.5 and above 51.5.]
Type II error
Consider the two-sided hypotheses:
H0 : µ0 = 50 VS H1 : µ1 = 52.
δ = 52 − 50 = 2.
\[ \beta = P(48.5 \le \bar{X} \le 51.5 \text{ when } \mu = 52) = P\left(\frac{48.5 - 52}{0.79} \le Z \le \frac{51.5 - 52}{0.79}\right) = P(Z \le -0.63) - P(Z \le -4.43) = 0.2643. \]
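The type II error probability can be checked the same way, by evaluating the acceptance region under the alternative mean. This sketch reuses the setup n = 10, σ = 2.5 from the type I error calculation and assumes X̄ is normal with mean 52; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, mu1 = 2.5, 10, 52
# Sampling distribution of the sample mean when the true mean is 52.
xbar_dist = NormalDist(mu=mu1, sigma=sigma / sqrt(n))

# Type II error: probability of landing in the acceptance region [48.5, 51.5].
beta = xbar_dist.cdf(51.5) - xbar_dist.cdf(48.5)
power = 1 - beta
print(round(beta, 4), round(power, 4))
```

Without intermediate rounding β comes out near 0.2635, close to the tabulated 0.2643, so the power against µ = 52 is roughly 0.74.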
[Figure 9-3: The probability of type II error. Probability densities of x̄ under H0: µ = 50 and under H1: µ = 52.]
[Figure 9-4: The probability of type II error. Probability densities of x̄ under H0: µ = 50 and under H1: µ = 50.5.]
Type II error and Sample Size
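Consider the two-sided hypotheses:
H0 : µ = µ0 VS H1 : µ ≠ µ0.
δ = µ − µ0.
For two-sided alternative,
\[ \beta = \Phi\left(z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right). \]
The sample size needed to achieve a given β is approximately
\[ n = \frac{(z_{\alpha/2} + z_\beta)^2 \sigma^2}{\delta^2}. \]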
Examples
Example: The burning rate of solid propellant used to power
aircrew escape systems is a random variable with normal
distribution with SD of 2 cm/s. Let α = 0.05. Consider
H0 : µ = 50, VS. H1 : µ ≠ 50.
1. Suppose that sample size n = 25 and µ = 52. Find β.
Solution: Since z0.025 = 1.96 and δ√n/σ = 2 × 5/2 = 5,
β = Φ(1.96 − 5) − Φ(−1.96 − 5) ≈ Φ(−3.04) = 0.0012.
2. Assume α = 0.05 and β = 0.10, find sample size n required
to detect the difference δ = 52 − 50 = 2.
Solution: Since z0.025 = 1.96, z0.1 = 1.28, n = (1.96 + 1.28)² × 2²/2² = 10.50, so we take n = 11.
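Both parts of this example follow from the two-sided β formula and the approximate sample-size formula. A minimal sketch is given below; the function names beta_two_sided and sample_size are hypothetical, and the sample size is rounded up because n must be an integer.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal

def beta_two_sided(delta, sigma, n, alpha):
    """Type II error probability of the two-sided z-test at shift delta."""
    z = Z.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma
    return Z.cdf(z - shift) - Z.cdf(-z - shift)

def sample_size(delta, sigma, alpha, beta):
    """Approximate n so that the two-sided test has type II error beta."""
    z_a = Z.inv_cdf(1 - alpha / 2)
    z_b = Z.inv_cdf(1 - beta)
    return (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2

beta = beta_two_sided(delta=2, sigma=2, n=25, alpha=0.05)
n_raw = sample_size(delta=2, sigma=2, alpha=0.05, beta=0.10)
n_required = ceil(n_raw)
print(round(beta, 4), round(n_raw, 2), n_required)
```

With unrounded percentage points the raw sample size is about 10.51 rather than 10.50, and in either case it rounds up to n = 11.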
Confidence Interval
100(1 − α)% Confidence Interval (CI): A random interval
(L, U) such that
P(L ≤ µ ≤ U) = 1 − α.
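Observe that
\[ P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha \]
where zα/2 is the 100α/2 percentage point. Rewrite this as
\[ P\left(\bar{X} - \frac{z_{\alpha/2}\sigma}{\sqrt{n}} \le \mu \le \bar{X} + \frac{z_{\alpha/2}\sigma}{\sqrt{n}}\right) = 1 - \alpha. \]
Thus, a 100(1 − α)% CI for µ when σ² is known:
\[ \bar{x} - \frac{z_{\alpha/2}\sigma}{\sqrt{n}} \le \mu \le \bar{x} + \frac{z_{\alpha/2}\sigma}{\sqrt{n}}. \]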
Confidence Bounds
A 100(1 − α)% upper confidence bound for µ when σ² is known:
\[ \mu \le \bar{x} + \frac{z_{\alpha}\sigma}{\sqrt{n}}. \]
A 100(1 − α)% lower confidence bound for µ when σ² is known:
\[ \bar{x} - \frac{z_{\alpha}\sigma}{\sqrt{n}} \le \mu. \]
[Figure 8-2: Error in estimating µ with x̄. E = error = |x̄ − µ|, with interval endpoints l = x̄ − zα/2 σ/√n and u = x̄ + zα/2 σ/√n.]
Example
Consider the propellant problem with SD of 2 cm/s. A sample of
n = 25 items results in x̄ = 51.3 cm/s.
1. Find a 95% CI for mean µ.
Solution: Since z0.025 = 1.96 and σ/√n = 2/5 = 0.4, 95% CI =
(51.3 − 0.78, 51.3 + 0.78) = (50.52, 52.08).
2. Find a 95% upper confidence bound for mean µ.
Solution: Since z0.05 = 1.65, 95% upper CB =
51.3 + 0.66 = 51.96.
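The interval and the one-sided bound can be reproduced with a few lines of code. This is a sketch under the slide's assumption of a normal population with known σ; the function names z_interval and upper_bound are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Two-sided confidence interval for the mean with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def upper_bound(xbar, sigma, n, conf=0.95):
    """One-sided upper confidence bound for the mean with known sigma."""
    z = NormalDist().inv_cdf(conf)  # z_alpha
    return xbar + z * sigma / sqrt(n)

lo, hi = z_interval(51.3, 2, 25)
ub = upper_bound(51.3, 2, 25)
print(round(lo, 2), round(hi, 2), round(ub, 2))
```

The interval is approximately (50.52, 52.08) and the upper bound approximately 51.96. Note that the one-sided bound uses zα, not zα/2, so it is tighter than the upper end of the two-sided interval.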
Sample Size
The sample size required for the error |x̄ − µ| ≤ E with
100(1 − α)% confidence is given by
\[ n = \left(\frac{z_{\alpha/2}\sigma}{E}\right)^2 \]
Example: Consider the propellant problem with σ = 2. Find
the sample size required for the error to be less than 1.5
cm/s with 95% confidence.
Solution: Since 1 − α = 0.95, α = 0.05 and
z0.025 = 1.96. Since E = 1.5,
\[ n = \left(\frac{1.96 \times 2}{1.5}\right)^2 = 6.83, \]
so we round up and take n = 7.
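The sample-size rule is easy to wrap in a small function. The sketch below uses a hypothetical helper name, ci_sample_size, and rounds the result up, since a fractional sample size must be increased to the next integer to meet the error requirement.

```python
from math import ceil
from statistics import NormalDist

def ci_sample_size(sigma, error, conf=0.95):
    """Raw (unrounded) n so that |xbar - mu| <= error with the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    return (z * sigma / error) ** 2

n_raw = ci_sample_size(sigma=2, error=1.5)
n_required = ceil(n_raw)
print(round(n_raw, 2), n_required)
```

For σ = 2 and E = 1.5 this gives about 6.83, so a sample of 7 items is enough.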