5.1 Overview
Two-sample problem:
• Comparing means
  - Independent samples: σ1 and σ2 known
  - Independent samples: σ1 and σ2 unknown
  - Paired samples
• Comparing variances
• Comparing proportions
Sample 1: x11, x12, …, x1n1 (population mean μ1, variance σ1²)
Sample 2: x21, x22, …, x2n2 (population mean μ2, variance σ2²)
5.2 Inference on the Means of Two Independent Populations, Variances Known
Assumptions
• X11, X12, …, X1n1 is a random sample of size n1 from population 1.
• X21, X22, …, X2n2 is a random sample of size n2 from population 2.
• The two populations are independent.
• The variances of the two populations are known.
• Both populations are normal or, if they are not normal, the conditions of the
central limit theorem apply.
Notation: μ1 and μ2 are the population means, σ1² and σ2² the population variances,
and X̄1 and X̄2 the sample means.
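Point Estimator
• Point estimator of μ1 − μ2: X̄1 − X̄2
• Standard error of X̄1 − X̄2:
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
• Distribution:
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$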
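Confidence Interval
• 100(1−α)% confidence interval for μ1 − μ2:
$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
• 100(1−α)% upper confidence bound for μ1 − μ2:
$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
• 100(1−α)% lower confidence bound for μ1 − μ2:
$$\bar{x}_1 - \bar{x}_2 - z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2$$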
Example Tensile strength tests were performed on two different grades of aluminum
spars used in manufacturing the wing of a commercial transport aircraft. From past
experience with the spar manufacturing process and the testing procedure, the
standard deviations of tensile strengths are assumed to be known. The data obtained are
given below. If μ1 and μ2 denote the true mean tensile strengths for the two grades of
spars, find a 90% CI on the difference in mean strength μ1 − μ2.
90% CI: (12.22, 13.98)
Hypothesis Testing
• Test statistic for testing H0: μ1 − μ2 = Δ0:
$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
• Alternative hypothesis H1: μ1 − μ2 ≠ Δ0
  Rejection region: z0 > z_{α/2} or z0 < −z_{α/2}
  P-value: 2 P(Z0 > |z0|)
• Alternative hypothesis H1: μ1 − μ2 > Δ0
  Rejection region: z0 > z_α
  P-value: P(Z0 > z0)
• Alternative hypothesis H1: μ1 − μ2 < Δ0
  Rejection region: z0 < −z_α
  P-value: P(Z0 < z0)
Solution: see Text p. 219, Example 5-1.
Example Two types of plastic are suitable for use by an electronics component
manufacturer. The breaking strength of this plastic is important. It is known that
σ1 = σ2 = 1.0 psi. From random samples of sizes n1 = 10 and n2 = 12, we obtain
x̄1 = 165.7 and x̄2 = 155.4. The company will not adopt plastic 1 unless its mean
breaking strength exceeds that of plastic 2 by at least 10 psi. Based on the sample
information, should they use plastic 1? Use α = 0.05 in reaching a decision.
(sol)
We are testing H0: μ1 − μ2 = 10 vs. H1: μ1 − μ2 > 10.
The test statistic is
$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
i) Rejection region: reject H0 if z0 > z_{0.05} (= 1.645).
$$z_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{165.7 - 155.4 - 10}{\sqrt{\frac{1.0^2}{10} + \frac{1.0^2}{12}}} = 0.7 < z_{0.05} = 1.645$$
Since 0.7 < 1.645, we do not reject the null hypothesis at α = 0.05 and
conclude they should not adopt plastic 1.
ii) P-value approach:
P-value = P(Z > 0.7) = 0.242 > α (= 0.05)
Since the P-value > α, we do not reject the null hypothesis at α = 0.05
and conclude they should not adopt plastic 1.
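The calculation in this example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `two_sample_z` and its parameter names are illustrative choices, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Return (z0, upper-tail P-value) for H1: mu1 - mu2 > delta0.

    Assumes known population standard deviations sigma1 and sigma2.
    """
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard error of xbar1 - xbar2
    z0 = (xbar1 - xbar2 - delta0) / se
    p_upper = 1.0 - NormalDist().cdf(z0)        # P(Z > z0)
    return z0, p_upper


# Plastic example: sigma1 = sigma2 = 1.0 psi, n1 = 10, n2 = 12, delta0 = 10
z0, p_value = two_sample_z(165.7, 155.4, 1.0, 1.0, 10, 12, delta0=10.0)
z_crit = NormalDist().inv_cdf(0.95)             # z_{0.05}
reject = z0 > z_crit
print(round(z0, 2), round(p_value, 3), reject)  # z0 ≈ 0.70, P-value ≈ 0.242
```

Both the rejection-region approach (`z0 > z_crit`) and the P-value approach lead to the same decision here: H0 is not rejected.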
5.3 Inference on the Means of Two Populations, Variances Unknown
Assumptions
• X11, X12, …, X1n1 is a random sample of size n1 from population 1.
• X21, X22, …, X2n2 is a random sample of size n2 from population 2.
• The two populations are independent.
• The variances of the two populations are unknown.
• Both populations are normal or, if they are not normal, the conditions of the
central limit theorem apply.
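Case 1: The variances are assumed equal (σ1² = σ2² = σ²)
• Combine the two sample variances S1² and S2² to form an estimator of σ²,
the pooled variance estimator:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
• The test statistic
$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
has a t distribution with n1 + n2 − 2 degrees of freedom.
Note: if the sample standard deviations are quite different, this procedure is
not appropriate.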
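Confidence Interval
• 100(1−α)% confidence interval for μ1 − μ2:
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
• 100(1−α)% upper confidence bound for μ1 − μ2:
$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
• 100(1−α)% lower confidence bound for μ1 − μ2:
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2$$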
Hypothesis Testing
• Test statistic for testing H0: μ1 − μ2 = Δ0:
$$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
• Alternative hypothesis H1: μ1 − μ2 ≠ Δ0
  Rejection region: t0 > t_{α/2, n1+n2−2} or t0 < −t_{α/2, n1+n2−2}
  P-value: 2 P(T0 > |t0|)
• Alternative hypothesis H1: μ1 − μ2 > Δ0
  Rejection region: t0 > t_{α, n1+n2−2}
  P-value: P(T0 > t0)
• Alternative hypothesis H1: μ1 − μ2 < Δ0
  Rejection region: t0 < −t_{α, n1+n2−2}
  P-value: P(T0 < t0)
Example 5-4 (Text p. 227) Two catalysts are being analyzed to determine how they
affect the mean yield of a chemical process. Specifically, catalyst 1 is currently in use
but catalyst 2 is acceptable. Because catalyst 2 is cheaper, it should be adopted,
provided it does not change the process yield. A test is run in the pilot plant and
results in the data shown in the following table. Is there any difference between the
mean yields? Use α = 0.05 and assume equal variances.
Example A consumer organization collected data on two types of automobile
batteries, A and B. The summary statistics for 12 observations of each type are
x̄A = 36.51, x̄B = 34.21, sA = 1.43 and sB = 0.93. Assume that the data are normally
distributed with σA = σB.
A. Is there evidence to support the claim that type A battery mean life exceeds that
of type B? Use a significance level of 0.01 in answering this question.
(sol)
H0: μA − μB = 0 versus H1: μA − μB > 0
The test statistic is
$$t_0 = \frac{(\bar{x}_A - \bar{x}_B) - \Delta_0}{s_p\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}}$$
where
$$s_p = \sqrt{\frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}} = \sqrt{\frac{11(1.43)^2 + 11(0.93)^2}{22}} = 1.206$$
so that
$$t_0 = \frac{(36.51 - 34.21) - 0}{1.206\sqrt{\frac{1}{12} + \frac{1}{12}}} = 4.67$$
P-value = P(t > 4.67) < 0.0005 for the t distribution with 22 degrees of freedom.
Since the P-value < 0.01, we reject the null hypothesis and conclude that the mean
battery life of Type A significantly exceeds that of Type B.
B. Construct a one-sided 99% confidence bound for the difference in mean
battery life. Explain how this interval confirms your finding in part A.
(sol)
99% lower confidence bound, with t_{0.01,22} = 2.508:
$$(\bar{x}_A - \bar{x}_B) - t_{\alpha,\,n_A+n_B-2}\, s_p\sqrt{\frac{1}{n_A} + \frac{1}{n_B}} \le \mu_A - \mu_B$$
$$(36.51 - 34.21) - 2.508(1.206)\sqrt{\frac{1}{12} + \frac{1}{12}} \le \mu_A - \mu_B$$
$$1.065 \le \mu_A - \mu_B$$
Since zero is not contained in this interval, we conclude that the null hypothesis can be
rejected and the alternative accepted. The life of Type A is significantly longer than that
of Type B.
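The pooled-variance arithmetic for parts A and B can be checked with a short script. This is a sketch under the equal-variance assumption of Case 1; the helper name `pooled_t` is hypothetical, and the critical value 2.508 is simply copied from the solution above rather than computed.

```python
from math import sqrt


def pooled_t(xbar1, xbar2, s1, s2, n1, n2, delta0=0.0):
    """Pooled two-sample t statistic (equal variances assumed).

    Returns (t0, sp, se, df), where se = sp * sqrt(1/n1 + 1/n2).
    """
    df = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)  # pooled std. dev.
    se = sp * sqrt(1 / n1 + 1 / n2)
    t0 = (xbar1 - xbar2 - delta0) / se
    return t0, sp, se, df


# Battery example: 12 observations of each type
t0, sp, se, df = pooled_t(36.51, 34.21, 1.43, 0.93, 12, 12)

t_crit = 2.508  # t_{0.01,22}, value quoted in the solution (not computed here)
lower_bound = (36.51 - 34.21) - t_crit * se  # 99% lower confidence bound

print(round(sp, 3), round(t0, 2), df, round(lower_bound, 3))
```

The standard library has no t distribution, so the P-value itself would need a table or an external package; the statistic and the bound need neither.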
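Case 2: σ1² ≠ σ2²
• The test statistic
$$T^* = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
is distributed approximately as t with degrees of freedom given by
$$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$
rounded down to the nearest integer.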
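Confidence Interval
• 100(1−α)% confidence interval for μ1 − μ2:
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• 100(1−α)% upper confidence bound for μ1 − μ2:
$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• 100(1−α)% lower confidence bound for μ1 − μ2:
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2$$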
Hypothesis Testing
• Test statistic for testing H0: μ1 − μ2 = Δ0:
$$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
• Alternative hypothesis H1: μ1 − μ2 ≠ Δ0
  Rejection region: t0 > t_{α/2, ν} or t0 < −t_{α/2, ν}
  P-value: 2 P(T0 > |t0|)
• Alternative hypothesis H1: μ1 − μ2 > Δ0
  Rejection region: t0 > t_{α, ν}
  P-value: P(T0 > t0)
• Alternative hypothesis H1: μ1 − μ2 < Δ0
  Rejection region: t0 < −t_{α, ν}
  P-value: P(T0 < t0)
Example Two suppliers manufacture a plastic gear used in a laser printer. The
impact strength of these gears, measured in foot-pounds, is an important
characteristic. A random sample of 10 gears from supplier 1 results in
x̄1 = 289.30 and s1 = 22.5, and another random sample of 16 gears from the second
supplier results in x̄2 = 321.50 and s2 = 21.
A. Is there evidence to support the claim that supplier 2 provides gears with
higher mean impact strength? Use α = 0.05 and assume that both populations
are normally distributed but the variances are not equal.
(sol)
H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 < 0
The test statistic is
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
with degrees of freedom
$$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} = 18.23$$
so ν = 18 after rounding down. Then
$$t_0 = \frac{(289.30 - 321.50) - 0}{\sqrt{\frac{(22.5)^2}{10} + \frac{(21)^2}{16}}} = -3.64$$
P-value = P(t < −3.64): 0.0005 < P-value < 0.001. Since the P-value < 0.05, we reject the
null hypothesis and conclude that supplier 2 provides gears with higher mean
impact strength.
B. Do the data support the claim that the mean impact strength of gears from
supplier 2 is at least 25 foot-pounds higher than that of supplier 1? Make the
same assumptions as in part A.
(sol)
H0: μ2 − μ1 = 25 versus H1: μ2 − μ1 > 25 (that is, μ2 > μ1 + 25)
$$t_0 = \frac{(321.5 - 289.3) - 25}{\sqrt{\frac{(22.5)^2}{10} + \frac{(21)^2}{16}}} = 0.814$$
P-value = P(t > 0.814): 0.1 < P-value < 0.25. Since the P-value > 0.05, we do not reject
the null hypothesis. The data do not support the claim that the mean impact strength
from supplier 2 is at least 25 ft-lb higher than that of supplier 1.
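The unequal-variance statistic and its approximate degrees of freedom for the gear example can be computed as follows. This is an illustrative sketch; `welch_t` is a hypothetical helper name, and only the statistic and ν are computed (P-values would require a t table or an external package).

```python
from math import floor, sqrt


def welch_t(xbar1, xbar2, s1, s2, n1, n2, delta0=0.0):
    """Unequal-variance (Case 2) t statistic and approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # per-sample variance of each sample mean
    t0 = (xbar1 - xbar2 - delta0) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, nu


# Part A: H0: mu1 - mu2 = 0 versus H1: mu1 - mu2 < 0
t_a, nu = welch_t(289.30, 321.50, 22.5, 21.0, 10, 16)
df = floor(nu)  # degrees of freedom rounded down, as in the slides

# Part B: H0: mu2 - mu1 = 25 versus H1: mu2 - mu1 > 25
t_b, _ = welch_t(321.50, 289.30, 21.0, 22.5, 16, 10, delta0=25.0)

print(round(nu, 2), df, round(t_a, 2), round(t_b, 3))
```

Swapping the sample order in part B leaves the denominator and ν unchanged; only the sign convention of the numerator differs.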
5.4 The Paired t-Test (Comparing Two Population Means)
Paired Samples
• The observations on the two populations are paired (e.g., repeated measures,
before and after treatment).
• Each pair of observations (X1j, X2j) is taken under homogeneous conditions,
but these conditions may change from one pair to another.
• Use the difference between paired values, D = X1 − X2.
• Advantage: pairing eliminates variation due to factors other than the difference
between the two populations.
Example We are interested in comparing two different types of tips for a
hardness testing machine. This machine presses the tip into a metal specimen
with a known force. By measuring the depth of the depression caused by the
tip, the hardness of the specimen can be determined. Several specimens were
selected at random and half tested with tip 1, half tested with tip 2 and the
independent t-test was applied.
Problem with this procedure:
The metal specimens might not be homogeneous in some way that affects
hardness (e.g., they may have been produced in different heats).
=> The observed difference between the mean hardness readings for the two tip types
also includes hardness differences between specimens.
Solution:
Make two hardness readings on each specimen, one with each tip.
=> Paired samples
Analysis
• Let (X11, X21), (X12, X22), …, (X1n, X2n) be a set of n paired observations of (X1, X2),
where E[X1] = μ1, Var(X1) = σ1² and E[X2] = μ2, Var(X2) = σ2².
• Define Dj = X1j − X2j (j = 1, 2, …, n).
=> This reduces the problem to a one-sample problem.
• Assumption: both X1 and X2 are normally distributed. Then Dj ~ N(μD, σD²).
• Point estimator of μD = μ1 − μ2:
$$\bar{D} = \frac{1}{n}\sum_{j=1}^{n} D_j$$
• Point estimator of σD²:
$$S_D^2 = \frac{1}{n-1}\sum_{j=1}^{n} (D_j - \bar{D})^2$$
• Distribution:
$$T = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$$
has a t distribution with n − 1 degrees of freedom.
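Confidence Intervals
• 100(1−α)% confidence interval for μ1 − μ2:
$$\bar{d} - t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}} \le \mu_1 - \mu_2 \le \bar{d} + t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}}$$
• 100(1−α)% upper confidence bound for μ1 − μ2:
$$\mu_1 - \mu_2 \le \bar{d} + t_{\alpha,\,n-1}\frac{s_D}{\sqrt{n}}$$
• 100(1−α)% lower confidence bound for μ1 − μ2:
$$\bar{d} - t_{\alpha,\,n-1}\frac{s_D}{\sqrt{n}} \le \mu_1 - \mu_2$$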
Example The journal Human Factors (1962, pp. 375-380) reports a study in which n = 14
subjects were asked to parallel park two cars having very different wheel bases and
turning radii. The time in seconds for each subject was recorded and is given below.
Find the 90% CI for μ1 − μ2, assuming normality.
90% CI for μ1 − μ2 is found as follows
Note that this CI includes zero. Thus, at the 90% level of confidence,
the data do not support the claim that the two cars have different mean
parking times.
Hypothesis Testing for Paired Data
• Test statistic for testing H0: μ1 − μ2 = Δ0:
$$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$$
• Alternative hypothesis H1: μ1 − μ2 ≠ Δ0
  Rejection region: t0 > t_{α/2, n−1} or t0 < −t_{α/2, n−1}
  P-value: 2 P(T0 > |t0|)
• Alternative hypothesis H1: μ1 − μ2 > Δ0
  Rejection region: t0 > t_{α, n−1}
  P-value: P(T0 > t0)
• Alternative hypothesis H1: μ1 − μ2 < Δ0
  Rejection region: t0 < −t_{α, n−1}
  P-value: P(T0 < t0)
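As a rough illustration of how the paired analysis reduces to a one-sample problem on the differences, the sketch below computes d̄, s_D and t0 from two lists of paired readings. The function name `paired_t` and the five pairs of numbers are made up for illustration; they are not data from any example in these slides.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x1, x2, delta0=0.0):
    """Paired t statistic: returns (d_bar, s_d, t0, df) for differences x1 - x2."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]  # D_j = X_1j - X_2j
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)                       # sample std. dev. (divides by n - 1)
    t0 = (d_bar - delta0) / (s_d / sqrt(n))
    return d_bar, s_d, t0, n - 1


# Hypothetical paired readings (illustration only)
first = [10.2, 9.8, 11.1, 10.5, 9.9]
second = [9.9, 9.6, 10.4, 10.6, 9.5]

d_bar, s_d, t0, df = paired_t(first, second)
print(round(d_bar, 3), round(s_d, 4), round(t0, 3), df)
```

The resulting t0 is then compared with the critical value t_{α/2, n−1}, t_{α, n−1} or −t_{α, n−1} from a t table, depending on the alternative hypothesis.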
Example
• A new drug for inducing a temporary reduction in a patient’s heart rate is to
be compared with a standard drug.
• A paired experiment is run whereby each of 40 patients is administered one
drug on one day and the other drug on the following day.
• The spacing of the two experiments over two days ensures that there is no
“carryover” effect, since the drugs are only temporarily effective.
• Nevertheless, the order in which the two drugs are administered is decided in
a random manner, so that one patient may have the standard drug followed by
the new drug and another patient may have the new drug followed by the
standard drug.
• To compare the effects of the two drugs, the percentage heart rate reductions for
the standard drug xi and the new drug yi were recorded for the 40 subjects.
From the results of the experiment (Figure 9.13), we have
To compare the effects, we perform a hypothesis test at α = 0.01.
We can reject H0 at α = 0.01. That is, there is evidence that the
new drug has a different effect from the standard drug.
Example The Federal Aviation Administration requires that material used to make evacuation
systems retain its strength over the life of the aircraft. In an accelerated life test, the
principal material, polymer-coated nylon weave, is aged by exposing it to 158°F for
168 hours. The tensile strength of the specimens of this material is measured before
and after the aging process. The following data (in psi) are recorded.
B. Calculate the P-value for this test.
(sol)
P-value = 2 P(T0 > 16.32) = 5.41 × 10⁻⁸ (calculated using Excel)
5.6 Inference on the Ratio of Variances of Two Normal Populations
The Chi-Square Distribution
• Let X1, …, Xn be a random sample from a normal distribution with unknown mean μ
and unknown variance σ². The quantity
$$X^2 = \frac{(n-1)S^2}{\sigma^2}$$
has a chi-square distribution with n − 1 degrees of freedom, abbreviated as χ²_{n−1}.
• The probability density function of a chi-square random variable is
$$f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2 - 1} e^{-x/2}, \quad x > 0$$
where k is the number of degrees of freedom and Γ(·) is the gamma function.
Figure 4-19. Probability density functions of several chi-square distributions
Figure 4-20. Percentage point of the chi-square distribution
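The F Distribution
• Let W and Y be independent chi-square random variables with u and v
degrees of freedom, respectively. Then the ratio
$$F = \frac{W/u}{Y/v}$$
has the F distribution with u degrees of freedom in the numerator and v degrees
of freedom in the denominator.
• It is usually abbreviated as F_{u,v}.
• The probability density function of an F distribution is given by
$$f(x) = \frac{\Gamma\!\left(\frac{u+v}{2}\right)\left(\frac{u}{v}\right)^{u/2} x^{u/2 - 1}}{\Gamma\!\left(\frac{u}{2}\right)\Gamma\!\left(\frac{v}{2}\right)\left[\left(\frac{u}{v}\right)x + 1\right]^{(u+v)/2}}, \quad 0 < x < \infty$$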
Figure 5-4. Probability density functions of two F distributions
Figure 5-5. Upper and lower percentage points of the F distribution
F distribution table: f_{0.05, ν1, ν2}
F distribution table: f_{0.10, ν1, ν2}
Example For an F distribution, find the following:
• f_{0.25, 5, 10}
• f_{0.75, 5, 10}
• f_{0.10, 24, 9}
• f_{0.90, 24, 9}
• f_{0.95, 5, 10}
• f_{0.05, 5, 10}

• Let X11, X12, …, X1n1 be a random sample from N(μ1, σ1²) and let X21, X22,
…, X2n2 be a random sample from N(μ2, σ2²).
• Assume that both normal populations are independent.
• Let S1² and S2² be the sample variances.
• Then the ratio
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
has an F distribution with n1 − 1 numerator degrees of freedom and n2 − 1
denominator degrees of freedom.
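Confidence Interval
• 100(1−α)% confidence interval for σ1²/σ2²:
$$\frac{s_1^2}{s_2^2}\, f_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\, f_{\alpha/2,\,n_2-1,\,n_1-1}$$
• 100(1−α)% upper confidence bound for σ1²/σ2²:
$$\frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\, f_{\alpha,\,n_2-1,\,n_1-1}$$
• 100(1−α)% lower confidence bound for σ1²/σ2²:
$$\frac{s_1^2}{s_2^2}\, f_{1-\alpha,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2}$$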
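Example A company manufactures impellers for use in jet-turbine engines. One of the
operations involves grinding a particular surface finish on a titanium alloy component.
Two different grinding processes can be used, and both processes can produce parts at
identical mean surface roughness. The manufacturing engineer would like to select the
process having the least variability in surface roughness. A random sample of n1 = 11
parts from the first process results in a sample standard deviation s1 = 5.1 microinches.
A random sample of n2 = 16 parts from the second process results in a sample standard
deviation s2 = 4.7 microinches. Find a 90% CI on the ratio of the two variances σ1²/σ2².
$$\frac{s_1^2}{s_2^2}\, f_{0.95,15,10} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\, f_{0.05,15,10}$$
where f_{0.05,15,10} = 2.85 and f_{0.95,15,10} = 1/f_{0.05,10,15} = 1/2.54 = 0.39.
A 90% CI on σ1²/σ2² is
$$\frac{(5.1)^2}{(4.7)^2}(0.39) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{(5.1)^2}{(4.7)^2}(2.85), \quad \text{that is,} \quad 0.46 \le \frac{\sigma_1^2}{\sigma_2^2} \le 3.36$$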
Hypothesis Testing
• Test statistic for testing H0: σ1² = σ2²:
$$F_0 = \frac{S_1^2}{S_2^2}$$
• Alternative hypothesis H1: σ1² ≠ σ2²
  Rejection region: f0 > f_{α/2, n1−1, n2−1} or f0 < f_{1−α/2, n1−1, n2−1}
  P-value: 2 min{P(F0 > f0), P(F0 < f0)}
• Alternative hypothesis H1: σ1² > σ2²
  Rejection region: f0 > f_{α, n1−1, n2−1}
  P-value: P(F0 > f0)
• Alternative hypothesis H1: σ1² < σ2²
  Rejection region: f0 < f_{1−α, n1−1, n2−1}
  P-value: P(F0 < f0)
Figure The F distribution for the test of H0: σ1² = σ2² with critical values for
(a) H1: σ1² ≠ σ2², (b) H1: σ1² > σ2², and (c) H1: σ1² < σ2²
Example Oxide layers on semiconductor wafers are etched in a mixture of gases to
achieve the proper thickness. The variability in the thickness of these oxide layers is a
critical characteristic of the wafer, and low variability is desirable for subsequent
processing steps. Two different mixtures of gases are being studied to determine
whether one is superior in reducing the variability of oxide thickness. Sixteen wafers
are etched in each gas. The sample standard deviations of oxide thickness are s1 = 1.96
angstroms and s2 = 2.13 angstroms, respectively.
Is there any evidence that either gas is preferable? Use α = 0.05.
(Sol)
We test H0: σ1² = σ2² versus H1: σ1² ≠ σ2². The value of the test statistic is
$$f_0 = \frac{s_1^2}{s_2^2} = \frac{(1.96)^2}{(2.13)^2} = 0.85$$
Rejection region: f0 > f_{0.025,15,15} = 2.86 or f0 < f_{0.975,15,15} = 1/2.86 = 0.35.
Conclusion: Since f0 does not fall into the R.R., we cannot reject H0 at α = 0.05.
Therefore, there is no strong evidence to indicate that either gas is preferable.
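The decision rule for the two-sided variance test can be sketched in a few lines. The standard library has no F distribution, so the critical value is passed in by the caller; the value 2.86 for f_{0.025,15,15} used below is an assumed table lookup, and `variance_ratio_test` is a hypothetical helper name.

```python
def variance_ratio_test(s1, s2, f_upper, f_lower):
    """Two-sided F test of H0: sigma1^2 = sigma2^2 from sample std. deviations.

    f_upper and f_lower are the upper and lower critical values supplied
    by the caller (e.g., read from an F table).
    """
    f0 = s1**2 / s2**2
    reject = f0 > f_upper or f0 < f_lower
    return f0, reject


# Oxide-thickness example: 16 wafers per gas, so 15 and 15 degrees of freedom
f_upper = 2.86           # assumed table value for f_{0.025,15,15}
f_lower = 1.0 / f_upper  # f_{0.975,15,15} = 1 / f_{0.025,15,15}

f0, reject = variance_ratio_test(1.96, 2.13, f_upper, f_lower)
print(round(f0, 3), round(f_lower, 2), reject)
```

The lower critical value relies on the identity f_{1−α, u, v} = 1/f_{α, v, u}, which is why only upper-tail F tables are usually tabulated.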
5.7 Inference on Two Population Proportions

• Suppose that two independent random samples of sizes n1 and n2 are taken
from two populations.
• Let X1 and X2 represent the number of observations that belong to the class of
interest in samples 1 and 2, respectively.
• The estimators of the population proportions,
$$\hat{P}_1 = \frac{X_1}{n_1}, \quad \hat{P}_2 = \frac{X_2}{n_2}$$
have approximate normal distributions.
• The quantity
$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}}$$
has approximately a standard normal distribution, N(0, 1).
• The approximation is reasonable as long as x1, n1 − x1, x2 and n2 − x2 are all larger
than 5.
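Confidence Intervals
• 100(1−α)% confidence interval for p1 − p2:
$$\hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
• 100(1−α)% upper confidence bound for p1 − p2:
$$-1 \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
• 100(1−α)% lower confidence bound for p1 − p2:
$$\hat{p}_1 - \hat{p}_2 - z_{\alpha}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \le p_1 - p_2 \le 1$$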
Example Legal agreements have been reached whereby, if 10% or more of
the building tiles are cracked, the construction company that
originally installed the tiles must help pay for the building repair costs.
A sample of 6000 tiles on buildings A revealed a total of 406 cracked tiles.
In another group of buildings, buildings B, the tiles were cemented
into place with a different resin mixture than that used on buildings A.
The construction engineers are interested in investigating whether the
two types of resin mixture have different expansion and contraction
properties which affect the chances of the tiles becoming cracked. A
sample of 2000 tiles on buildings B is examined and 83 are found to be
cracked. Let pA and pB be the probabilities that a tile on buildings A
and B becomes cracked, respectively. Construct a 99% confidence
interval for the difference in the probabilities.
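With p̂A = 406/6000 = 0.0677, p̂B = 83/2000 = 0.0415 and z_{0.005} = 2.576, the endpoints are
$$\hat{p}_A - \hat{p}_B - z_{0.005}\sqrt{\frac{\hat{p}_A(1 - \hat{p}_A)}{6000} + \frac{\hat{p}_B(1 - \hat{p}_B)}{2000}} = 0.0120$$
$$\hat{p}_A - \hat{p}_B + z_{0.005}\sqrt{\frac{\hat{p}_A(1 - \hat{p}_A)}{6000} + \frac{\hat{p}_B(1 - \hat{p}_B)}{2000}} = 0.0404$$
A two-sided 99% CI for pA − pB is (0.0120, 0.0404).
Note: This CI contains only positive values, so pA > pB at the 99%
confidence level. That is, the resin mixture employed on buildings B seems to
be better than the resin mixture employed on buildings A.
In fact, we are 99% confident that the probability of a tile cracking with the
resin mixture on buildings A is between 1.20 and 4.04 percentage points higher than with the
resin mixture on buildings B.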
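The interval (0.0120, 0.0404) can be reproduced from the counts given in the example. This is a minimal standard-library sketch; the function name `two_proportion_ci` and its `conf` parameter are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, conf=0.99):
    """Two-sided large-sample CI for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Tile example: 406 of 6000 cracked on buildings A, 83 of 2000 on buildings B
lo, hi = two_proportion_ci(406, 6000, 83, 2000, conf=0.99)
print(round(lo, 4), round(hi, 4))  # approximately 0.0120 and 0.0404
```

Note that the confidence interval uses the unpooled standard error, whereas the test of H0: p1 = p2 uses a pooled estimate of the common proportion.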
Hypothesis Testing
• Test statistic for testing H0: p1 = p2:
$$Z_0 = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \quad \hat{P} = \frac{X_1 + X_2}{n_1 + n_2} \text{ (pooled estimate)}$$
Note: Since the null hypothesis specifies that p1 = p2, it is appropriate to employ a pooled
estimate of the common success probability.
• Alternative hypothesis H1: p1 ≠ p2
  Rejection region: z0 > z_{α/2} or z0 < −z_{α/2}
  P-value: 2 P(Z0 > |z0|)
• Alternative hypothesis H1: p1 > p2
  Rejection region: z0 > z_α
  P-value: P(Z0 > z0)
• Alternative hypothesis H1: p1 < p2
  Rejection region: z0 < −z_α
  P-value: P(Z0 < z0)
Example When polling agreement with the statement “The city mayor
is doing a good job,” the local newspaper is also interested in how a
person’s support for this statement may depend upon his or her age.
Therefore the pollsters also gather information on the ages of the
respondents in their random sample. The polling results consist of
n = 952 people aged 18 to 39, of whom x = 627 agree with the statement,
and m = 1043 people aged at least 40, of whom y = 421 agree with the
statement. Does the strength of support for the statement differ
between the two age groups? Use the significance level 0.01.
Let pA be the proportion of the younger group who agree with the
statement and pB be that of the older group. Then the estimates are
p̂A = 627/952 = 0.659 and p̂B = 421/1043 = 0.404.
We are testing H0: pA = pB versus H1: pA ≠ pB.
The pooled estimate of a common proportion is
$$\hat{p} = \frac{x + y}{n + m} = \frac{627 + 421}{952 + 1043} = 0.525$$
The test statistic value is
$$z_0 = \frac{\hat{p}_A - \hat{p}_B}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n} + \frac{1}{m}\right)}} = \frac{0.659 - 0.404}{\sqrt{0.525(0.475)\left(\frac{1}{952} + \frac{1}{1043}\right)}} = 11.39$$
The rejection region is |z0| > z_{0.005} = 2.576.
Since our z0 falls in the R.R., we reject H0 at α = 0.01. The poll has
demonstrated a difference in agreement with the statement between the two
age groups.
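The pooled test for the poll data can be checked numerically. This is a sketch using only the standard library; `two_proportion_z` is a hypothetical helper name, and the two-sided P-value is computed from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z test of H0: p1 = p2. Returns (z0, pooled estimate, two-sided P-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # estimate of the common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z0 = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, pooled, p_value


# Poll example: 627 of 952 younger respondents and 421 of 1043 older respondents agree
z0, pooled, p_value = two_proportion_z(627, 952, 421, 1043)
z_crit = NormalDist().inv_cdf(1 - 0.01 / 2)  # z_{0.005}
reject = abs(z0) > z_crit
print(round(pooled, 3), round(z0, 2), reject)
```

With a statistic this far into the tail, the P-value is effectively zero in floating point, so the rejection-region comparison is the more informative output.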