Chapter 21: Two-Sample Problems
Comparing means of two independent samples.
The goal of inference is to compare the responses to two treatments or to compare
the characteristics of two populations.
Test: the claim that subjects treated with Lipitor have a mean cholesterol level
that is lower than the mean cholesterol level for subjects given a placebo.
Conditions for Inference Comparing Two Means
1. Two SRSs from two distinct populations.
2. The samples are independent.
3. Populations are Normally distributed.
4. The mean and standard deviations of the populations are unknown.
Inferences about two means: Independent samples
(Independent Samples with σ1 and σ2 unknown and not assumed equal.)
Objectives.
(1) Test a claim about two independent population means
(2) Construct a confidence interval estimate of the difference between two independent
population means.
Notation.
For population j = 1, 2
µj = population mean
σj = population standard deviation
nj = size of the sample
x̄j = sample mean
sj = sample standard deviation
Requirements.
(1) Both samples are simple random samples.
(2) The two samples are independent.
(3) Either or both of these conditions are satisfied:
the two sample sizes are both large (n1 > 30 and n2 > 30), or both samples
come from populations having normal distributions.
(4) σ1 and σ2 are unknown and it is not assumed that σ1 and σ2 are equal.
There are two tools:
Calculators: working with the formulas directly.
Hypothesis Test Statistic for Two Means: Independent Samples
$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
where µ1 − µ2 is often assumed to be 0.
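The test statistic can be computed directly from summary statistics. The following is a minimal Python sketch; the function name `two_sample_t` and its parameter names are illustrative choices, not part of the chapter. It is evaluated on the summary statistics of the cigarette tar example used later in this chapter.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Two-sample t statistic for independent samples, variances not assumed equal.

    null_diff is the hypothesized value of mu1 - mu2 (usually 0).
    """
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - null_diff) / standard_error

# Unfiltered: n = 25, mean = 21.1, s = 3.2; filtered: n = 25, mean = 13.2, s = 3.7
t = two_sample_t(21.1, 3.2, 25, 13.2, 3.7, 25)
print(round(t, 3))  # 8.075
```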
Confidence Interval Estimate of µ1 − µ2: Independent Samples
$$ (\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E $$
$$ E = t^{*} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$
where
df = degrees of freedom (for the t-distribution)
= smaller of (n1 − 1) and (n2 − 1)
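The margin of error and the conservative degrees-of-freedom rule can be sketched in a few lines of Python. The function names here are hypothetical, and the critical value `t_star` is assumed to be looked up in a t table for the chosen confidence level and df; this sketch does not compute it.

```python
import math

def conservative_df(n1, n2):
    """Degrees of freedom: the smaller of (n1 - 1) and (n2 - 1)."""
    return min(n1 - 1, n2 - 1)

def margin_of_error(t_star, sd1, n1, sd2, n2):
    """E = t* times the standard error of the difference in sample means."""
    return t_star * math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Two samples of size 25 give df = 24.
print(conservative_df(25, 25))  # 24
```

The smaller-df rule is a simplification suited to hand calculation with a t table; statistical software typically uses a more elaborate df formula, so its results can differ slightly.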
Computers: Excel, GeoGebra, or other computer software.
◦ Construct a confidence interval or perform a hypothesis test with GeoGebra.
1. Click View ⇒ Probability Calculator.
2. Probability Calculator pops up.
3. Select Statistics on the toolbar. (Distribution / Statistics)
4. Select the desired inferential tool to analyze your data.
5. Enter your summary values in the boxes.
6. The result is shown below your summary.
Ex.
The samples are selected from normally distributed populations. The mean tar content
of a simple random sample of 25 unfiltered king size cigarettes is 21.1 mg, with
a standard deviation of 3.2 mg. The mean tar content of a simple random sample of 25
filtered 100 mm cigarettes is 13.2 mg with standard deviation of 3.7 mg.
Use a 0.05 significance level to test the claim that unfiltered king size cigarettes have a
mean tar content greater than that of filtered 100 mm cigarettes.
What does the result suggest about the effectiveness of cigarette filters?
Four Requirements?
Type          Sample Size   Sample Mean (mg)   Sample S.D. (mg)
----------------------------------------------------------------
Unfiltered    25            21.1               3.2
Filtered      25            13.2               3.7
Solution.
1. Read the problem statement and identify the specific claim or hypothesis.
2. Express the specific claim or hypothesis in symbolic form.
H0 : µ1 = µ2
Ha : µ1 > µ2
(µ1 ≠ µ2)
3. What is the significance level?
α = 0.05
4. Calculate the test statistic.
$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
where µ1 − µ2 is often assumed to be 0.
n1 = 25, x̄1 = 21.1, s1 = 3.2
n2 = 25, x̄2 = 13.2, s2 = 3.7
Under H0, µ1 − µ2 = 0, so
$$ t = \frac{(21.1 - 13.2) - 0}{\sqrt{\dfrac{3.2^2}{25} + \dfrac{3.7^2}{25}}} = \frac{7.9}{0.97837} = 8.074687922 \approx 8.075 $$
5. Convert the test statistic to a P-value and compare it with the significance level.
With df = 24, the P-value is less than 0.005, which is below
α = 0.05.
6. Reject H0 or fail to reject H0, and state the conclusion in plain English.
Reject H0 .
There is sufficient evidence to support the claim that unfiltered king size cigarettes
have a mean tar content greater than that of filtered 100 mm cigarettes.
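Step 5 reads a bound on the P-value from a t table. As a rough cross-check, the upper-tail area can be approximated numerically with only the Python standard library. This sketch integrates the Student t density with Simpson's rule; the helper names and the step count are assumptions made for illustration, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=10_000):
    """Approximate P(T > t) for t >= 0 as 0.5 minus the area from 0 to t.

    Uses composite Simpson's rule; steps must be even.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# One-sided test from the example: t = 8.075 with df = 24, alpha = 0.05.
p_value = upper_tail_p(8.075, 24)
print(p_value < 0.05)  # True, so H0 is rejected
```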
Ex.
Referring to the previous example, construct a 90% confidence interval estimate of
the difference between the mean tar content of unfiltered king size cigarettes and the mean
tar content of filtered 100 mm cigarettes.
Confidence Interval Estimate of µ1 − µ2: Independent Samples
$$ (\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E $$
$$ E = t^{*} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$
where
df = degrees of freedom (for the t-distribution)
= smaller of (n1 − 1) and (n2 − 1)
Solution.
α = 0.1, df = 24, tα/2 = 1.711
$$ E = t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 1.711(0.97837) = 1.67399 $$
$$ (\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E $$
(21.1 − 13.2) − 1.67399 < (µ1 − µ2 ) < (21.1 − 13.2) + 1.67399
6.22601 < (µ1 − µ2 ) < 9.57399
6.226 < (µ1 − µ2 ) < 9.574
⇒ µ1 ≠ µ2 since 0 is not in the confidence interval.
Because the confidence interval limits include only positive
values, it appears that unfiltered king size cigarettes have
a mean tar content greater than that of filtered 100 mm
cigarettes.
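The interval and the check for whether it contains 0 can be reproduced with a short Python sketch. The function names are hypothetical, and the margin of error 1.67399 is taken from the solution above rather than recomputed.

```python
def difference_interval(mean1, mean2, margin):
    """Confidence interval for mu1 - mu2 given the margin of error E."""
    diff = mean1 - mean2
    return diff - margin, diff + margin

def contains_zero(lower, upper):
    """True if 0 lies inside the interval, meaning no difference is plausible."""
    return lower <= 0.0 <= upper

lower, upper = difference_interval(21.1, 13.2, 1.67399)
print(round(lower, 3), round(upper, 3))  # 6.226 9.574
print(contains_zero(lower, upper))       # False
```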