Download Comparing Two Samples - Dixie State University :: Business

William Christensen, Ph.D.
Comparing Two Samples
In the last section we learned how to test
hypotheses, or in other words, use statistics
to test a claim about a population parameter.
We build on that knowledge in this section
and learn how to compare two sets of
sample data. That is, we learn how to test a
claim that two samples come from the same
(or different) populations.
Comparing two means
Large Independent
Samples (n>30)
Assumptions
1. The two samples are independent
   • The samples are not related or paired with each other
   • If the samples are related (dependent) they are often referred to as matched pairs or paired samples; we’ll learn to deal with them later
2. The two sample sizes are large. That is, n1 > 30 and n2 > 30.
3. Both samples are random samples
Test Statistic for Comparing
Two Population Means

The “test statistic” for comparing the two population means is calculated as follows (there is no Excel shortcut):

The formula requires:
• The means of both samples (x-bar1 and x-bar2)
• The population (or sample) standard deviations from each sample. Most of the time we don’t have σ, so we use s
• The sample size (n) for each sample
Note: we ALWAYS assume that the population means (µ1 and µ2) net out to 0. In other words, always assume that (µ1 - µ2) = 0

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Critical Value(s) for Comparing
Two Population Means


The “critical value(s)” for comparing two population means are found exactly the same as in Section 5 (hypothesis testing of means with large samples). See the next slide for a review of how to find those critical values using Excel.
We again use HYPOTHESIS TESTING when we compare two population means. The 3 possible sets of hypotheses are:
1. H0: µ1 = µ2 and H1: µ1 ≠ µ2 (two-tailed test)
2. H0: µ1 ≤ µ2 and H1: µ1 > µ2 (right-tail test)
3. H0: µ1 ≥ µ2 and H1: µ1 < µ2 (left-tail test)
Critical Value Example
α = 0.05
• Two-tail test (null hypothesis =, alternative hypothesis ≠): CV = -1.96 and CV = 1.96, with area = α/2 = 0.025 in each tail
• Right-tail test (null hypothesis ≤, alternative hypothesis >): CV = 1.645, with area = α = 0.05 in the right tail
• Left-tail test (null hypothesis ≥, alternative hypothesis <): CV = -1.645, with area = α = 0.05 in the left tail
Notice how critical values for the left-tail are ALWAYS negative and critical values for the right-tail are ALWAYS positive
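The z critical values above can also be reproduced outside Excel. The following is a minimal sketch in Python, assuming the standard-library `statistics.NormalDist` class is an acceptable stand-in for NORMSINV; the helper name `z_critical` is made up for this illustration and is not part of the course materials.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Return the positive z critical value for a two-tail (tails=2) or one-tail (tails=1) test."""
    # A two-tail test splits alpha between both tails; a one-tail test puts all of it in one tail.
    tail_area = alpha / 2 if tails == 2 else alpha
    # inv_cdf plays the role of Excel's NORMSINV: it takes the area to the LEFT of the value.
    return NormalDist().inv_cdf(1 - tail_area)

two_tail_cv = z_critical(0.05, tails=2)  # use +/- this value for a two-tail test
one_tail_cv = z_critical(0.05, tails=1)  # negate this value for a left-tail test
print(round(two_tail_cv, 3), round(one_tail_cv, 3))
```

With α = 0.05 this prints values matching the 1.96 and 1.645 shown in the example.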
Comparing Two Means:
EXAMPLE
The Coke vs. Pepsi data set on the class
website includes the weights (in pounds) of
samples of regular Coke and regular Pepsi.
Sample statistics are shown. Use the 0.01
significance level to test the claim that the
mean weight of regular Coke is the same as
the mean weight of regular Pepsi (claim is
H0: µcoke = µpepsi and H1: µcoke ≠ µpepsi )
Coke Versus Pepsi Example
The following info (you can download data off website or the CD-ROM that comes with the Triola text) can easily be calculated from the data sets using Excel mean/average and standard deviation functions.

          Regular Coke    Regular Pepsi
n         36              36
x-bar     0.81682         0.82410
s         0.007507        0.005701
Coke Versus Pepsi Example
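Claim: µ1 = µ2 with α = 0.01
H0 : µ1 = µ2 and H1 : µ1 ≠ µ2
Critical values: z = -2.575 and z = 2.575. Accept H0 if the test statistic falls between them (around z = 0); accept H1 if it falls in either tail.
Excel: to get the right-side critical value, change the negative left-side value to positive or use =NORMSINV(0.995)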
Coke Versus Pepsi Example

Next, calculate the test statistic using the formula:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Critical values: z = -2.575 and z = 2.575
Conclusion: Accept H1: µ1 ≠ µ2 because the test statistic is in the critical region (outside the critical value)
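The numeric value of the test statistic is not shown in the transcript, so here is a short sketch that plugs the Coke and Pepsi sample statistics from the table into the z formula. It assumes, as the slides do, that s stands in for σ and that (µ1 - µ2) = 0; the function name `two_sample_z` is hypothetical.

```python
import math

def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """z test statistic for two large independent samples, assuming (mu1 - mu2) = 0."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error

# Sample 1 = regular Coke, sample 2 = regular Pepsi (values from the table in this example).
z_stat = two_sample_z(0.81682, 0.007507, 36, 0.82410, 0.005701, 36)
z_critical = 2.575  # two-tail critical value for alpha = 0.01, as given on the slide

in_critical_region = abs(z_stat) > z_critical
print(round(z_stat, 2), in_critical_region)
```

The result is roughly -4.63, well to the left of -2.575, which is consistent with the slide's conclusion to accept H1.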
Coke Versus Pepsi Example

Finally, we conclude that the mean weights of Coke and Pepsi are different
(not equal to each other). In fact, you can see from the sample means that
Pepsi weighs more than Coke, and we found that difference to be
statistically significant.
Comparing two means
Small Independent
Samples (n ≤ 30)
Assumptions
1. The two samples are independent
2. At least one of the sample sizes is small. That is, n1 ≤ 30 OR n2 ≤ 30.
3. Both samples are random samples from normally distributed populations
Test Statistic for Comparing
Two Population Means

The “test statistic” for comparing two population means where at least one sample size ≤ 30 is calculated as follows:

The formula requires:
• The means of both samples (x-bar1 and x-bar2)
• The sample standard deviations from each sample
• The sample size (n) for each sample, where df = smaller of n1-1 or n2-1
Note: we ALWAYS assume that the population means (µ1 and µ2) net out to 0. In other words, always assume that (µ1 - µ2) = 0

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Critical Value(s) for Comparing
Two Population Means


The “critical value(s)” for comparing two population means are found exactly the same as in Section 5 (hypothesis testing of means with small samples). We use Excel function TINV(probability,degrees_freedom); see the next slide for a review and example. Important note: TINV expects a two-tailed probability, so we use the full alpha value (not alpha/2) for a 2-tailed test and must use 2 x alpha for a 1-tailed t-test.
We again use HYPOTHESIS TESTING when we compare two population means. The 3 possible sets of hypotheses are:
1. H0: µ1 = µ2 and H1: µ1 ≠ µ2 (two-tailed test)
2. H0: µ1 ≤ µ2 and H1: µ1 > µ2 (right-tail test)
3. H0: µ1 ≥ µ2 and H1: µ1 < µ2 (left-tail test)
Critical Value Example
α = 0.05, smaller sample size = 15 (df = 15 - 1 = 14)
• Two-tail test (null hypothesis =, alternative hypothesis ≠): CV = -2.14 and CV = 2.14, with area = α/2 = 0.025 in each tail
• Right-tail test (null hypothesis ≤, alternative hypothesis >): CV = 1.76, with area = α = 0.05 in the right tail
• Left-tail test (null hypothesis ≥, alternative hypothesis <): CV = -1.76, with area = α = 0.05 in the left tail
Note: you must make left-side t-values negative
Notice how critical values for the left-tail are ALWAYS negative and critical values for the right-tail are ALWAYS positive
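For readers working outside Excel, the sketch below imitates the two-tailed behavior of TINV using only the Python standard library. It is an approximation built on numerical integration (Simpson's rule) and bisection, not Excel's actual algorithm, and the function names `t_pdf`, `t_upper_tail`, and `tinv` are made up for this illustration. If SciPy is available, `scipy.stats.t.ppf` is the more usual tool.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: one half minus the Simpson's-rule area from 0 to t."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def tinv(probability, df):
    """Positive t with P(|T| > t) = probability (two-tailed, like Excel's TINV).

    Assumes the answer lies below 100, which holds for typical alpha and df.
    """
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 2 * t_upper_tail(mid, df) > probability:
            lo = mid  # tails still too large, so the critical value is further out
        else:
            hi = mid
    return (lo + hi) / 2

alpha, df = 0.05, 14
two_tail_cv = tinv(alpha, df)      # full alpha for a two-tailed test
one_tail_cv = tinv(2 * alpha, df)  # 2 x alpha for a one-tailed test
print(round(two_tail_cv, 2), round(one_tail_cv, 2))
```

The two calls mirror the rule stated above: pass α for a two-tailed test and 2 x α for a one-tailed test, then attach a negative sign by hand for a left-tail critical value.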
Comparing Two Means:
EXAMPLE
People spend huge sums of money on magnets to treat pain. Researchers conducted a study to determine whether magnets are effective in treating back pain. Pain was measured using the visual analog scale, with the results given below (larger numbers mean more effective pain reduction). Use α=0.05 to test whether those treated with magnets had greater pain reduction than those given a fake/sham treatment (similar to a placebo). Does it appear that magnets are effective in treating back pain? How might larger samples affect our results?
• Reduction in pain level after magnet treatment: n=20, mean=0.49, s=0.96
• Reduction in pain level after sham treatment: n=20, mean=0.44, s=1.4
Note: our hypotheses (related to the amount of pain reduction) are:
• H0: µmagnet ≤ µsham
• H1: µmagnet > µsham
Magnets & Pain Reduction
Example
α=0.05 (use 2x α for 1-tail test), df=smaller of n1-1 or n2-1 =20-1=19
H0 : µmagnet ≤ µsham and H1 : µmagnet > µsham (right-tail test)
Critical value: t = 1.729. Accept H0 if the test statistic falls to the left of this value; accept H1 if it falls to the right.
Magnets & Pain Reduction
Example

Next, calculate the test statistic using the formula:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Critical value: t = 1.729
Conclusion: Accept H0: µmagnet ≤ µsham
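The transcript does not show the computed test statistic, so the following sketch works it out from the sample statistics given in the problem. It assumes (µ1 - µ2) = 0 as the slides instruct; the function name `two_sample_t` is hypothetical, and the critical value 1.729 is taken from the slide rather than recomputed.

```python
import math

def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """t test statistic for two independent samples, assuming (mu1 - mu2) = 0."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error

# Sample 1 = magnet treatment, sample 2 = sham treatment.
t_stat = two_sample_t(0.49, 0.96, 20, 0.44, 1.4, 20)
t_critical = 1.729  # right-tail critical value for alpha = 0.05, df = 19 (from the slide)

accept_h1 = t_stat > t_critical
print(round(t_stat, 3), accept_h1)
```

The statistic comes out near 0.13, far below 1.729, so it does not reach the critical region and H0 is accepted.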
Magnets & Pain Reduction
Example
Conclusion: Accept H0: µmagnet ≤ µsham
The results do not support the claim that magnets are
effective in reducing back pain. Even though the mean
pain reduction with magnets (0.49) was greater than with
the sham treatment (0.44), the standard deviations were
very high (0.96 and 1.4), making it difficult to statistically
find any meaningful difference between the means. Even
with larger samples, unless the standard deviations came
down, it would be difficult to show a difference between the
means.
Comparing two means
Matched pair Samples
Assumptions
1. The sample data consist of matched pairs.
2. The samples are simple random samples.
3. If the number of pairs of sample data is small (n ≤ 30), then the population of differences in the paired values must be approximately normally distributed.
Note: taking “before-and-after” measurements to create
and then compare two sample sets is a typical
example of matched pairs
Notation for Matched Pairs
• µd = mean value of the differences d between the
population of paired data
• d-bar = mean value of the differences d between the
paired sample data (e.g., the average difference between
“before” and “after” measurements)
• sd = standard deviation of the differences d for the paired
sample data
• n = number of pairs of data
Test Statistic for Comparing
Matched Pairs

The “test statistic” for comparing matched pairs is calculated as follows (there is no Excel shortcut):

The formula requires:
• The mean difference between the matched pairs from the two samples (often a “before” and an “after” sample), represented by d-bar
• Usually, the hypotheses assume there is no difference between the populations. In other words, the most common null hypothesis for matched pairs is H0: µd = 0. Whether µd = 0 or some other value, the value stated in the hypotheses is what goes into the test statistic formula. Of course, with µd = 0, µd simply drops out of the formula.
• The standard deviation of the differences between the matched pairs of the two samples (see Example)
• n represents the number of matched pairs we have

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$
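As a rough illustration of the matched-pairs formula, the sketch below computes d-bar, sd, and t from a list of differences. The three differences used here are hypothetical toy values chosen so the arithmetic is easy to check by hand; they are NOT data from this section, and the function name `paired_t_statistic` is made up for the example.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(differences, mu_d=0.0):
    """t = (d-bar - mu_d) / (s_d / sqrt(n)) for a list of paired differences."""
    n = len(differences)
    d_bar = mean(differences)
    s_d = stdev(differences)  # sample standard deviation (divides by n - 1)
    return (d_bar - mu_d) / (s_d / math.sqrt(n))

# Hypothetical "after minus before" differences, for illustration only.
toy_differences = [1.0, 2.0, 3.0]
t_stat = paired_t_statistic(toy_differences)
print(round(t_stat, 3))  # d-bar = 2, s_d = 1, n = 3, so t = 2 * sqrt(3)
```

With real data, the list would hold one difference per matched pair, and the resulting t would be compared against a critical value with df = n - 1.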
Critical Value(s) for Testing
Matched Pairs

The “critical value(s)” for comparing matched pairs are found as follows:
• Use Excel function =NORMSINV(probability) when n>30
• Use Excel function =TINV(probability,df) when n ≤ 30, where df (degrees of freedom) = n – 1 (note: for TINV ALWAYS use α for 2-tailed tests and αx2 for 1-tailed tests)
We use HYPOTHESIS TESTING to compare matched pairs. The 3 most common hypotheses are:
1. H0: µd = 0 and H1: µd ≠ 0 (two-tailed test)
2. H0: µd ≤ 0 and H1: µd > 0 (right-tail test)
3. H0: µd ≥ 0 and H1: µd < 0 (left-tail test)
Do Male Students Exaggerate
Their Heights? EXAMPLE
Use the following data to test the claim that male students
exaggerate their heights (i.e., the difference between what they
report and their actual heights is greater than 0). Use a 95%
confidence level.
Note on the difference of 7.9: this value is so large that it seems there must be some kind of mistake. How could anyone reasonably say they are almost 8 inches taller than they really are? Because of this, and because such an abnormally large value would drastically affect our results, we toss out this “outlier” and are left with 11 usable differences.
d-bar and std. dev. of d do not include the outlier value
Male Height Example
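Claim: µd > 0 with α = 0.05
H0 : µd ≤ 0 and H1 : µd > 0 (right-tail test)
Step 1: write out the hypotheses, graph the problem, and find the critical value
Critical value: t = 1.81 (df = 11 - 1 = 10). Accept H0 if the test statistic falls to the left of this value; accept H1 if it falls to the right.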
Male Height Example
Step 2: calculate the test statistic and form conclusion

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$

Conclusion: Accept H1
The test statistic is outside the critical value t = 1.81 (in the critical region)
Male Height Example
Conclusion:
Accept H1: µd > 0. In other words, we find there is a statistically significant difference between what male students say their height is compared to their actual height.
Comparing two proportions
Assumptions
1. We have proportions from two
independent random samples
2. For both samples, the conditions n*p ≥ 5 and n*q ≥ 5 are satisfied
   • See Section 5, Hypothesis Testing for Population Proportions, for a review
Notation for Proportions
• p = population proportion (always between 0 and 1)
• p-hat = p̂ = sample proportion
• q-hat = q̂ = 1 − p̂
• n = size of the sample
• x = number of successes in the sample. Sometimes we are given p-hat directly and sometimes we must calculate p-hat by using the simple formula p-hat = x / n. (e.g., if 12 out of 24 cars are Volkswagens, then the proportion of Volkswagens in the sample is x/n or 12/24 = 0.50)
• With two samples we mark these variables as 1 or 2, which designates which sample they came from
Notation for Proportions
• When comparing two population proportions, our test statistic requires us to calculate a “pooled estimate” of p1 and p2, which we call p-bar
• The formula for p-bar is as follows. You should note this formula (along with all test statistic formulas) since you must know it for the exam.

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}$$
Test Statistic for Comparing
Population Proportions

The “test statistic” for comparing two population proportions is calculated as follows (there is no Excel shortcut):

The formula requires:
• The sample proportion (p-hat) from each sample
• ALWAYS assume there is no difference between the population proportions. In other words, always assume p1 - p2 = 0 in the formula
• The pooled p estimate (p-bar) and q-bar calculated using the formulas in the previous slide
• The size of each of the two samples (n1 and n2)

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$$
Critical Value(s) for Comparing
Two Population Proportions


The “critical value(s)” for comparing two population proportions are found exactly the same way as any other z-values, using Excel function NORMSINV(probability).
We use HYPOTHESIS TESTING to compare two population proportions. The 3 possible hypotheses are:
1. H0: p1 = p2 and H1: p1 ≠ p2 (two-tailed test)
2. H0: p1 ≤ p2 and H1: p1 > p2 (right-tail test)
3. H0: p1 ≥ p2 and H1: p1 < p2 (left-tail test)
Desire for Marriage EXAMPLE
In a Time/CNN survey, 24% of 205 single women said that they
“definitely want to get married.” In the same survey, 27% of 260
single men gave that same response. Using α = 0.05 test the
claim that there is no difference between single men and single
women regarding their desire to get married.
H0: pmen = pwomen (who want to get married)
H1: pmen ≠ pwomen (who want to get married)
Desire for Marriage EXAMPLE
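α = 0.05, H0 : pm = pw and H1 : pm ≠ pw (2-tail test)
Step 1: write out the hypotheses, graph the problem, and find the critical value
Critical values: z = -1.96 and z = 1.96. Accept H0 if the test statistic falls between them (around z = 0); accept H1 if it falls in either tail.
Excel: to get the right-side critical value, change the negative left-side value to positive or use =NORMSINV(0.975)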
Desire for Marriage EXAMPLE

Next, calculate the test statistic using the formula:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$$

BUT, before we can use this test statistic formula we must first calculate pooled p (p-bar) and q-bar using the formulas we learned. We find pooled p = (x1+x2)/(n1+n2), where x1 = 0.24*205 ≈ 49 single women and x2 = 0.27*260 ≈ 70 single men (each rounded to a whole number of people), so pooled p = (49+70)/(205+260) = 0.256, and pooled q = 1-(pooled p) = 1-0.256 = 0.744
Desire for Marriage EXAMPLE

Now we can calculate the test statistic using the formula:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$$

Critical values: z = -1.96 and z = 1.96
Conclusion: Accept H0: pm=pw
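The transcript omits the computed z value, so this sketch carries the calculation through with the survey figures from the example. It treats single women as sample 1 and single men as sample 2 (matching x1 and x2 in the pooled-p step), rounds the success counts to whole people as the slides do, and assumes p1 - p2 = 0. The variable names are illustrative only.

```python
import math

n_women, n_men = 205, 260
p_hat_women, p_hat_men = 0.24, 0.27

# Recover the success counts from the reported percentages, rounded to whole people.
x_women = round(p_hat_women * n_women)  # 49
x_men = round(p_hat_men * n_men)        # 70

# Pooled estimate of the common proportion and its complement.
p_bar = (x_women + x_men) / (n_women + n_men)
q_bar = 1 - p_bar

# z test statistic, assuming (p1 - p2) = 0 under H0.
standard_error = math.sqrt(p_bar * q_bar / n_women + p_bar * q_bar / n_men)
z_stat = (p_hat_women - p_hat_men) / standard_error

z_critical = 1.96  # two-tail critical value for alpha = 0.05, from the slide
accept_h0 = abs(z_stat) < z_critical
print(round(p_bar, 3), round(z_stat, 2), accept_h0)
```

The statistic lands near -0.74, comfortably between -1.96 and 1.96, which agrees with the conclusion to accept H0. Swapping which group is called sample 1 only flips the sign.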
Desire for Marriage EXAMPLE

Conclusion: Accept the null hypothesis (H0) that the
population proportions are equal. In other words,
we find no difference between the proportion of
single men and the proportion of single women who
“definitely want to get married”.
Comparing population variances
or standard deviations
Using two samples to
compare population
variances
Assumptions
1. We have variances from two independent random samples
2. The two populations are each normally distributed
   • See Section 5, Hypothesis Testing for Variances and Standard Deviation, for a general review. However, note that in Section 5 we used the χ² (chi-square) distribution, whereas in this Section we use a similar, but new distribution called the “F” distribution
Notation for Variance and
Standard Deviation Testing
• s = standard deviation of sample
• σ = standard deviation of population
• s² = variance of sample
• σ² = variance of population
• Since we have two samples we are using to compare two populations, we also label these variables as “1” or “2”, with the larger sample variance/standard deviation ALWAYS labeled s1 and the smaller ALWAYS labeled s2.
F - distribution
Not symmetric

nonnegative values only

Use Excel function FINV to find the F critical value. All
one-tailed F tests are right-tailed tests and there are no
negative F-distribution values since the origin is at 0.
F–distribution
Finding Critical Value
• Probability = α for one-tailed tests and α/2 for two-tailed tests. FOR ALL F-TESTS (one or two-tailed), the right critical value is the ONLY VALUE we need to check and the only value returned by the Excel function FINV
• deg_freedom1 = the degrees of freedom for sample 1, which is ALWAYS the sample with the larger variance/standard deviation. Degrees of freedom is simply sample size minus 1 (df=n-1) for sample 1.
• deg_freedom2 = the degrees of freedom for sample 2, which is ALWAYS the sample with the smaller variance/standard deviation. Degrees of freedom is simply sample size minus 1 (df=n-1) for sample 2.
Critical Value(s) for Comparing Two
Population Variances or
Std. Deviations


The “critical value” for comparing two population variances or standard deviations is found using the F-distribution just described and the Excel function FINV(probability, df1, df2).
We use HYPOTHESIS TESTING to compare two population variances. The 3 possible hypotheses are:
1. H0: σ1 = σ2 and H1: σ1 ≠ σ2 (right-tail test with probability=α/2)
2. H0: σ1 ≤ σ2 and H1: σ1 > σ2 (right-tail test with probability=α)
3. H0: σ1 ≥ σ2 and H1: σ1 < σ2 (right-tail test with probability=α)
Note: whether we compare variances or standard deviations, the results will be the same. Thus, these hypotheses can be written either for variances or standard deviations (shown here).
F–distribution
Finding Critical Value
Two Examples
• F-critical value for α=0.05, one-tail test (left or right), where the sample with the larger variance has a sample size of 30 (df=n-1=30-1=29) and the sample with the smaller variance has a sample size of 50 (df=n-1=50-1=49)
• F-critical value for α=0.05, two-tail test, where the sample with the larger variance has a sample size of 45 (df=n-1=45-1=44) and the sample with the smaller variance has a sample size of 40 (df=n-1=40-1=39)
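The Excel results for these two lookups are not preserved in the transcript. As a way to reproduce such lookups without Excel, here is a sketch that imitates FINV using only the Python standard library. It relies on numerical integration (Simpson's rule) plus bisection, so it is an approximation rather than Excel's own method, and the names `f_pdf`, `f_right_tail`, and `finv` are made up for this illustration. It assumes df1 > 2 and a critical value below 50, which covers the sample sizes used in this section.

```python
import math

def f_pdf(x, df1, df2):
    """Density of the F distribution; this sketch assumes df1 > 2 so the density is 0 at x = 0."""
    if x <= 0:
        return 0.0
    log_norm = (math.lgamma((df1 + df2) / 2) - math.lgamma(df1 / 2) - math.lgamma(df2 / 2)
                + (df1 / 2) * math.log(df1 / df2))
    log_kernel = (df1 / 2 - 1) * math.log(x) - ((df1 + df2) / 2) * math.log(1 + df1 * x / df2)
    return math.exp(log_norm + log_kernel)

def f_right_tail(x, df1, df2, steps=2000):
    """P(F > x): one minus the Simpson's-rule area under the density from 0 to x."""
    h = x / steps
    total = f_pdf(0, df1, df2) + f_pdf(x, df1, df2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, df1, df2)
    return 1 - total * h / 3

def finv(probability, df1, df2):
    """Right-tail critical value: the x with P(F > x) = probability, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if f_right_tail(mid, df1, df2) > probability:
            lo = mid  # right tail still too large, so move further right
        else:
            hi = mid
    return (lo + hi) / 2

one_tail_example = finv(0.05, 29, 49)       # alpha for a one-tail test
two_tail_example = finv(0.05 / 2, 44, 39)   # alpha/2 for a two-tail test
print(round(one_tail_example, 2), round(two_tail_example, 2))
```

As a sanity check, calling `finv(0.025, 35, 35)` should land close to the F critical value of 1.96 reported in the Coke vs. Pepsi variance example in this section.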
Test Statistic for Comparing
Population Variances or
Standard Deviations

The “test statistic” for comparing two population variances or standard deviations is as follows:

The formula requires:
• The standard deviation or variance from each sample
• Note: if you are given sample variance (s²), do not square it again since variance is already the square of standard deviation. On the other hand, if given standard deviation (s) for each sample, make sure you square it as shown in the formula

$$F = \frac{s_1^2}{s_2^2}$$
Comparing Two Variances
• Accept H0 (no difference between population variances) if the F-test statistic is less than the F-critical value
• Accept H1 (there is a difference between population variances) if the F-test statistic is greater than the F-critical value
• Probability (area in the tail) equals alpha for ALL one-tailed tests (left or right) and alpha/2 for all two-tailed tests. This area or probability is the first entry in the Excel FINV function
[Figure: F distribution starting at 0, with the Accept H0 region to the left of the F-critical value (FINV) and the Accept H1 region to the right]
Discussion:
Comparing Population Variances
or Standard Deviations


If two samples have exactly equal variances, then the F test
statistic would be 1 (see formula – if numerator and
denominator are equal then F=1)
Likewise, if the samples come from populations with
dramatically different variances then we would expect the F
test statistic to be a large number (the larger variance is
always in the numerator, making big variance differences
result in relatively large numbers, much greater than 1)
Coke vs. Pepsi (weight)
Variance EXAMPLE
The Coke/Pepsi data set included in the text and listed on the class website includes
sample weights of regular Coke and Pepsi. Each sample contains 36 observations
(n1=36 and n2=36). The standard deviation for the weight of Coke is 0.007507 lbs.,
and the standard deviation for the weight of Pepsi is 0.005701 lbs. Use an alpha of
0.05 to test whether the variance in the weight of Coke and Pepsi is
significantly/statistically different. Note: since the variance in the weight of Coke
is greater than that of Pepsi, we must list it first and put it in the numerator (top)
of the F test statistic calculation.
H0: σ2coke = σ2pepsi (weight of cans)
H1: σ2coke ≠ σ2pepsi (weight of cans)
Note: this is “technically” a two-tailed test, but we only have the ability to find, and are only interested in, the right-side critical value
Coke vs. Pepsi (weight)
Variance EXAMPLE
Note: both samples are of the same size (36), so df1=36-1=35, and df2=36-1=35
Since this is “technically” a two-tailed test, the probability (area) equals alpha/2 = 0.05/2 = 0.025
F critical value = 1.96, from the Excel function FINV. Accept H0 if the F test statistic falls to the left of this value; accept H1 if it falls to the right.
Coke vs. Pepsi (weight)
Variance EXAMPLE
The F test statistic is calculated using the formula:

$$F = \frac{s_1^2}{s_2^2}$$

F critical value = 1.96
Conclusion: Since the F test statistic is less than the F critical value, we accept the null hypothesis (H0) that there is no statistically significant difference between the VARIANCES in the weight of Coke and Pepsi
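The value of the F test statistic is not shown in the transcript. The short sketch below computes it from the two standard deviations given in the problem, squaring each because the formula uses variances, and placing the larger one (Coke) in the numerator. The variable names are illustrative, and the critical value 1.96 is taken from the slide.

```python
s_coke = 0.007507   # larger standard deviation, so it becomes s1 (numerator)
s_pepsi = 0.005701  # smaller standard deviation, so it becomes s2 (denominator)

# Square the standard deviations to get variances before taking the ratio.
f_stat = s_coke**2 / s_pepsi**2

f_critical = 1.96  # FINV(0.025, 35, 35) as reported on the slide
accept_h0 = f_stat < f_critical
print(round(f_stat, 2), accept_h0)
```

The ratio is about 1.73, which is below 1.96, so the calculation supports accepting H0.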