Download Statistics for Managers Using Microsoft Excel, 4/e

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Statistics for Managers
Using Microsoft® Excel
4th Edition
Chapter 9
Two Sample Tests
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-1
Chapter Goals
After completing this chapter, you should be able to:
 Compare two independent samples with an
 F test for the difference in two variances
 t test for the difference in two means where  1
 2
 t test for the difference in two means where  1   2
 Test two means from related samples
 Complete a Z test for the difference between two
proportions
 Test for normality
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-2
Two Sample Tests
Two Sample Tests
Population
Variances
Population
Means,
Independent
Samples
Population
Means,
Related
Samples
Population
Proportions
Group 1 vs.
independent
Group 2
Same group
before vs. after
treatment
Proportion 1 vs.
Proportion 2
Examples:
Variance 1 vs.
Variance 2
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-3
The Process
There are two tests to determine if the means of two
populations are different. One test assumes the
variances of the populations are equal the other is
needed if the variances are not equal. Rather than
make assumptions about the variances, I prefer to
perform an F test for the equality of population
variances and use the results to determine which model
should be used. Therefore the F test comes first.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-4
The F Distribution
 Tests for the difference in variances for two
independent populations
 Parametric test procedure using s1 and s2
 Assumptions

Both populations are normally distributed

Test is not robust to this violation
 Samples are randomly and independently drawn
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-5
Hypothesis Tests for Variances
Tests for Two
Population
Variances
F test statistic
H0: σ12 = σ22
H1: σ12 ≠ σ22
H0: σ12  σ22
H1: σ12 < σ22
H0: σ12 ≤ σ22
H1: σ12 > σ22
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Two tailed test
Lower tail test
Upper tail test
Chap 9-6
Hypothesis Tests for Variances
(continued)
Tests for Two
Population
Variances
The F test statistic is:
2
1
2
2
S
F
S
F test statistic
S12 = Variance of Sample 1
n1 - 1 = numerator degrees of freedom
S22 = Variance of Sample 2
n2 - 1 = denominator degrees of freedom
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-7
F Test: An Example
You are a financial analyst for a brokerage firm. You
want to compare dividend yields between stocks listed
on the NYSE & NASDAQ. You collect the following data:
NYSE
NASDAQ
Number
21
25
Mean
3.27
2.53
Std dev
1.30
1.16
Is there a difference in the
variances between the NYSE
& NASDAQ at the  = 0.05 level?
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-8
F Test: Example Solution
(continued)
H0: σ12 = σ22
H1: σ12 ≠ σ22
 The test statistic is:
S12 1.302
F 2 
 1.256
2
S2 1.16
 p-value= .588846
 = .05
 The p-value is not < alpha, so we do not reject H0
 Conclusion: There is not sufficient evidence of a difference in the
variances of dividend yield for the two exchanges
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-9
F test in PhStat
 PHStat | Two-Sample Tests | F Test for
Differences in Two Variances …
 For some problems an incorrect decision is
reached. Ignore the decision and compare the
p-value with alpha.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-10
Hypothesis Tests for
Two Population Means
Two Population Means, Independent Samples
Lower tail test:
Upper tail test:
Two-tailed test:
H0: μ1  μ2
H1: μ1 < μ2
H0: μ1 ≤ μ2
H1: μ1 > μ2
H0: μ1 = μ2
H1: μ1 ≠ μ2
i.e.,
i.e.,
i.e.,
H0: μ1 – μ2  0
H1: μ1 – μ2 < 0
H0: μ1 – μ2 ≤ 0
H1: μ1 – μ2 > 0
H0: μ1 – μ2 = 0
H1: μ1 – μ2 ≠ 0
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-11
Difference Between Two Means
Population means,
independent
samples
σ1 and σ2 unknown
but equal
σ1 and σ2 unknown
but unequal
 Assumptions
 Samples randomly and
Independently Drawn
 Populations are normally distributed
or large sample sizes are used
 Use s to estimate unknown σ
 Use pooled variance t test
where 1=2 or
 individual variance t test
where 12 (proven by F test)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-12
σ1 and σ2 Unknown but Equal
(continued)
Population means,
independent
samples
If an F test does not prove the
variances to be unequal, we
can assume that they are
equal and pool them. The
pooled standard deviation is
σ1 and σ2 unknown
but equal
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Sp 
n1  1S12  n2  1S2 2
(n1  1)  (n2  1)
Chap 9-13
σ1 and σ2 Unknown but Equal
(continued)
Population means,
independent
samples
The test statistic for
μ1 – μ2 is:

X  X   μ  μ 
t
1
σ1 and σ2 unknown
but equal
2
1
2
1 1 
S   
 n1 n2 
2
p
Where t has (n1 + n2 – 2) d.f.,
and
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
S
2
p
2
2

n1  1S1  n2  1S2

(n1  1)  (n2  1)
Chap 9-14
Pooled Variance t Test: Example
You are a financial analyst for a brokerage firm. Is there a
difference in dividend yield between stocks listed on the
NYSE & NASDAQ? You collect the following data:
NYSE NASDAQ
Number
21
25
Sample mean
3.27
2.53
Sample std dev 1.30
1.16
Is there a difference in the mean
dividend yield between the
NYSE & NASDAQ ( = 0.05)?
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-15
Solution
F = 1.256
p = .5888
H0: μ1 = μ2
Do Not Reject
H1: μ1 ≠ μ2
2 = σ 2
σ
1
2
 = 0.05
df = 21 + 25 - 2 = 44
Pooled Test Statistic:
t
3.27  2.53
1 1 
1.5021   
 21 25 
 2.0397
p-value: .047407
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Reject H0
Reject H0
.025
.025
0
t
Decision:
Reject H0 at  = 0.05
Conclusion:
There is evidence of a
difference in mean dividend
yield for the two exchanges.
Chap 9-16
Pooled Variance t Test in
PhStat and Excel
 If the raw data is available
 Tools | Data Analysis | t-Test: Two Sample
Assuming Equal Variances
 If only summary statistics are available
 PHStat | Two-Sample Tests | t Test for Differences
in Two Means …
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-17
Confidence Interval,
σ1 and σ2 Unknown but Equal
Population means,
independent
samples
The confidence interval for
μ1 – μ2 is:
X  X   t
1
σ1 and σ2 known
σ1 and σ2 unknown
but equal
2
Where
S
2
p
n1 n2 -2
1 1 
S   
 n1 n2 
2
p
2
2

n1  1S1  n2  1S2

(n1  1)  (n2  1)
df=n1+n2-2
To get t use function
TINV(1-Confidence, df)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-18
The Unequal-Variance t Test
Population means,
independent
samples
σ1 and σ2 unknown
and unequal
(X 1  X 2)
t
2
2
s1 s2

n1 n2
( s12 / n1  s22 / n2 ) 2
d.f. 
 ( s12 / n1 ) 2 ( s22 / n2 ) 2 



n2  1 
 n1  1
p  TDIST (t , df , tails )
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-19
Unequal Variance t Test: Example
You are a financial analyst for a brokerage firm. Is there a
difference in dividend yield between stocks listed on the
NYSE & NASDAQ? You collect the following data:
NYSE NASDAQ
Number
21
25
Sample mean
3.27
2.53
Sample std dev 2.30
1.16
Is there a difference in the mean
dividend yield between the
NYSE & NASDAQ ( = 0.05)?
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-20
Unequal Variance Solution
Test Statistics:
H0: m1 = m2
H1: m1 m2
 = 0.05
df = 92
Reject H0
.025
F = 3.9313
p = .001794
Reject HO:  1
 2
t = 1.34
p =.1901
Reject H0
.025
Decision:
Fail to Reject at  = 0.05
Conclusion:
There is not sufficient evidence of a
0
t difference in mean dividend yields
.1901
for the two exchanges
Chap 9-21
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Unequal Variance t Test in
PhStat and Excel
 If the raw data is available
 Tools | Data Analysis | t-Test: Two Sample
Assuming Unequal Variances
 If only summary statistics are available
 Use the Excel spreadsheet downloaded from the
Homework web page
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-22
Confidence Interval,
σ1 and σ2 Unknown and Unequal
Population means,
independent
samples
σ1 and σ2 known
The confidence interval for
μ1 – μ2 is:

CI  X 1  X 2
σ1 and σ2 unknown
and unequal

2
1
2
2
s
s
t

n1 n2
df=n1+n2-2
To get t use function
TINV(1-Confidence, df)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-23
Related Samples
Tests Means of 2 Related Populations
Related
samples



Paired or matched samples
Repeated measures (before/after)
Use difference between paired values:
D = X1 - X2
 Eliminates Variation Among Subjects
 Assumptions:
 Both Populations Are Normally Distributed
 Or, if Not Normal, use large samples
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-24
Mean Difference, σD Unknown
(continued)
Paired
samples
If σD is unknown, we can estimate the unknown
population standard deviation with a sample
standard deviation. The test statistic for D is
now a t statistic, with n-1 d.f.:
D  μD
t
SD
n
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
n
SD 
2
(D

D
)
 i
i1
n 1
Chap 9-25
Paired Samples Example
 Assume you send your salespeople to a “customer
service” training workshop to reduce complaints. Is
the training effective? You collect the following data:
Number of Complaints:
(2) - (1)
Salesperson Before (1) After (2)
Difference, Di
C.B.
T.F.
M.H.
R.K.
M.O.
6
20
3
0
4
4
6
2
0
0
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
- 2
-14
- 1
0
- 4
-21
D =
 Di
n
= -4.2
SD 
2
(D

D
)
 i
n 1
 5.67
Chap 9-26
Paired Samples: Solution
 Has the training made a difference in the number of
complaints (at the 0.05 level)?
H0: μD = 0
H1: μD  0
 = .01
D = - 4.2
Reject
/2
/2
-1.66
d.f. = n - 1 = 4
Decision: Do not reject H0
Test Statistic:
D  μD  4.2  0
t

 1.66
SD / n 5.67/ 5
p-value=
Reject
.1730
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
(p-value is not <.05)
Conclusion: There is not
sufficient evidence of a change
in the number of complaints.
Chap 9-27
Related Sample t Test in
PhStat and Excel
 If the raw data is available
 Tools | Data Analysis | t-Test: Paired Two Sample
for Means
 If only summary statistics are available
 No test available
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-28
Confidence Interval,
Paired Samples
Paired
samples
The confidence interval for D
D  tn1
To get t use function
TINV(1-Confidence, df)
df=n-1
SD
n
n
SD 
 (D  D)
i1
2
i
n 1
Or use PhStat | Confidence Intervals | Estimate for the Mean,
sigma unknown - Select the differences as the data
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-29
Two Population Proportions
Population
proportions
Goal: test a hypothesis or form a
confidence interval for the difference
between two population proportions,
p1 – p2
Assumptions:
Independent and randomly drawn samples
n1p1  5 , n1(1-p1)  5
n2p2  5 , n2(1-p2)  5
The point estimate for
the difference is
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
ps1  ps2
Chap 9-30
Hypothesis Tests for
Two Population Proportions
Population proportions
Lower tail test:
Upper tail test:
Two-tailed test:
H0: p1  p2
HA: p1 < p2
H0: p1 ≤ p2
HA: p1 > p2
H0: p1 = p2
HA: p1 ≠ p2
i.e.,
i.e.,
i.e.,
H0: p1 – p2  0
HA: p1 – p2 < 0
H0: p1 – p2 ≤ 0
HA: p1 – p2 > 0
H0: p1 – p2 = 0
HA: p1 – p2 ≠ 0
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-31
Two Population Proportions
Population
proportions
Since we begin by assuming the null
hypothesis is true, we assume p1 = p2
and pool the two ps estimates
The pooled estimate for the
overall proportion is:
X1  X2
p
n1  n2
where X1 and X2 are the numbers from
samples 1 and 2 with the characteristic
of interest
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-32
Two Population Proportions
(continued)
The test statistic for
p1 – p2 is a Z statistic:
Population
proportions

p
Z
where
p
s1

 p s2   p1  p 2 
1 1
p (1  p)   
 n1 n2 
X1  X2
X
X
, p s1  1 , p s2  2
n1  n2
n1
n2
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-33
Example:
Two population Proportions
Is there a significant difference between the
proportion of men and the proportion of
women who will vote Yes on Proposition A?
 In a random sample, 36 of 72 men and 31 of
50 women indicated they would vote Yes
 Test for a significant difference at the .05 level
of significance
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-34
Check the Assumptions
The sample proportions are:
 Men:
ps1 = 36/72 = .50
 Women: ps2 = 31/50 = .62
Assumptions
Men np=72(.5)=36 n(1-p)=72(.5)=36
Women np=50(.62)=31 n(1-p)=19
All values > 5
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-35
Example:
Two population Proportions
(continued)
 The hypothesis test is:
H0: p1 = p2 (the two proportions are equal)
H1: p1  p2 (there is a significant difference between proportions)
 The sample proportions are:
 Men:
ps1 = 36/72 = .50
 Women:
ps2 = 31/50 = .62
 The pooled estimate for the overall proportion is:
X1  X 2 36  31 67
p


 .549
n1  n2
72  50 122
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-36
Example:
Two population Proportions
(continued)
The test statistic for p1 = p2 is:
z

p
s1

 p s2   p1  p 2 
1
1
p (1  p)   
 n1 n2 
 .50  .62    0
1 
 1
.549 (1  .549) 


72
50


Reject H0
Reject H0
.025
.025
-1.96
  1.31
p-value = .1902
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
1.96
Decision: Do not reject H0
Conclusion: There is not
significant evidence of a
difference in the proportions
of men and women who will
vote yes on Proposition A.
Chap 9-37
Confidence Interval for
Two Population Proportions
The confidence interval for
p1 – p2 is:
Population
proportions
p
s1

 p s2  Z
p s1 (1  p s1 )
n1

p s2 (1  p s2 )
n2
To get Z use function NORMSINV(two tail)
where two tail=Confidence+(1-Confidence)/2
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-38
Two Sample Tests in PHStat
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-39
Sample PHStat Output
Input
Output
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-40
Sample PHStat Output
(continued)
Input
Output
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-41
Chapter Summary
Compared two independent samples
 Performed F tests for the difference between two population
variances
 Performed t tests for the differences in two means for equal and
unequal population variances
 Formed confidence intervals for the differences between two means
 Compared two related samples (paired samples)
 Performed paired sample t tests for the mean difference
 Formed confidence intervals for the paired difference
 Compared two population proportions
 Performed Z-test for two population proportions
 Formed confidence intervals for the difference between two population
proportions
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc.
Chap 9-42