Inference About One Population
1. Introduction
• In this chapter we utilize the approach developed
before to describe a population.
– Identify the parameter to be estimated or tested.
– Specify the parameter’s estimator and its sampling
distribution.
– Construct a confidence interval estimator or perform
a hypothesis test.
• We shall develop techniques to estimate and
test three population parameters.
– Population mean $\mu$
– Population variance $\sigma^2$
– Population proportion $p$
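In this chapter we introduce:
1. Hypothesis testing for a population mean
2. Hypothesis testing for a population variance
3. Hypothesis testing for a population proportion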
[Flowchart: choosing the technique]
Problem objective?
• Describe a population. Data type?
– Nominal: z-test and estimator of $p$
– Interval. Type of descriptive measurement?
  – Central location: t-test and estimator of $\mu$
  – Variability: $\chi^2$-test and estimator of $\sigma^2$
• Compare two populations. Data type?
– Nominal: z-test and estimator of $p_1 - p_2$
– Interval. Type of descriptive measurement?
  – Central location. Experimental design?
    – Matched pairs: t-test and estimator of $\mu_D$
    – Independent samples. Population variances?
      – Equal: t-test and estimator of $\mu_1 - \mu_2$ (equal variances)
      – Unequal: t-test and estimator of $\mu_1 - \mu_2$ (unequal variances)
  – Variability: F-test and estimator of $\sigma_1^2 / \sigma_2^2$
• Compare two or more populations. Data type?
– Interval. Experimental design?
  – Independent samples. Population distribution?
    – Normal: one-way or two-factor ANOVA
    – Nonnormal: Kruskal-Wallis test
  – Blocks. Population distribution?
    – Normal: ANOVA (randomized blocks)
    – Nonnormal: Friedman test
– Ordinal. Experimental design?
  – Independent samples: Kruskal-Wallis test
  – Blocks: Friedman test
– Nominal: $\chi^2$-test of a contingency table
2. Inference About a Population Mean When the Population Standard Deviation Is Unknown
Recall that when $\sigma$ is known we use the following statistic to estimate and test a population mean:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
When $\sigma$ is unknown, we use its point estimator $s$, and the z-statistic is replaced by the t-statistic.
The t-Statistic
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \quad \longrightarrow \quad t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
When the sampled population is normally distributed, the t statistic is Student t distributed.
The t-Statistic
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
The t distribution is mound-shaped and symmetrical around zero.
The "degrees of freedom" (a function of the sample size) determine how spread out the distribution is compared to the normal distribution.
[Figure: two t density curves centered at 0, one with d.f. = v1 and one with d.f. = v2, where v1 < v2.]
Testing $\mu$ when $\sigma$ is unknown
• Example 1. Productivity of newly hired trainees
– In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied.
– It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
– Can we conclude that this belief is correct, based on productivity observations of 50 trainees?
Testing $\mu$ when $\sigma$ is unknown
• Example 1. – Solution
– The problem objective is to describe the population of the number of packages processed in one hour.
– The data are interval.
H0: $\mu = 450$
H1: $\mu > 450$
– The t statistic
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad \text{d.f.} = n - 1 = 49$$
Testing $\mu$ when $\sigma$ is unknown
• Solution continued (solving by hand)
– The rejection region is $t > t_{\alpha, n-1}$, where
$$t_{\alpha, n-1} = t_{.05, 49} \approx t_{.05, 50} = 1.676.$$
– From the data we have $\sum x_i = 23{,}019$ and $\sum x_i^2 = 10{,}671{,}357$, thus
$$\bar{x} = \frac{23{,}019}{50} = 460.38,$$
$$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = 1507.55,$$
$$s = \sqrt{1507.55} = 38.83.$$
Testing $\mu$ when $\sigma$ is unknown
• The test statistic is
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{460.38 - 450}{38.83 / \sqrt{50}} = 1.89$$
[Figure: t distribution with the rejection region to the right of 1.676; the observed value 1.89 falls inside it.]
• Since 1.89 > 1.676 we reject the null hypothesis in favor of the alternative.
• There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages per hour at the .05 significance level.
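The hand calculation for Example 1 can be retraced with a short script. This is a minimal sketch using only the sums quoted on the slides; the function name `one_sample_t` is hypothetical, and the critical value 1.676 is copied from the t table rather than computed.

```python
import math

def one_sample_t(sum_x, sum_x2, n, mu0):
    """Return (mean, sample std dev, t statistic) computed from raw sums."""
    mean = sum_x / n
    # Shortcut formula for the sample variance: (sum x^2 - (sum x)^2 / n) / (n - 1)
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    std_dev = math.sqrt(variance)
    t_stat = (mean - mu0) / (std_dev / math.sqrt(n))
    return mean, std_dev, t_stat

# Sums from Example 1 (50 trainees), testing H0: mu = 450 against H1: mu > 450.
x_bar, s, t_value = one_sample_t(23_019, 10_671_357, 50, 450)

T_CRITICAL = 1.676  # t_{.05,50} from the table, used to approximate t_{.05,49}
reject_h0 = t_value > T_CRITICAL

print(round(x_bar, 2), round(s, 2), round(t_value, 2), reject_h0)
```

Working from the sums rather than the raw observations mirrors the slide, since the 50 individual observations are not listed.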
Testing $\mu$ when $\sigma$ is unknown
• Since the p-value .0323 < .05, we reject the null hypothesis in favor of the alternative.
• There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages per hour at the .05 significance level.
[Figure: upper tail of the t distribution, comparing the p-value .0323 with the significance level .05.]
Estimating $\mu$ when $\sigma$ is unknown
• Confidence interval estimator of $\mu$ when $\sigma$ is unknown
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}, \qquad \text{d.f.} = n - 1$$
t Distribution
[Figure: density curves centered at 0 (horizontal axis: z, t) for the standard normal distribution, the t distribution with 20 degrees of freedom, and the t distribution with 10 degrees of freedom.]
t Distribution Table
• For Areas in the Upper Tail
19
t Distribution
For more than 100 degrees of freedom, the standard normal z value provides a good approximation to the t value.
The standard normal z values can be found in the infinite degrees ($\infty$) row of the t distribution table.
t Distribution
Degrees        Area in Upper Tail
of Freedom     .20     .10     .05     .025    .01     .005
.              .       .       .       .       .       .
50             .849    1.299   1.676   2.009   2.403   2.678
60             .848    1.296   1.671   2.000   2.390   2.660
80             .846    1.292   1.664   1.990   2.374   2.639
100            .845    1.290   1.660   1.984   2.364   2.626
∞              .842    1.282   1.645   1.960   2.326   2.576
(The ∞ row gives the standard normal z values.)
Interval Estimation of a Population Mean: $\sigma$ Unknown
• Interval Estimate
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
where: $1 - \alpha$ = the confidence coefficient
$t_{\alpha/2}$ = the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with $n - 1$ degrees of freedom
$s$ = the sample standard deviation
Interval Estimation of a Population Mean: $\sigma$ Unknown
At 95% confidence, $\alpha = .05$ and $\alpha/2 = .025$.
$t_{.025}$ is based on $n - 1 = 50 - 1 = 49$ degrees of freedom.
In the t distribution table we see that $t_{.025} = 2.009$.
$$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} \approx 460.38 \pm 2.009 \frac{38.83}{\sqrt{50}} = (449.35,\ 471.41)$$
Summary of Interval Estimation Procedures for a Population Mean
Can the population standard deviation $\sigma$ be assumed known?
• Yes ($\sigma$ known case): use
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
• No ($\sigma$ unknown case): use the sample standard deviation $s$ to estimate $\sigma$, and use
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
Estimating $\mu$ when $\sigma$ is unknown
• Example 2.
– An investor is trying to estimate the return on investment in companies that won quality awards last year.
– A random sample of 83 such companies is selected, and the return the investor would have earned by investing in each of them is calculated.
– Construct a 95% confidence interval for the mean return.
Estimating $\mu$ when $\sigma$ is unknown
• Solution (solving by hand)
– The problem objective is to describe the population of annual returns from buying shares of quality award-winners.
– The data are interval.
– From the data we determine
$$\bar{x} = 15.02, \qquad s^2 = 68.98, \qquad s = \sqrt{68.98} = 8.31$$
– With $t_{.025, 82} \approx t_{.025, 80} = 1.990$, the interval is
$$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} \approx 15.02 \pm 1.990 \frac{8.31}{\sqrt{83}} = (13.20,\ 16.83)$$
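The interval estimator can be applied to both examples with one small helper. This is a sketch, not a definitive implementation: the function name `t_interval` is hypothetical, and the critical values 2.009 and 1.990 are the table approximations used on the slides.

```python
import math

def t_interval(mean, std_dev, n, t_critical):
    """Return (lower, upper) limits of mean +/- t * s / sqrt(n)."""
    half_width = t_critical * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width

# Example 1: trainee productivity, 95% confidence, t_{.025,50} = 2.009.
ex1_low, ex1_high = t_interval(460.38, 38.83, 50, 2.009)

# Example 2: returns of quality-award winners, 95% confidence, t_{.025,80} = 1.990.
ex2_low, ex2_high = t_interval(15.02, 8.31, 83, 1.990)

print(round(ex1_low, 2), round(ex1_high, 2))
print(round(ex2_low, 2), round(ex2_high, 2))
```

The limits may differ from the slide values in the last printed digit because the slides round intermediate results.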
Checking the required conditions
• We need to check that the population is normally
distributed, or at least not extremely nonnormal.
• There are statistical methods to test for normality
(one to be introduced later in the book).
• From the sample histograms we see…
27
[Figure: "A Histogram for Example 1", frequency versus Packages (bins from 400 to 575, plus "More").]
[Figure: "A Histogram for Example 2", frequency versus Returns (bins from -4 to 30, plus "More").]
3. Inference About a Population Variance
• Sometimes we are interested in making inference
about the variability of processes.
• Examples:
– The consistency of a production process for quality
control purposes.
– Investors use variance as a measure of risk.
• To draw inference about variability, the parameter of interest is $\sigma^2$.
3. Inference About a Population Variance
• The sample variance $s^2$ is an unbiased, consistent and efficient point estimator for $\sigma^2$.
• The statistic $(n-1)s^2 / \sigma^2$ has a distribution called chi-squared, if the population is normally distributed:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}, \qquad \text{d.f.} = n - 1$$
[Figure: chi-squared density curves for d.f. = 5 and d.f. = 10.]
Testing and Estimating a Population Variance
• From the following probability statement
$$P\left(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}\right) = 1 - \alpha$$
we have (by substituting $\chi^2 = (n-1)s^2 / \sigma^2$)
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$
Chi-Square Distribution
• A Chi-Square Distribution with 19 Degrees of Freedom
Chi-Square Distribution
• Selected Values from the Chi-Square Distribution Table
Interval Estimation of s2
2
c.975

(n  1)s 2
.025
.025
95% of the
possible c2 values
0
2
c .975
s2
2
 c .025
2
c .025
c2
35
Interval Estimation of s2
• There is a (1 – a) probability of obtaining a c2
value such that
c (12 a / 2)  c 2  ca2 / 2
• Substituting (n – 1)s2/s2 for the c2 we get
c
2
(1a / 2)

(n  1) s 2
s2
 ca2 / 2
• Performing algebraic manipulation we get
(n  1) s 2
c a2 / 2
s2 
(n  1) s 2
c (21a / 2)
36
Interval Estimation of s2
• Interval Estimate of a Population Variance
( n  1) s 2
c a2 / 2
 s2 
( n  1) s 2
c 2(1 a / 2)
where the c2 values are based on a chi-square
distribution with n - 1 degrees of freedom and
where 1 - a is the confidence coefficient.
37
Interval Estimation of $\sigma$
• Interval Estimate of a Population Standard Deviation
Taking the square root of the upper and lower limits of the variance interval provides the confidence interval for the population standard deviation.
$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$$
Testing the Population Variance
• Example 3. (operation management application)
– A container-filling machine is believed to fill 1 liter
containers so consistently, that the variance of the
filling will be less than 1 cc (.001 liter).
– To test this belief a random sample of 25 1-liter fills
was taken, and the results recorded.
– Do these data support the belief that the variance is
less than 1cc at 5% significance level?
39
Testing the Population Variance
• Solution
– The problem objective is to describe the population of 1-liter fills from a filling machine.
– The data are interval, and we are interested in the variability of the fills.
– The complete test is:
H0: $\sigma^2 = 1$
H1: $\sigma^2 < 1$
The test statistic is
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
The rejection region is $\chi^2 < \chi^2_{1-\alpha, n-1}$.
Testing the Population Variance
• Solving by hand
– Note that $(n-1)s^2 = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \left(\sum x_i\right)^2 / n$
– From the sample we can calculate $\sum x_i = 24{,}996.4$ and $\sum x_i^2 = 24{,}992{,}821.3$
– Then $(n-1)s^2 = 24{,}992{,}821.3 - (24{,}996.4)^2 / 25 = 20.78$
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{20.78}{1^2} = 20.78, \qquad \chi^2_{1-\alpha, n-1} = \chi^2_{.95, 25-1} = 13.8484$$
Since 13.8484 < 20.78, the test statistic does not fall in the rejection region, so we do not reject the null hypothesis.
There is insufficient evidence to infer that the variance is less than 1.
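Testing the Population Variance
[Figure: chi-squared distribution with $\alpha = .05$ and $1 - \alpha = .95$; the rejection region is $\chi^2 < \chi^2_{.95, 25-1} = 13.8484$, and the observed value 20.8 lies outside it.]
Do not reject the null hypothesis.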
Estimating the Population Variance
• Example 3.1.
– Estimate the variance of fills in Example 3 data with 99% confidence.
• Solution
– We have $(n-1)s^2 = 20.78$.
From the chi-squared table we have
$\chi^2_{\alpha/2, n-1} = \chi^2_{.005, 24} = 45.5585$
$\chi^2_{1-\alpha/2, n-1} = \chi^2_{.995, 24} = 9.88623$
Estimating the Population Variance
• The confidence interval estimate is
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$
$$\frac{20.78}{45.5585} < \sigma^2 < \frac{20.78}{9.88623}$$
$$.46 < \sigma^2 < 2.10$$
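The variance test of Example 3 and the interval of Example 3.1 both start from the same sum of squares, so they can be checked together. This is a minimal sketch: the function name `sum_of_squares` is hypothetical, and the three chi-squared critical values are copied from the slides rather than computed.

```python
def sum_of_squares(sum_x, sum_x2, n):
    """Return (n - 1) * s^2 using the shortcut formula sum x^2 - (sum x)^2 / n."""
    return sum_x2 - sum_x ** 2 / n

# Example 3: 25 one-liter fills, H0: sigma^2 = 1 against H1: sigma^2 < 1.
ss = sum_of_squares(24_996.4, 24_992_821.3, 25)
chi2_stat = ss / 1.0           # divide by the hypothesized variance
CHI2_95_24 = 13.8484           # table value chi^2_{.95,24}
reject_h0 = chi2_stat < CHI2_95_24

# Example 3.1: 99% confidence interval for sigma^2.
CHI2_005_24 = 45.5585          # table value chi^2_{.005,24}
CHI2_995_24 = 9.88623          # table value chi^2_{.995,24}
var_low = ss / CHI2_005_24
var_high = ss / CHI2_995_24

print(round(chi2_stat, 2), reject_h0, round(var_low, 2), round(var_high, 2))
```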
4. Inference About a Population
Proportion
• When the population consists of nominal data, the
only inference we can make is about the
proportion of occurrence of a certain value.
• The parameter p was used before to calculate
these probabilities under the binomial distribution.
45
4. Inference About a Population Proportion
• Statistic and sampling distribution
– The statistic used when making inference about p is
$$\hat{p} = \frac{x}{n}$$
where x = the number of successes and n = the sample size.
– Under certain conditions [np > 5 and n(1 - p) > 5], $\hat{p}$ is approximately normally distributed, with $\mu = p$ and $\sigma^2 = p(1 - p)/n$.
Testing and Estimating the Proportion
• Test statistic for p
$$Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}}$$
where $np \ge 5$ and $n(1 - p) \ge 5$
• Interval estimator for p ($1 - \alpha$ confidence level)
$$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n}$$
provided $n\hat{p} \ge 5$ and $n(1 - \hat{p}) \ge 5$
Testing the Proportion
• Example 4. (Predicting the winner on election day)
– Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day.
– Based on the data presented, where 1 = Democrat and 2 = Republican, can the network conclude that the Republican candidate will win the state college vote?
Testing the Proportion
• Solution
– The problem objective is to describe the population
of votes in the state.
– The data are nominal.
– The parameter to be tested is ‘p’.
– Success is defined as “Vote republican”.
– The hypotheses are:
H0: p = .5
H1: p > .5
More than 50% vote Republican
49
Testing the Proportion
– Solving by hand
• The rejection region is $z > z_{\alpha} = z_{.05} = 1.645$.
• From the file we count 407 successes. The number of voters participating is 765.
• The sample proportion is $\hat{p} = 407/765 = .532$
• The value of the test statistic is
$$Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} = \frac{.532 - .5}{\sqrt{.5(1 - .5)/765}} = 1.77$$
• The p-value is P(Z > 1.77) = .038
Testing the Proportion
There is sufficient evidence to reject the null hypothesis
in favor of the alternative hypothesis. At 5% significance
level we can conclude that more than 50% voted Republican.
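The exit-poll calculation can be reproduced with the standard library alone. This is a sketch under the normal approximation stated earlier; the function name `proportion_z_test` is hypothetical, and the upper-tail probability is obtained from `math.erfc` instead of a z table.

```python
import math

def proportion_z_test(successes, n, p0):
    """Return (sample proportion, z statistic, upper-tail p-value) for H1: p > p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    # P(Z > z) for a standard normal variable, via the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_hat, z, p_value

# Example 4: 407 Republican votes among 765 voters, H0: p = .5 against H1: p > .5.
p_hat, z_stat, p_value = proportion_z_test(407, 765, 0.5)
reject_h0 = z_stat > 1.645  # z_{.05}

print(round(p_hat, 3), round(z_stat, 2), round(p_value, 3), reject_h0)
```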
51
Selecting the Sample Size to Estimate the Proportion
• Recall: The confidence interval for the proportion is
$$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n}$$
• Thus, to estimate the proportion to within W, we can write
$$W = z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n}$$
Selecting the Sample Size to Estimate the Proportion
• The required sample size is
$$n = \left( \frac{z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p})}}{W} \right)^2$$
Sample Size to Estimate the Proportion
• Example
– Suppose we want to estimate the proportion of customers who prefer our company's brand to within .03 with 95% confidence.
– Find the sample size.
– Solution
W = .03; $1 - \alpha = .95$, therefore $\alpha/2 = .025$, so $z_{.025} = 1.96$
$$n = \left( \frac{1.96 \sqrt{\hat{p}(1 - \hat{p})}}{.03} \right)^2$$
Since the sample has not yet been taken, the sample proportion is still unknown.
We proceed using either one of the following two methods:
Sample Size to Estimate the Proportion
• Method 1:
– There is no knowledge about the value of $\hat{p}$
• Let $\hat{p} = .5$. This results in the largest possible n needed for a $1 - \alpha$ confidence interval of the form $\hat{p} \pm .03$.
$$n = \left( \frac{1.96 \sqrt{.5(1 - .5)}}{.03} \right)^2 = 1{,}068$$
• If the sample proportion does not equal .5, the actual W will be narrower than .03 with the n obtained by this formula.
• Method 2:
– There is some idea about the value of $\hat{p}$
• Use the value of $\hat{p}$ to calculate the sample size. With $\hat{p} = .2$:
$$n = \left( \frac{1.96 \sqrt{.2(1 - .2)}}{.03} \right)^2 = 683$$
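Both methods use the same formula with a different planning value for the proportion. The sketch below assumes the result is rounded up to the next whole observation, which matches the 1,068 and 683 shown on the slide; the function name `sample_size_for_proportion` is hypothetical.

```python
import math

def sample_size_for_proportion(p_guess, width, z):
    """Smallest n for which z * sqrt(p(1 - p) / n) does not exceed width."""
    return math.ceil((z * math.sqrt(p_guess * (1 - p_guess)) / width) ** 2)

Z_025 = 1.96  # z value for 95% confidence

n_method1 = sample_size_for_proportion(0.5, 0.03, Z_025)  # no prior knowledge
n_method2 = sample_size_for_proportion(0.2, 0.03, Z_025)  # prior idea that p is about .2

print(n_method1, n_method2)
```

Because p(1 - p) is largest at p = .5, Method 1 always gives the more conservative (larger) sample size.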
Inference about Two Populations
1. Introduction
• A variety of techniques are presented whose objective is to compare two populations.
• We are interested in:
– The difference between two means.
– The ratio of two variances.
– The difference between two proportions.
2. Inference about the Difference between Two Means: Independent Samples
• Two random samples are drawn from the two populations of interest.
• Because we compare two population means, we use the statistic $\bar{x}_1 - \bar{x}_2$.
The Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
1. $\bar{x}_1 - \bar{x}_2$ is normally distributed if the (original) population distributions are normal.
2. $\bar{x}_1 - \bar{x}_2$ is approximately normally distributed if the (original) populations are not normal, but the sample sizes are sufficiently large (greater than 30).
3. The expected value of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$.
4. The variance of $\bar{x}_1 - \bar{x}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$.
Making an inference about $\mu_1 - \mu_2$
• If the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is normal or approximately normal we can write:
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
• Z can be used to build a test statistic or a confidence interval for $\mu_1 - \mu_2$.
Making an inference about $\mu_1 - \mu_2$
• Practically, the "Z" statistic is hardly used, because the population variances are not known.
• Instead, we construct a t statistic using the sample variances ($s_1^2$ and $s_2^2$) in place of the unknown $\sigma_1^2$ and $\sigma_2^2$:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Making an inference about $\mu_1 - \mu_2$
• Two cases are considered when producing the t-statistic.
– The two unknown population variances are equal.
– The two unknown population variances are not equal.
Inference about $\mu_1 - \mu_2$: Equal variances
• Calculate the pooled variance estimate by:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
[Figure: the pooled variance estimator, combining $s_1^2$ ($n_1 = 10$) and $s_2^2$ ($n_2 = 15$).]
Example: $s_1^2 = 25$; $s_2^2 = 30$; $n_1 = 10$; $n_2 = 15$. Then,
$$s_p^2 = \frac{(10 - 1)(25) + (15 - 1)(30)}{10 + 15 - 2} = 28.04347$$
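The pooled estimate is a weighted average of the two sample variances, with weights equal to their degrees of freedom. A minimal sketch of the slide's numeric example follows; the function name `pooled_variance` is hypothetical.

```python
def pooled_variance(var1, n1, var2, n2):
    """Weighted average of two sample variances, weighted by degrees of freedom."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Slide example: s1^2 = 25 with n1 = 10, s2^2 = 30 with n2 = 15.
sp2_example = pooled_variance(25, 10, 30, 15)

print(round(sp2_example, 5))
```

The larger sample pulls the pooled value toward its own variance, which is why the result is closer to 30 than to 25.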
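Inference about $\mu_1 - \mu_2$: Equal variances
• Construct the t-statistic as follows:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \qquad \text{d.f.} = n_1 + n_2 - 2$$
• Perform a hypothesis test
H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 > 0$ or $< 0$ or $\ne 0$
• Build a confidence interval
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
where $1 - \alpha$ is the confidence level.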
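Inference about $\mu_1 - \mu_2$: Unequal variances
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
$$\text{d.f.} = \frac{\left( s_1^2/n_1 + s_2^2/n_2 \right)^2}{\frac{\left( s_1^2/n_1 \right)^2}{n_1 - 1} + \frac{\left( s_2^2/n_2 \right)^2}{n_2 - 1}}$$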
Inference about $\mu_1 - \mu_2$: Unequal variances
Conduct a hypothesis test as needed, or build a confidence interval.
Confidence interval:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where $1 - \alpha$ is the confidence level.
Example: Making an inference about m1 – m2
• Example 1.
– Do people who eat high-fiber cereal for
breakfast consume, on average, fewer
calories for lunch than people who do not eat
high-fiber cereal for breakfast?
– A sample of 150 people was randomly drawn.
Each person was identified as a consumer or
a non-consumer of high-fiber cereal.
– For each person the number of calories
consumed at lunch was recorded.
70
Example: Making an inference about $\mu_1 - \mu_2$
Data (calories consumed at lunch, partial listing):
Consumers: 568, 498, 589, 681, 540, 646, 636, 739, 539, 596, 607, 529, 637, 617, 633, 555, ...
Non-consumers: 705, 819, 706, 509, 613, 582, 601, 608, 787, 573, 428, 754, 741, 628, 537, 748, ...
Solution:
• The data are interval.
• The parameter to be tested is the difference between two means.
• The claim to be tested is: The mean caloric intake of consumers ($\mu_1$) is less than that of non-consumers ($\mu_2$).
Example: Making an inference about $\mu_1 - \mu_2$
• The hypotheses are:
H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 < 0$
– To check whether the population variances are equal, we use computer output to find the sample variances.
We have $s_1^2 = 4103$ and $s_2^2 = 10{,}670$.
– It appears that the variances are unequal.
Example: Making an inference about $\mu_1 - \mu_2$
• Compute: Manually
– From the data we have:
$$\bar{x}_1 = 604.02, \quad \bar{x}_2 = 633.23, \quad s_1^2 = 4{,}103, \quad s_2^2 = 10{,}670$$
$$\text{d.f.} = \frac{\left( 4103/43 + 10670/107 \right)^2}{\frac{(4103/43)^2}{43 - 1} + \frac{(10670/107)^2}{107 - 1}} = 122.6 \approx 123$$
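Example: Making an inference about $\mu_1 - \mu_2$
• Compute: Manually
– The rejection region is $t < -t_{\alpha, \nu} = -t_{.05, 123} \approx -1.658$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(604.02 - 633.23) - 0}{\sqrt{\frac{4103}{43} + \frac{10670}{107}}} = -2.091$$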
Example: Making an inference about m1 – m2
At the 5% significance level there is sufficient evidence to
reject the null hypothesis.
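The unequal-variances statistic and its degrees of freedom can be checked from the summary figures quoted for this example. This is a sketch only: the function name `unequal_variances_t` is hypothetical, and the critical value 1.658 is the table approximation used on the slide.

```python
import math

def unequal_variances_t(mean1, var1, n1, mean2, var2, n2, delta0=0.0):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    v1 = var1 / n1  # estimated variance of the first sample mean
    v2 = var2 / n2  # estimated variance of the second sample mean
    t_stat = (mean1 - mean2 - delta0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

# Example 1: consumers (sample 1) versus non-consumers (sample 2) of high-fiber cereal.
t_cereal, df_cereal = unequal_variances_t(604.02, 4103, 43, 633.23, 10670, 107)
reject_h0 = t_cereal < -1.658  # left-tail test, -t_{.05,123} approximated from the table

print(round(t_cereal, 3), round(df_cereal, 1), reject_h0)
```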
75
Example: Making an inference about $\mu_1 - \mu_2$
• Compute: Manually
The confidence interval estimator for the difference between two means is
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (604.02 - 633.23) \pm 1.9794 \sqrt{\frac{4103}{43} + \frac{10670}{107}} = -29.21 \pm 27.65 = [-56.86,\ -1.56]$$
Example: Making an inference about $\mu_1 - \mu_2$
• Example 2.
– An ergonomic chair can be assembled using two different sets of operations (Method A and Method B).
– The operations manager would like to know whether the assembly times under the two methods differ.
– Two samples are randomly and independently selected:
• A sample of 25 workers assembled the chair using Method A.
• A sample of 25 workers assembled the chair using Method B.
• The assembly times were recorded.
– Do the assembly times of the two methods differ?
Example: Making an inference about $\mu_1 - \mu_2$
Assembly times in minutes (partial listing)
Method A    Method B
6.8         5.2
5.0         6.7
7.9         5.7
5.2         6.6
7.6         8.5
5.0         6.5
5.9         5.9
5.2         6.7
6.5         6.6
.           .
Solution
• The data are interval.
• The parameter of interest is the difference between two population means.
• The claim to be tested is whether a difference between the two methods exists.
Example: Making an inference about $\mu_1 - \mu_2$
• Compute: Manually
– The hypotheses are:
H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 \ne 0$
– To check whether the two unknown population variances are equal we calculate $s_1^2$ and $s_2^2$.
– We have $s_1^2 = 0.8478$ and $s_2^2 = 1.3031$.
– The two population variances appear to be equal.
Example: Making an inference about $\mu_1 - \mu_2$
• Compute: Manually
– To calculate the t-statistic we have:
$$\bar{x}_1 = 6.288, \quad \bar{x}_2 = 6.016, \quad s_1^2 = 0.8478, \quad s_2^2 = 1.3031$$
$$s_p^2 = \frac{(25 - 1)(0.848) + (25 - 1)(1.303)}{25 + 25 - 2} = 1.076$$
$$t = \frac{(6.288 - 6.016) - 0}{\sqrt{1.076 \left( \frac{1}{25} + \frac{1}{25} \right)}} = 0.927, \qquad \text{d.f.} = 25 + 25 - 2 = 48$$
Example: Making an inference about $\mu_1 - \mu_2$
• For $\alpha = 0.05$ the rejection region is
$t < -t_{\alpha/2, \nu} = -t_{.025, 48} \approx -2.009$ or $t > t_{\alpha/2, \nu} = t_{.025, 48} \approx 2.009$
• The test: Since $-2.009 < t = 0.927 < 2.009$, there is insufficient evidence to reject the null hypothesis.
[Figure: t distribution with rejection regions below -2.009 and above 2.009; the observed t falls between them.]
• Conclusion: There is no evidence to infer at the 5% significance level that the two assembly methods are different in terms of assembly time.
Example: Making an inference about $\mu_1 - \mu_2$
A 95% confidence interval for $\mu_1 - \mu_2$ is calculated as follows:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = (6.288 - 6.016) \pm 2.0106 \sqrt{1.075 \left( \frac{1}{25} + \frac{1}{25} \right)} = 0.272 \pm 0.5897 = [-0.3177,\ 0.8617]$$
Thus, at the 95% confidence level, $-0.3177 < \mu_1 - \mu_2 < 0.8617$.
Notice: zero is included in the confidence interval.
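The equal-variances test and interval for the assembly-time example can be retraced from the summary statistics. This is a sketch under the equal-variances assumption made on the slides; the function name `equal_variances_t` is hypothetical, and the critical values 2.009 and 2.0106 are the ones quoted in the worked solution.

```python
import math

def equal_variances_t(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, standard error, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, std_err, (mean1 - mean2) / std_err, df

# Example 2: Method A (sample 1) versus Method B (sample 2), 25 workers each.
sp2, std_err, t_chairs, df_chairs = equal_variances_t(6.288, 0.8478, 25, 6.016, 1.3031, 25)

# Two-tail test at alpha = .05 with the tabulated critical value 2.009.
reject_h0 = abs(t_chairs) > 2.009

# 95% confidence interval using the critical value 2.0106 from the worked solution.
diff = 6.288 - 6.016
ci_low, ci_high = diff - 2.0106 * std_err, diff + 2.0106 * std_err

print(round(sp2, 3), round(t_chairs, 3), df_chairs, reject_h0)
print(round(ci_low, 4), round(ci_high, 4))
```

Failing to reject H0 and finding zero inside the interval are two views of the same result.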
Checking the required conditions for the equal variances case (Example 2.)
[Figure: histograms of assembly times for Design A and Design B.]
The data appear to be approximately normal.
4. Matched Pairs Experiment
• What is a matched pairs experiment?
• Why are matched pairs experiments needed?
• How do we deal with data produced in this way?
The following example demonstrates a situation where a matched pairs experiment is the correct approach to testing the difference between two population means.
4. Matched Pairs Experiment
Example 3.
– To investigate the job offers obtained by MBA graduates, a
study focusing on salaries was conducted.
– Particularly, the salaries offered to finance majors were
compared to those offered to marketing majors.
– Two random samples of 25 graduates in each discipline were
selected, and the highest salary offer was recorded for each
one.
– Can we infer that finance majors obtain higher salary offers than do marketing majors among MBAs?
4. Matched Pairs Experiment
• Solution
– Compare two populations of interval data.
– The parameter tested is $\mu_1 - \mu_2$, where
$\mu_1$ = the mean of the highest salary offered to Finance MBAs
$\mu_2$ = the mean of the highest salary offered to Marketing MBAs
– H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 > 0$
Data (partial listing):
Finance: 61,228; 51,836; 20,620; 73,356; 84,186; ...
Marketing: 73,361; 36,956; 63,627; 71,069; 40,203; ...
4. Matched Pairs Experiment
• Solution – continued
There is insufficient evidence to conclude that Finance MBAs are offered
higher salaries than marketing MBAs.
90
The effect of a large sample variability
• Question
– The difference between the sample means is 65,624 – 60,423 = 5,201.
– So, why could we not reject H0 and favor H1, where $\mu_1 - \mu_2 > 0$?
The effect of a large sample variability
• Answer:
– $s_p^2$ is large (because the sample variances are large): $s_p^2 = 311{,}330{,}926$.
– A large variance reduces the value of the t statistic and it becomes more difficult to reject H0.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$
Reducing the variability
The values each sample consists of might markedly vary...
[Figure: the range of observations in sample A and in sample B.]
Reducing the variability
...but the differences between pairs of observations might be quite close to one another, resulting in a small variability of the differences.
[Figure: the range of the differences, plotted around 0.]
The matched pairs experiment
• Since the difference of the means is equal to the mean of the differences we can rewrite the hypotheses in terms of $\mu_D$ (the mean of the differences) rather than in terms of $\mu_1 - \mu_2$.
• This formulation has the benefit of a smaller variability.
          Group 1    Group 2    Difference
          10         12         -2
          15         11         +4
Mean      12.5       11.5       1
Mean1 – Mean2 = 1; mean of the differences = 1
The matched pairs experiment
• Example 4.
– It was suspected that salary offers were affected by students' GPA (which caused $s_1^2$ and $s_2^2$ to increase).
– To reduce this variability, the following procedure was
used:
• 25 ranges of GPAs were predetermined.
• Students from each major were randomly selected, one from
each GPA range.
• The highest salary offer for each student was recorded.
– From the data presented can we conclude that Finance
majors are offered higher salaries?
96
The matched pairs hypothesis test
• Solution (by hand)
– The parameter tested is $\mu_D$ ($= \mu_1 - \mu_2$), with each difference computed as Finance minus Marketing.
– The hypotheses:
H0: $\mu_D = 0$
H1: $\mu_D > 0$
– The rejection region is $t > t_{.05, 25-1} = 1.711$
– The t statistic:
$$t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}}, \qquad \text{degrees of freedom} = n_D - 1$$
The matched pairs hypothesis test
• Solution: From the data we calculate:
GPA Group   Finance   Marketing   Difference
1           95171     89329       5842
2           88009     92705       -4696
3           98089     99205       -1116
4           106322    99003       7319
5           74566     74825       -259
6           87089     77038       10051
7           88664     78272       10392
8           71200     59462       11738
9           69367     51555       17812
10          82618     81591       1027
.           .         .           .
The matched pairs hypothesis test
• Solution
$$\bar{x}_D = 5{,}065, \qquad s_D = 6{,}647$$
– Calculate t
$$t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n}} = \frac{5065 - 0}{6647 / \sqrt{25}} = 3.81$$
The matched pairs hypothesis test
Conclusion:
There is sufficient evidence to infer at the 5% significance level that the Finance MBAs' highest salary offer is, on average, higher than that of the Marketing MBAs.
The matched pairs mean difference estimation
Confidence Interval Estimator of $\mu_D$
$$\bar{x}_D \pm t_{\alpha/2, n-1} \frac{s_D}{\sqrt{n}}$$
Example 13.5
The 95% confidence interval of the mean difference in Example 13.4 is
$$5065 \pm 2.064 \frac{6647}{\sqrt{25}} = 5{,}065 \pm 2{,}744 = [2321,\ 7809]$$
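The matched pairs test and interval reduce to a one-sample calculation on the differences. The sketch below uses the summary statistics quoted on the slides (mean difference 5,065, standard deviation 6,647, 25 pairs) and the two-pair illustration from the earlier slide; the function name `paired_t` is hypothetical, and the critical values 1.711 and 2.064 are taken from the slides.

```python
import math
from statistics import mean

def paired_t(mean_diff, sd_diff, n_pairs, mu_d0=0.0):
    """Return (t statistic, standard error) for a matched pairs test."""
    std_err = sd_diff / math.sqrt(n_pairs)
    return (mean_diff - mu_d0) / std_err, std_err

# Small illustration: the mean of the differences equals the difference of the means.
group1, group2 = [10, 15], [12, 11]
differences = [a - b for a, b in zip(group1, group2)]
mean_of_differences = mean(differences)
difference_of_means = mean(group1) - mean(group2)

# Salary example: H0: mu_D = 0 against H1: mu_D > 0.
t_paired, se_paired = paired_t(5065, 6647, 25)
reject_h0 = t_paired > 1.711  # t_{.05,24}

# 95% confidence interval with t_{.025,24} = 2.064.
ci_low = 5065 - 2.064 * se_paired
ci_high = 5065 + 2.064 * se_paired

print(mean_of_differences, difference_of_means)
print(round(t_paired, 2), reject_h0, round(ci_low), round(ci_high))
```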
Checking the required conditions for the paired observations case
• The validity of the results depends on the normality of the differences.
[Figure: histogram of the differences, frequency versus Difference (0 to 20,000).]
5. Inference about the ratio of two variances
• In this section we draw inference about the ratio of two population variances.
• This question is interesting because:
– Variances can be used to evaluate the consistency of processes.
– The relationship between population variances determines which of the equal-variances or unequal-variances t-test and estimator of the difference between means should be applied.
Parameter and Statistic
• Parameter to be tested is $\sigma_1^2 / \sigma_2^2$
• Statistic used is
$$F = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$$
• Sampling distribution of $s_1^2 / s_2^2$
– The statistic $[s_1^2/\sigma_1^2] / [s_2^2/\sigma_2^2]$ follows the F distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.
Parameter and Statistic
– Our null hypothesis is always
H0: $\sigma_1^2 / \sigma_2^2 = 1$
– Under this null hypothesis the F statistic $F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$ becomes
$$F = \frac{s_1^2}{s_2^2}$$
Testing the ratio of two population variances
Example 1.1. (revisiting Example 1.)
In order to perform a test regarding average consumption of calories at people's lunch in relation to the inclusion of high-fiber cereal in their breakfast, the variance ratio of two samples has to be tested first.
Data: calorie intake at lunch for consumers and non-consumers of high-fiber cereal (the same data as in Example 1).
The hypotheses are:
H0: $\sigma_1^2 / \sigma_2^2 = 1$
H1: $\sigma_1^2 / \sigma_2^2 \ne 1$
F Distribution
• Selected Values From the F Distribution
Table
109
F Distribution Table
110
Testing the ratio of two population variances
• Solving by hand
– The rejection region is $F > F_{\alpha/2, \nu_1, \nu_2}$ or $F < 1 / F_{\alpha/2, \nu_2, \nu_1}$:
$$F > F_{\alpha/2, \nu_1, \nu_2} = F_{.025, 42, 106} \approx F_{.025, 40, 120} = 1.61$$
$$F < \frac{1}{F_{\alpha/2, \nu_2, \nu_1}} = \frac{1}{F_{.025, 106, 42}} \approx \frac{1}{F_{.025, 120, 40}} = \frac{1}{1.72} = .58$$
– The F statistic value is $F = s_1^2 / s_2^2 = .3845$
– Conclusion: Because .3845 < .58 we reject the null hypothesis in favor of the alternative hypothesis, and conclude that there is sufficient evidence at the 5% significance level that the population variances differ.
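The two-tail variance ratio test can be sketched as follows. The function name `variance_ratio_test` is hypothetical, and the critical values 1.61 and 1.72 are the table approximations used in the solution, not computed quantiles.

```python
def variance_ratio_test(var1, var2, f_upper, f_upper_swapped):
    """Two-tail F test of H0: sigma1^2 / sigma2^2 = 1.

    f_upper is F_{alpha/2, v1, v2}; f_upper_swapped is F_{alpha/2, v2, v1}.
    """
    f_stat = var1 / var2
    lower_critical = 1 / f_upper_swapped
    reject = f_stat > f_upper or f_stat < lower_critical
    return f_stat, lower_critical, reject

# Example 1.1: sample variances of consumers (4103) and non-consumers (10670).
f_stat, f_lower, reject_h0 = variance_ratio_test(4103, 10670, 1.61, 1.72)

print(round(f_stat, 4), round(f_lower, 2), reject_h0)
```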
Testing the ratio of two population variances
Example 6. (revisiting Example 1.)
F-Test Two-Sample for Variances
                        Consumers    Nonconsumers
Mean                    604          633
Variance                4103         10670
Observations            43           107
df                      42           106
F                       0.3845
P(F<=f) one-tail        0.0004
F Critical one-tail     0.6371
Estimating the Ratio of Two Population Variances
• From the statistic $F = [s_1^2/\sigma_1^2] / [s_2^2/\sigma_2^2]$ we can isolate $\sigma_1^2/\sigma_2^2$ and build the following confidence interval:
$$\left( \frac{s_1^2}{s_2^2} \right) \frac{1}{F_{\alpha/2, \nu_1, \nu_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \left( \frac{s_1^2}{s_2^2} \right) F_{\alpha/2, \nu_2, \nu_1}$$
where $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$
Estimating the Ratio of Two Population Variances
• Example 1.2.
– Determine the 95% confidence interval estimate of the ratio
of the two population variances in Example 1.
– Solution
• We find F_{α/2,ν1,ν2} = F.025,40,120 = 1.61 (approximately)
F_{α/2,ν2,ν1} = F.025,120,40 = 1.72 (approximately)
• LCL = (s1²/s2²)[1/F_{α/2,ν1,ν2}]
= (4102.98/10,669.77)[1/1.61] = .2388
• UCL = (s1²/s2²)[F_{α/2,ν2,ν1}]
= (4102.98/10,669.77)[1.72] = .6614
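As a quick check of the arithmetic, the interval can be reproduced as follows. The two F values are the approximate table look-ups from the slide; the function name `variance_ratio_ci` is a hypothetical helper introduced for illustration.

```python
def variance_ratio_ci(var1, var2, f_v1_v2, f_v2_v1):
    """Confidence limits for sigma1^2 / sigma2^2 given the two table F values."""
    ratio = var1 / var2
    lcl = ratio * (1 / f_v1_v2)   # lower limit uses F(alpha/2; v1, v2)
    ucl = ratio * f_v2_v1         # upper limit uses F(alpha/2; v2, v1)
    return lcl, ucl


# Sample variances from Example 1; F values as quoted (approximate) on the slide.
lcl, ucl = variance_ratio_ci(4102.98, 10669.77, f_v1_v2=1.61, f_v2_v1=1.72)
print(f"LCL = {lcl:.4f}, UCL = {ucl:.4f}")
```

Because the whole interval lies below 1, it agrees with the test result that the two population variances differ.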
114
6. Inference about the difference
between two population proportions
• In this section we deal with two populations whose data
are nominal.
• For nominal data we compare the population
proportions of the occurrence of a certain event.
• Examples
– Comparing the effectiveness of new drug versus older one
– Comparing market share before and after advertising
campaign
– Comparing defective rates between two machines
115
Parameter and Statistic
• Parameter
– When the data are nominal, we can only count the
occurrences of a certain event in the two
populations, and calculate proportions.
– The parameter is therefore p1 – p2.
• Statistic
– An unbiased estimator of p1 – p2 is p̂1 − p̂2 (the difference between the sample proportions).
116
Sampling Distribution of p̂1 − p̂2
• Two random samples are drawn from two populations.
• The number of successes in each sample is recorded.
• The sample proportions are computed.
Sample 1: sample size n1, number of successes x1, sample proportion p̂1 = x1/n1
Sample 2: sample size n2, number of successes x2, sample proportion p̂2 = x2/n2
117
Sampling distribution of p̂1 − p̂2
• The statistic p̂1 − p̂2 is approximately normally distributed if n1p1, n1(1 - p1), n2p2, n2(1 - p2) are all greater than or equal to 5.
• The mean of p̂1 − p̂2 is p1 - p2.
• The variance of p̂1 − p̂2 is p1(1-p1)/n1 + p2(1-p2)/n2.
118
The z-statistic
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}$$
Because p1 and p2 are unknown, the standard error must be estimated using the sample proportions.
The method depends on the null hypothesis.
119
Testing the p1 – p2
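• There are two cases to consider:
Case 1: H0: p1 - p2 = 0
Calculate the pooled proportion
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
Case 2: H0: p1 - p2 = D (D is not equal to 0)
Do not pool the data
$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$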
120
Testing p1 – p2 (Case 1)
• Example 5.
– The marketing manager needs to decide which of
two new packaging designs to adopt, to help
improve sales of his company’s soap.
– A study is performed in two supermarkets:
• Brightly-colored packaging is distributed in supermarket 1.
• Simple packaging is distributed in supermarket 2.
– The first design is more expensive; therefore, to be financially viable it has to outsell the second design.
121
Testing p1 – p2 (Case 1)
• Summary of the experiment results
– Supermarket 1 - 180 purchasers of Johnson Brothers
soap out of a total of 904
– Supermarket 2 - 155 purchasers of Johnson Brothers
soap out of a total of 1,038
– Use 5% significance level and perform a test to find
which type of packaging to use.
122
Testing p1 – p2 (Case 1)
• Solution
– The problem objective is to compare the population of sales of the two packaging designs.
– The data are nominal (Johnson Brothers or other soap).
Population 1: purchases at supermarket 1
Population 2: purchases at supermarket 2
– The hypotheses are
H0: p1 - p2 = 0
H1: p1 - p2 > 0
– We identify this application as case 1.
123
Testing p1 – p2 (Case 1)
• Compute: Manually
– For a 5% significance level the rejection region is
z > zα = z.05 = 1.645
The sample proportions are
p̂1 = 180/904 = .1991 and p̂2 = 155/1,038 = .1493
The pooled proportion is
p̂ = (x1 + x2)/(n1 + n2) = (180 + 155)/(904 + 1,038) = .1725
The z statistic becomes
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{.1991 - .1493}{\sqrt{.1725(1 - .1725)\left(\dfrac{1}{904} + \dfrac{1}{1{,}038}\right)}} = 2.90$$
124
Testing p1 – p2 (Case 1)
Conclusion: There is sufficient evidence to conclude at the 5%
significance level, that brightly-colored design will outsell the
simple design.
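The Case 1 calculation can be reproduced with a short sketch. The function name `pooled_z` is a hypothetical helper; the critical value 1.645 is the one quoted on the slide.

```python
import math


def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0, using the pooled proportion (Case 1)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / std_err


Z_CRITICAL = 1.645   # z.05, as quoted on the slide

z = pooled_z(180, 904, 155, 1038)
print(f"z = {z:.2f}, reject H0: {z > Z_CRITICAL}")
```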
125
Testing p1 – p2 (Case 2)
• Example 5.1. (Revisit Example 5.)
– Management needs to decide which of two new
packaging designs to adopt, to help improve sales of a
certain soap.
– A study is performed in two supermarkets:
– For the brightly-colored design to be financially viable it
has to outsell the simple design by at least 3%.
126
Testing p1 – p2 (Case 2)
• Summary of the experiment results
– Supermarket 1 - 180 purchasers of Johnson Brothers’
soap out of a total of 904
– Supermarket 2 - 155 purchasers of Johnson Brothers’
soap out of a total of 1,038
– Use 5% significance level and perform a test to find
which type of packaging to use.
127
Testing p1 – p2 (Case 2)
• Solution
– The hypotheses to test are
H0: p1 - p2 = .03
H1: p1 - p2 > .03
– We identify this application as case 2 (the
hypothesized difference is not equal to zero).
128
Testing p1 – p2 (Case 2)
• Compute: Manually
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}} = \frac{\left(\dfrac{180}{904} - \dfrac{155}{1{,}038}\right) - .03}{\sqrt{\dfrac{.1991(1 - .1991)}{904} + \dfrac{.1493(1 - .1493)}{1{,}038}}} = 1.15$$
The rejection region is z > zα = z.05 = 1.645.
Conclusion: Since 1.15 < 1.645, do not reject the null hypothesis.
There is insufficient evidence to infer that the brightly-colored design will outsell the simple design by 3% or more.
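For Case 2 the data are not pooled. The sketch below repeats the calculation and adds the one-tail p-value from the standard normal distribution via `math.erfc`. The function names are hypothetical helpers. Carried at full precision the statistic is about 1.145, which explains why the hand calculation shows 1.15 and the spreadsheet output shows 1.14.

```python
import math


def unpooled_z(x1, n1, x2, n2, d):
    """z statistic for H0: p1 - p2 = d with d != 0 (Case 2, no pooling)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    std_err = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat - d) / std_err


def upper_tail_p(z_value):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z_value / math.sqrt(2))


z = unpooled_z(180, 904, 155, 1038, d=0.03)
p_value = upper_tail_p(z)
print(f"z = {z:.3f}, one-tail p-value = {p_value:.4f}")
```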
129
Testing p1 – p2 (Case 2)
z-Test: Two Proportions
                          Supermarket 1   Supermarket 2
Sample Proportions        0.1991          0.1493
Observations              904             1038
Hypothesized Difference   0.03
z Stat                    1.14
P(Z<=z) one tail          0.1261
z Critical one-tail       1.6449
P(Z<=z) two-tail          0.2522
z Critical two-tail       1.96
130
The Lady Tasting Tea
131
CH17 Analysis of Variance
132
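One summer afternoon in the 1920s, a group of university researchers, together with their wives and guests, sat around an outdoor table in Cambridge, England, leisurely enjoying afternoon tea. One lady claimed that the order in which the tea was prepared greatly affected its flavor: tea poured into milk and milk poured into tea tasted completely different. The scientifically minded gentlemen at the table scoffed at the claim. How could there be any difference?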
133
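At this point a slight gentleman with a small moustache said excitedly, "Let us test this proposition," and immediately set about preparing an experiment. He prepared many cups of tea, some with the tea poured first and the milk added afterwards, others with the milk poured first and the tea added afterwards, and handed them one by one to the lady who claimed the taste differed, for her to tell apart.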
134
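The gentleman with the moustache was Sir R. A. Fisher. The problem Fisher considered was this: if she is given only one cup of tea, she has a 50 percent chance of guessing how it was prepared even if she cannot actually tell the difference. If she is given two cups she may still guess correctly; in fact, if she knows the two cups were prepared by different methods, a single guess will get either both right or both wrong.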
135
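Likewise, even if she really can tell the difference, she may still make a mistake.
Perhaps the tea and milk in one cup were not thoroughly mixed, or the water was not hot enough when the tea was brewed, which affected the flavor.
She might try ten cups and identify nine correctly, getting only one wrong.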
136
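This is a classic example of experimental design.
One has to consider the various ways the experiment could be designed to determine whether the lady can distinguish the different teas: how many cups to prepare, in what order to present them, and whether to tell her the order of tasting; then, based on whether her answers are right or wrong, compute the probability of each outcome.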
137
R. A. Fisher
An historical note: In David Salsburg's The
Lady Tasting Tea: How Statistics
Revolutionized Science In The 20th Century,
Hugh Smith, a witness on that summer
afternoon in Cambridge, does not recall the
exact number of trials in the experiment, but
does say that the lady in question passed
with flying colors, correctly identifying milk-in-tea or tea-in-milk every time; as for the cause
of the taste difference, pouring hot tea into
cold milk makes the milk curdle, but not so
pouring cold milk into hot tea.
138
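Experimentation is a tool for accumulating knowledge, but many people do not appreciate this. First-rate scientists can design valuable experiments that produce new knowledge, whereas second-rate scientists merely keep busy running experiments and collecting large amounts of data that contribute little to the accumulation of knowledge.
Although science has developed from careful thought, observation, and experiment, how experiments should actually be carried out was never discussed, and the full experimental results were usually not published for others to see.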
139
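How should an experiment be designed?
Fisher's conclusion was that scientists should start from a mathematical model of the potential experimental outcomes. A mathematical model is a set of equations in which some symbols stand for the data we want to collect through the experiment, while the remaining symbols stand for the results of the experiment. When considering a scientific question, the scientist must first obtain data from the experiment and then compute the appropriate results from those numbers.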
140
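Fisher pointed out that the first step in designing such an experiment is to set up a set of mathematical equations describing the relationship between the data to be collected and the results to be estimated. A useful experiment, therefore, must be one that can provide estimates.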
141
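For example, when an agricultural scientist wants to know how a certain artificial fertilizer affects the growth of different varieties of potato, the experiment must supply the data needed to estimate that effect.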
142
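If we look at a field planted with crops, we find that the soil in some areas is more fertile than in others; in some corners the crops grow tall and lush, while in others the same crops are thin and sparse.
The cause may be the direction of water flow, the type of soil, the presence of some unknown nutrients, some factor that suppresses the growth of weeds, or even other factors not previously known.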
143
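If an agricultural scientist wants to test the difference between two fertilizer ingredients, he can apply the two fertilizers to different parts of the same field. But then the results produced by the different fertilizers are confounded with the results caused by other factors such as soil or drainage, and cannot be separated. If the test is run on the same plot but in different years, the fertilizer effect is confounded with the year-to-year variation in weather.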
144
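However, if in the same year we apply different fertilizers to two adjacent plants, the soil difference is reduced to a minimum. But because the soil conditions of the treated plants cannot be exactly identical, some soil difference still remains.
Fisher decided to assign treatments at random to the different rows of crops within a block. The farm is divided into small blocks, the crops in each block are planted in rows, and each row is then treated at random. Because the treatments are assigned at random there is no fixed pattern, so possible soil differences cancel out and average out.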
145
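This is a method for separating the results produced by different treatments in a carefully designed scientific experiment. Fisher called it "Analysis of Variance," abbreviated ANOVA.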
146
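Analysis of variance was first applied to agricultural experiments and is now widely used in all kinds of scientific research and decision making:
1. Choosing among ten display locations for a product in a store, for example whether placing it in one location sells better than placing it in the others.
2. Three production-line methods a, b, and c: whether the method used on the production line affects output.
3. The effect of four flavors (A, B, C, D) and three additives on product sales.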
147
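All the studies above ask whether the means of three or more populations are equal. The ten locations, three methods, four flavors, and three additives are known, or are controlled by the experimenter (researcher), and are called independent variables or factors. In the production-line example the method is the factor, and the three methods a, b, c are three treatments; each treatment is regarded as a population, and the products in the experiment are called experimental units.
The average revenue of the product, the average output of the production line, and the average sales of the product are the response variables the experimenter (researcher) wants to observe, called dependent variables.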
148
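Analysis of Variance
This chapter introduces analysis of variance, which is used to test the hypothesis that the means of three or more populations are equal.
According to the number of factors, analysis of variance is divided into one-factor (one-way) analysis of variance and two-factor (two-way) analysis of variance.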
149
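Analysis of variance is used to test the hypothesis that three or more population means are equal. The name may seem inappropriate, since what we test are population means rather than variances; in fact, however, the test procedure is based on analyzing the variation in the sample data.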
150
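One-Factor Analysis of Variance
An apple juice manufacturer has launched a new concentrated juice that has three advantages over the old canned juice:
1. Convenience 2. Quality 3. Lower price.
The marketing manager does not know which attribute to emphasize in advertising the new product, so he selects three very similar cities and advertises one attribute (convenience, quality, or price) in each city, to see whether average juice sales differ according to the attribute advertised.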
151
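The test proceeds as follows:
1. Set up the two hypotheses
H0: μ1 = μ2 = μ3 (the three advertising attributes are equally effective; the same number of cans is sold)
HA: the μi are not all equal (the numbers of cans sold differ)
How, then, do we carry out this hypothesis test?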
152
One-Way ANOVA
H0: μ1 = μ2 = μ3
HA: Not all μj are the same
[Figure: the null hypothesis is true, μ1 = μ2 = μ3]
153
One-Way ANOVA
H0: μ1 = μ2 = μ3
HA: Not all μj are the same
[Figure: the null hypothesis is NOT true; two cases in which the population means μ1, μ2, μ3 are not all equal]
154
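First, weekly sales are observed independently in the three cities for 20 weeks, giving
X̄1 = 577.55, X̄2 = 653.00, X̄3 = 608.65
The numbers of cans sold under the three advertising attributes appear to differ. Can we conclude from this result alone that "the three advertising attributes lead to different numbers of cans sold"?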
155
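The answer is no, because the differences among the three means may come from random sampling error as well as from differences in the effect of the three advertisements.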
156
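2. Choose the test statistic
The total variation in the number of bottles sold under the different advertising attributes comes from two sources:
Variation between attributes (variation between populations)
Variation within the same attribute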
157
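Variation between attributes (variation between populations)
That is, the populations exposed to the convenience, quality, and price advertisements buy different numbers of bottles. This is called factor variation or between-treatments variation.
If the null hypothesis is true, the three population means are equal; the sample means will still differ, but only slightly, so the between-treatments variation must be small. If the three population means are not equal, the sample means differ more and the between-treatments variation must be larger; that is, the difference in means arises from the populations being different.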
158
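Variation within the same attribute
That is, within one attribute (population) the number of bottles sold differs from week to week. This is called within-treatments variation, or random variation; the differences in bottles sold are the result of chance.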
159
[Figure: two dot plots of the samples for Treatment 1, Treatment 2, and Treatment 3, both with sample means x̄1 = 10, x̄2 = 15, x̄3 = 20.]
A small variability within the samples makes it easier to draw a conclusion about the population means.
The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
160
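Following the decomposition of total population variation described above, the total variation in the sample is split into variation caused by the factor (between-treatments variation) and random (within-treatments) variation:
$$x_{ij} - \bar{x} = (\bar{x}_j - \bar{x}) + (x_{ij} - \bar{x}_j)$$
The first term on the right is the between-treatments variation and the second is the within-treatments variation.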
161
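Squaring and summing the expression above gives:
$$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{x})^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(\bar{x}_j - \bar{x})^2 + \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2$$
SST = SSB + SSE
SST (Total Sum of Squares): total variation
SSB (Sum of Squares for Treatments): variation caused by the factor
SSE (Sum of Squares for Error): random variation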
162
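If the mean of every population equals the overall population mean, the variation due to the factor is zero while the random variation is unaffected; if at least one population mean differs from the overall population mean, the variation due to the factor is not zero while the random variation remains the same. Analysis of variance uses the sample data to compare the sizes of these two variations, in order to test whether the variation due to the factor (SSB) is large enough to reject the null hypothesis.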
163
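If the null hypothesis holds, SSB arises from sampling error and so will not be large relative to SSE; if the null hypothesis does not hold, SSB will be large relative to SSE.
In addition, SSB and SSE are affected by the number of observations, so they cannot be compared directly; we must go on to compute the mean variations:
MSB = SSB/(k-1)
MSE = SSE/(n-k)
where
MSB (Mean Square for Treatments): mean variation caused by the factor
MSE (Mean Square for Error): mean random variation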
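The partition SST = SSB + SSE and the two mean squares can be sketched as below. The data set is hypothetical, chosen only so that the treatment means are 10, 15, and 20; `one_way_anova` is an illustrative helper, not a library function.

```python
def one_way_anova(groups):
    """Return SST, SSB, SSE, MSB, MSE and F for a list of samples (one list per treatment)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Total, between-treatments and within-treatments sums of squares.
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msb = ssb / (k - 1)
    mse = sse / (n - k)
    return {"SST": sst, "SSB": ssb, "SSE": sse, "MSB": msb, "MSE": mse, "F": msb / mse}


# Hypothetical data: three treatments with sample means 10, 15 and 20.
samples = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]
result = one_way_anova(samples)
print(result)
```

Widening the spread inside each sample while keeping the same means leaves SSB unchanged but inflates SSE, so F shrinks; this is the effect illustrated by the two dot plots earlier.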
164
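We can now compare the two variances MSB and MSE. When MSB is large relative to MSE, the factor affects the dependent variable, so we use MSB/MSE as the test statistic for the hypothesis test. How large must its value be before we reject H0? To answer this we first need the distribution of the test statistic F = MSB/MSE.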
165
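The sampling distribution of F = MSB/MSE
Analysis of variance rests on the following assumptions:
1. The effect of the factor on the dependent variable is fixed: it is a constant, not a random variable.
2. All populations are normally distributed.
3. Homogeneity of variance: every population has the same variance.
4. Sampling is by independent simple random sampling; that is, independent random samples are drawn from each of the k populations.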
166
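Under the four assumptions above, when H0 is true the sampling distribution of F = MSB/MSE is an F distribution with k − 1 and n − k degrees of freedom:
$$F = \frac{MSB}{MSE} \sim F(k-1,\; n-k)$$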
167
One-Way ANOVA
H0: μ1 = μ2 = μ3 = … = μk
Ha: At least one of the means is different from the others
F = MSB/MSE
If F > Fc, reject H0.
If F ≤ Fc, do not reject H0.
168
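3. Determine the decision rule
At the chosen significance level α, the decision rule is:
If F > F_{α,k−1,n−k}, reject H0.
If F ≤ F_{α,k−1,n−k}, do not reject H0.
This means we use a right-tailed test. The reason is that when every μi equals μ, E(MSB) = E(MSE), so the value of F should be around 1; whereas E(MSB) > E(MSE) indicates that the μi are not all equal. Hence, if MSB/MSE is large we should reject H0; that is, the rejection region lies in the right tail.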
169
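4. Compute the test statistic and compare it with the critical value.
5. Obtain the test result from the decision rule and state the conclusion.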
170
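Example: the apple juice example
X̄1 = 577.55, X̄2 = 653.00, X̄3 = 608.65; overall mean = 613.07
SSB = 57512.23, SSE = 506967.88, k = 3, n = 60
MSB = SSB/(k-1) = 28756.12, MSE = SSE/(n-k) = 8894.17
F = MSB/MSE = 3.23 > F0.05,2,57 = 3.15, so reject H0
From the result above we can conclude: "The differences in the number of bottles of apple juice sold in the three cities arise from the differences in the advertised attributes."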
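The arithmetic of the apple juice example can be checked directly from the summary figures on the slide. The critical value 3.15 is the table value quoted there, not something computed here.

```python
# Summary figures quoted in the apple juice example.
ssb = 57512.23
sse = 506967.88
k = 3            # number of treatments (advertising attributes)
n = 60           # total number of weekly observations
F_CRITICAL = 3.15  # F(.05; 2, 57) as quoted on the slide

msb = ssb / (k - 1)
mse = sse / (n - k)
f_stat = msb / mse

print(f"MSB = {msb:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print("reject H0" if f_stat > F_CRITICAL else "do not reject H0")
```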
171
ANOVA
Source of Variation   Sums of Squares   Degrees of Freedom   Mean Squares      F-Statistic   P-Value
Treatments            SSB               k-1                  MSB=SSB/(k-1)     F=MSB/MSE
Error                 SSE               n-k                  MSE=SSE/(n-k)
Total                 SS(Total)         n-1
172
single factor ANOVA
173
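In the apple juice example above, suppose that besides the three attributes (convenience, quality, and price) there are two advertising media in each city: newspaper and television. The marketing manager also wants to know which medium is likely to be more effective. How should the experiment be carried out?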
174
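The first approach is to select six cities, observe them for 10 weeks, and record the weekly sales.
1. Convenience, television 2. Convenience, newspaper 3. Quality, television
4. Quality, newspaper 5. Price, television 6. Price, newspaper
H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6
HA: the μi are not all equal
If the result is F = 2.45 > F0.05,5,54, reject H0.
Conclusion: "Apple juice sales differ among these six cities."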
175
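How should the marketing manager use this result in his marketing strategy? How can he tell whether to promote through newspapers or through television? And which combined strategy would yield higher sales?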
176
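Two-Factor Analysis of Variance
The second approach is a two-factor analysis of variance. Suppose the two factors (attribute and medium) have a and b treatments respectively, and each treatment combination has r replicates. Such an experiment is called a complete a × b factorial experiment. If every treatment combination has the same number r of replicates, the design is called balanced.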
177
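If the sample means differ across the treatment combinations, we want to know whether the differences are due to factor A, to factor B, or to both. If both have an effect, do they act independently, or is there an interaction?
1. Factors A and B both have an effect, but there is no interaction.
2. Factor A has an effect; factor B does not.
3. Factor B has an effect; factor A does not.
4. Factors A and B interact.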
178
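To test these four possibilities we need three F tests, which determine whether the differences in means come from the interaction, from factor A, or from factor B.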
179
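Example: the apple juice example
Factor A (attribute) has three treatments: convenience, quality, and price.
Factor B (medium) has two treatments: television and newspaper.
In a two-factor analysis of variance, the test for interaction should be carried out first, unless it is known or assumed before the experiment that the two factors do not interact.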
180
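If the test finds a significant interaction, there is no need to go on to separate hypothesis tests for factor A and factor B. An interaction means that certain combinations of factor A treatments and factor B treatments produce different means; in that situation, tests of factor A or factor B will usually also come out significant. But such a significant result may be misleading, because the differences in the treatment means of a factor may be caused only by the interaction and not by the factor itself. So if there is an interaction, the two factors A and B need not be tested separately.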
181
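Therefore, in a two-factor analysis of variance we first test for interaction; if there is no interaction, we then test factor A and factor B to see whether each affects the dependent variable.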
182
Two-Way Factorial Design
[Figure: grid of cells, with the row treatments down the side and the column treatments across the top.]
183
Two-Way ANOVA
• Assumptions
– Normality
• Populations are normally distributed
– Homogeneity of Variance
• Populations have equal variances
– Independence of Errors
• Independent random samples are drawn
184
Two-Way ANOVA: Hypotheses
Row Effects:
Ho: Row Means are all equal.
Ha: At least one row mean is different from the others.
Columns Effects:
Ho: Column Means are all equal.
Ha: At least one column mean is different from the others.
Interaction Effects: Ho: The interaction effects are zero.
Ha: There is an interaction effect.
185
Two-Way ANOVA
Total Variation Partitioning
Total Variation: SST, d.f. = N-1
=
Variation Due to Factor A: SSA, d.f. = r-1
+ Variation Due to Factor B: SSB, d.f. = c-1
+ Variation Due to Interaction: SSAB, d.f. = (r-1)(c-1)
+ Variation Due to Random Sampling: SSE, d.f. = rc(n-1)
186
Formulas for Computing a Two-Way ANOVA
$$SSR = nc\sum_{i=1}^{r}(\bar{X}_i - \bar{X})^2 \qquad df_R = r - 1$$
$$SSC = nr\sum_{j=1}^{c}(\bar{X}_j - \bar{X})^2 \qquad df_C = c - 1$$
$$SSI = n\sum_{i=1}^{r}\sum_{j=1}^{c}(\bar{X}_{ij} - \bar{X}_i - \bar{X}_j + \bar{X})^2 \qquad df_I = (r-1)(c-1)$$
$$SSE = \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n}(X_{ijk} - \bar{X}_{ij})^2 \qquad df_E = rc(n-1)$$
$$SST = \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n}(X_{ijk} - \bar{X})^2 \qquad df_T = N - 1$$
$$MSR = \frac{SSR}{r-1} \qquad MSC = \frac{SSC}{c-1} \qquad MSI = \frac{SSI}{(r-1)(c-1)} \qquad MSE = \frac{SSE}{rc(n-1)}$$
$$F_R = \frac{MSR}{MSE} \qquad F_C = \frac{MSC}{MSE} \qquad F_I = \frac{MSI}{MSE}$$
where:
n = number of observations per cell
c = number of column treatments
r = number of row treatments
i = row treatment level
j = column treatment level
k = cell member
X_ijk = individual observation
X̄_ij = cell mean
X̄_i = row mean
X̄_j = column mean
X̄ = grand mean
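The formulas can be sketched for a balanced design as follows. The data are hypothetical (a 2 × 3 layout with two observations per cell), and `two_way_anova` is an illustrative helper rather than a library routine.

```python
def _mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """cells[i][j] holds the n observations for row level i and column level j (balanced design)."""
    r, c, n = len(cells), len(cells[0]), len(cells[0][0])

    grand = _mean([x for row in cells for cell in row for x in cell])
    row_means = [_mean([x for cell in row for x in cell]) for row in cells]
    col_means = [_mean([x for row in cells for x in row[j]]) for j in range(c)]
    cell_means = [[_mean(cell) for cell in row] for row in cells]

    ssr = n * c * sum((m - grand) ** 2 for m in row_means)
    ssc = n * r * sum((m - grand) ** 2 for m in col_means)
    ssi = n * sum((cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
                  for i in range(r) for j in range(c))
    sse = sum((x - cell_means[i][j]) ** 2
              for i in range(r) for j in range(c) for x in cells[i][j])
    sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)

    mse = sse / (r * c * (n - 1))
    return {
        "SSR": ssr, "SSC": ssc, "SSI": ssi, "SSE": sse, "SST": sst,
        "F_R": (ssr / (r - 1)) / mse,
        "F_C": (ssc / (c - 1)) / mse,
        "F_I": (ssi / ((r - 1) * (c - 1))) / mse,
    }


# Hypothetical balanced 2 x 3 design with n = 2 observations per cell.
data = [
    [[10, 12], [14, 16], [18, 20]],
    [[11, 13], [17, 19], [25, 27]],
]
out = two_way_anova(data)
print({name: round(value, 3) for name, value in out.items()})
```

The four component sums of squares add up to SST, which is a convenient sanity check on any implementation.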
187
Two-Way ANOVA: The F Test Statistic
F Test for Factor A Main Effect
H0: μ1.. = μ2.. = ••• = μr..
H1: Not all μi.. are equal
F = MSA/MSE, where MSA = SSA/(r − 1)
Reject if F > FU
F Test for Factor B Main Effect
H0: μ.1. = μ.2. = ••• = μ.c.
H1: Not all μ.j. are equal
F = MSB/MSE, where MSB = SSB/(c − 1)
Reject if F > FU
F Test for Interaction Effect
H0: (AB)ij = 0 (for all i and j)
H1: (AB)ij ≠ 0
F = MSAB/MSE, where MSAB = SSAB/[(r − 1)(c − 1)]
Reject if F > FU
188
Two-Way ANOVA Summary Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares                    F Statistic
Factor A (Row)        r – 1                SSA              MSA = SSA/(r – 1)               MSA/MSE
Factor B (Column)     c – 1                SSB              MSB = SSB/(c – 1)               MSB/MSE
AB (Interaction)      (r – 1)(c – 1)       SSAB             MSAB = SSAB/[(r – 1)(c – 1)]    MSAB/MSE
Error                 rc(n’ – 1)           SSE              MSE = SSE/[rc(n’ – 1)]
Total                 rcn’ – 1             SST
189
[Figure: four plots of mean response against levels 1, 2, 3 of factor A, with one line per level of factor B:
– Difference between the levels of factor A, and difference between the levels of factor B; no interaction
– Difference between the levels of factor A; no difference between the levels of factor B
– No difference between the levels of factor A; difference between the levels of factor B
– Interaction]
190
A 2 × 3 Factorial Design with Interaction
[Figure: cell means plotted against columns C1, C2, C3, with one line per row (R1, R2) showing the row effects.]
191
A 2 × 3 Factorial Design with Some Interaction
[Figure: cell means plotted against columns C1, C2, C3, with one line per row (R1, R2) showing the row effects.]
192
A 2 × 3 Factorial Design with No Interaction
[Figure: cell means plotted against columns C1, C2, C3, with one line per row (R1, R2) showing the row effects.]
193
ANOVA
Source of Variation   Sums of Squares   Degrees of Freedom   Mean Squares                   F-Statistic       P-Value
Factor A              SS(A)             a-1                  MS(A)=SS(A)/(a-1)              F=MS(A)/MSE
Factor B              SS(B)             b-1                  MS(B)=SS(B)/(b-1)              F=MS(B)/MSE
Interaction           SS(AB)            (a-1)(b-1)           MS(AB)=SS(AB)/[(a-1)(b-1)]     F=MS(AB)/MSE
Error                 SSE               n-ab
Total                 SS(Total)         n-1
194
F tests for the Two-way ANOVA
• Example - continued
– Test for interaction between factors A and B
H0: μTV*conv. = μTV*quality = … = μnewsp.*price
H1: At least two means differ
Interaction AB = Marketing*Media
195
F tests for the Two-way ANOVA
•
Example - continued
– Test for interaction between factors A and B
H0: μTV*conv. = μTV*quality = … = μnewsp.*price
H1: At least two means differ
F = MS(AB)/MSE = MS(Marketing*Media)/MSE = .087
Fcritical = F_{α,(a-1)(b-1),n-ab} = F.05,(3-1)(2-1),60-(3)(2) = 3.17 (p-value = .917)
– At the 5% significance level there is insufficient evidence to infer that the two factors interact to affect the mean weekly sales.
196
F tests for the Two-way ANOVA
• Example – continued
– Test of the difference in mean sales between the three marketing strategies
H0: μconv. = μquality = μprice
H1: At least two mean sales are different
Factor A Marketing strategies
197
F tests for the Two-way ANOVA
• Example – continued
– Test of the difference in mean sales between the three
marketing strategies
H0: μconv. = μquality = μprice
H1: At least two mean sales are different
F = MS(A)/MSE = MS(Marketing strategy)/MSE = 5.325
Fcritical = F_{α,a-1,n-ab} = F.05,3-1,60-(3)(2) = 3.17 (p-value = .008)
– At the 5% significance level there is evidence to infer that differences in weekly sales exist among the marketing strategies.
198
F tests for the Two-way ANOVA
• Example - continued
– Test of the difference in mean sales between the two
advertising media
H0: μTV = μNewspaper
H1: The two mean sales differ
Factor B = Advertising media
199
F tests for the Two-way ANOVA
• Example - continued
– Test of the difference in mean sales between the two
advertising media
H0: μTV = μNewspaper
H1: The two mean sales differ
F = MS(B)/MSE = MS(Media)/MSE = 1.419
Fcritical = F_{α,b-1,n-ab} = F.05,2-1,60-(3)(2) = 4.02 (p-value = .239)
– At the 5% significance level there is insufficient evidence to infer that differences in weekly sales exist between the two advertising media.
200
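1. Test whether factors A and B interact:
F = 0.087 < 3.17 = F0.05,2,54, so do not reject H0
Conclusion: "There is no interaction."
2. Next test factor A (advertised attribute):
F = 5.325 > 3.17 = F0.05,2,54, so reject H0
Conclusion: "The differences in apple juice sales arise from the differences in the advertised attributes."
3. Next test factor B (advertising medium):
F = 1.419 < 4.02 = F0.05,1,54, so do not reject H0
Conclusion: "The advertising medium has no effect on apple juice sales."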
201
CH14 Chi Squared Tests
202
1. Introduction
• Two statistical techniques are presented, to
analyze nominal data.
– A goodness-of-fit test for the multinomial experiment.
– A contingency table test of independence.
• Both tests use the χ² distribution as the sampling distribution of the test statistic.
203
2. Chi-Squared Goodness-of-Fit Test
• The hypothesis tested involves the probabilities p1, p2, …, pk of a multinomial distribution.
• The multinomial experiment is an extension of the binomial experiment.
– There are n independent trials.
– The outcome of each trial can be classified into one of k categories, called cells.
– The probability pi that the outcome falls into cell i remains constant for each trial. Moreover, p1 + p2 + … + pk = 1.
– Trials of the experiment are independent.
204
2. Chi-squared Goodness-of-Fit Test
• We test whether there is sufficient evidence to
reject a pre-specified set of values for pi.
• The hypothesis:
H0: p1 = a1, p2 = a2, …, pk = ak
H1: At least one pi ≠ ai
• The test builds on comparing actual frequency
and the expected frequency of occurrences in all
the cells.
205
The multinomial goodness of fit test Example
• Example 1.
– Two competing companies, A and B, have enjoyed a dominant position in the market. The companies conducted aggressive advertising campaigns.
– Market shares before the campaigns were:
• Company A = 45%
• Company B = 40%
• Other competitors = 15%.
206
The multinomial goodness of fit test Example
• Example 1. – continued
– To study the effect of the campaign on the market
shares, a survey was conducted.
– 200 customers were asked to indicate their preference
regarding the product advertised.
– Survey results:
• 102 customers preferred company A’s product,
• 82 customers preferred company B’s product,
• 16 customers preferred the competitors’ products.
207
The multinomial goodness of fit test Example
• Example 1. – continued
Can we conclude at 5% significance level that
the market shares were affected by the
advertising campaigns?
208
The multinomial goodness of fit test Example
• Solution
– The population investigated is the brand preferences.
– The data are nominal (A, B, or other).
– This is a multinomial experiment (three categories).
– The question of interest: Are p1, p2, and p3 different after the campaign from their values before the campaign?
209
The multinomial goodness of fit test Example
• The hypotheses are:
H0: p1 = .45, p2 = .40, p3 = .15
H1: At least one pi changed.
The expected frequency for each category (cell) if the null hypothesis is true:
Cell 1: 90 = 200(.45)
Cell 2: 80 = 200(.40)
Cell 3: 30 = 200(.15)
What actual frequencies did the sample return?
Cell 1: 102
Cell 2: 82
Cell 3: 16
210
The multinomial goodness of fit test Example
• The statistic is
$$\chi^2 = \sum_{i=1}^{k}\frac{(f_i - e_i)^2}{e_i} \qquad \text{where } e_i = np_i$$
• The rejection region is
$$\chi^2 > \chi^2_{\alpha,k-1}$$
211
The multinomial goodness of fit test Example
• Example 1. – continued
$$\chi^2 = \sum_{i=1}^{k}\frac{(f_i - e_i)^2}{e_i} = \frac{(102-90)^2}{90} + \frac{(82-80)^2}{80} + \frac{(16-30)^2}{30} = 8.18$$
$$\chi^2_{\alpha,k-1} = \chi^2_{.05,3-1} = 5.99147$$
The p value = 0.01679
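The goodness-of-fit calculation can be sketched as follows. The helper name `chi_squared_gof` is hypothetical and the critical value is the one quoted on the slide. With exactly 2 degrees of freedom the upper-tail probability of the chi-squared distribution has the closed form exp(−x/2), which the sketch uses for the p-value; this shortcut does not apply to other degrees of freedom, and the result agrees with the slide's figure only to about three decimal places.

```python
import math


def chi_squared_gof(observed, probabilities):
    """Chi-squared goodness-of-fit statistic and the expected frequencies under H0."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return stat, expected


observed = [102, 82, 16]          # survey results: A, B, other
h0_shares = [0.45, 0.40, 0.15]    # market shares before the campaigns
CHI2_CRITICAL = 5.99147           # chi-squared(.05; 2), as quoted on the slide

stat, expected = chi_squared_gof(observed, h0_shares)
p_value = math.exp(-stat / 2)     # valid only because there are 2 degrees of freedom

print(f"chi-squared = {stat:.2f}, p-value = {p_value:.4f}, reject H0: {stat > CHI2_CRITICAL}")
```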
212
The multinomial goodness of fit test Example
• Example 1. – continued
Conclusion: Since 8.18 > 5.99, there is sufficient evidence at the 5% significance level to reject the null hypothesis. At least one of the probabilities pi is different. Thus, at least two market shares have changed.
[Figure: χ² density with 2 degrees of freedom, marking the critical value 5.99 (alpha), the rejection region to its right, and the test statistic 8.18 (p value).]
213
Required conditions –
the rule of five
• The test statistic used to perform the test is only approximately chi-squared distributed.
• For the approximation to apply, the expected cell frequency has to be at least 5 for all the cells (npi ≥ 5).
• If the expected frequency in a cell is less than 5, combine it with other cells.
214
3. Chi-squared Test of a Contingency Table
• This test is used to test whether
– two nominal variables are related, or
– there are differences between two or more populations of a nominal variable.
• To accomplish the test objectives, we need to
classify the data according to two different
criteria.
215
Contingency table χ² test – Example
• Example 2.
– In an effort to better predict the demand for courses offered by a certain MBA program, it was hypothesized that students’ academic background affects their choice of MBA major and thus their course selection.
– A random sample of last year’s MBA students was selected. The following contingency table summarizes the relevant data.
216
Contingency table χ² test – Example
The observed values
Degree   Accounting   Finance   Marketing   Total
BA       31           13        16          60
BENG     8            16        7           31
BBA      12           10        17          39
Other    10           5         7           22
Total    61           44        47          152
217
Contingency table χ² test – Example
• Solution
– The hypotheses are:
H0: The two variables are independent
H1: The two variables are dependent
– The test statistic
$$\chi^2 = \sum_{i=1}^{k}\frac{(f_i - e_i)^2}{e_i}$$
k is the number of cells in the contingency table.
Since ei = npi but pi is unknown, we need to estimate the unknown probability from the data, assuming H0 is true.
– The rejection region
$$\chi^2 > \chi^2_{\alpha,(r-1)(c-1)}$$
218
Estimating the expected frequencies
                       MBA Major
Undergraduate Degree   Accounting   Finance   Marketing   Total   Probability
BA                                                        60      60/152
BENG                                                      31      31/152
BBA                                                       39      39/152
Other                                                     22      22/152
Total                  61           44        47          152
Probability            61/152       44/152    47/152
Under the null hypothesis the two variables are independent:
P(Accounting and BA) = P(Accounting)*P(BA) = [61/152][60/152].
The number of students expected to fall in the cell “Accounting - BA” is
eAcct-BA = n(pAcct-BA) = 152(61/152)(60/152) = [61*60]/152 = 24.08
The number of students expected to fall in the cell “Finance - BBA” is
eFinance-BBA = npFinance-BBA = 152(44/152)(39/152) = [44*39]/152 = 11.29
219
The expected frequencies for a
contingency table
• The expected frequency of the cell in row i and column j of the contingency table is calculated by
eij = (Column j total)(Row i total) / Sample size
220
Calculation of the χ² statistic
• Solution – continued
Observed frequencies, with the expected frequencies in parentheses:
                       MBA Major
Undergraduate Degree   Accounting   Finance      Marketing    Total
BA                     31 (24.08)   13 (17.37)   16 (18.55)   60
BENG                   8 (12.44)    16 (8.97)    7 (9.58)     31
BBA                    12 (15.65)   10 (11.29)   17 (12.06)   39
Other                  10 (8.83)    5 (6.37)     7 (6.80)     22
Total                  61           44           47           152
$$\chi^2 = \sum_{i=1}^{k}\frac{(f_i - e_i)^2}{e_i} = \frac{(31 - 24.08)^2}{24.08} + \dots + \frac{(5 - 6.37)^2}{6.37} + \dots + \frac{(7 - 6.80)^2}{6.80} = 14.70$$
221
Contingency table χ² test – Example
• Solution – continued
– The critical value in our example is:
$$\chi^2_{\alpha,(r-1)(c-1)} = \chi^2_{.05,(4-1)(3-1)} = 12.5916$$
• Conclusion:
Since χ² = 14.702 > 12.5916, there is sufficient evidence to infer at the 5% significance level that students’ undergraduate degree and MBA students’ course selection are dependent.
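The whole contingency-table calculation (expected frequencies, statistic, degrees of freedom) can be sketched as below. The function name `chi_squared_independence` is a hypothetical helper, and the critical value is the table value quoted above.

```python
def chi_squared_independence(table):
    """Chi-squared statistic, degrees of freedom and expected frequencies for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # e_ij = (row i total)(column j total) / sample size
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    stat = sum((f - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for f, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Rows: BA, BENG, BBA, Other; columns: Accounting, Finance, Marketing.
observed = [
    [31, 13, 16],
    [8, 16, 7],
    [12, 10, 17],
    [10, 5, 7],
]
CHI2_CRITICAL = 12.5916   # chi-squared(.05; 6), as quoted on the slide

stat, df, expected = chi_squared_independence(observed)
print(f"chi-squared = {stat:.2f} with {df} d.f., reject H0: {stat > CHI2_CRITICAL}")
```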
222
223
224
Required condition – Rule of five
– The χ² distribution provides an adequate approximation to the sampling distribution under the condition that eij >= 5 for all the cells.
– When eij < 5, rows or columns must be combined so that the condition is met.
Example (observed frequencies, with expected frequencies in parentheses)
Original table:
10 (10.1)   14 (12.8)   4 (5.1)
12 (12.7)   16 (16.0)   7 (6.3)
8 (7.2)     8 (9.2)     4 (3.6)
We combine columns 2 and 3:
10 (10.1)   14 + 4 = 18 (12.8 + 5.1 = 17.9)
12 (12.7)   16 + 7 = 23 (16 + 6.3 = 22.3)
8 (7.2)     8 + 4 = 12 (9.2 + 3.6 = 12.8)
225
CH15 Regression Analysis
226
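• Multiple regression is a dependence method with a single criterion variable. Its purpose is to understand and establish the relationship between one continuous criterion variable and a set of continuous predictor variables.
• Multiple regression can be expressed in the following general form:
Y = X1 + X2 + … + Xm
(continuous) (continuous)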
227
Example
• A distributor of frozen dessert pies wants to evaluate factors thought to influence demand
– Dependent variable: Pie sales (units per week)
– Independent variables: Price (in $), Advertising ($100s)
• Data are collected for 15 weeks
228
Pie Sales Example
Week   Pie Sales   Price ($)   Advertising ($100s)
1      350         5.50        3.3
2      460         7.50        3.3
3      350         8.00        3.0
4      430         8.00        4.5
5      350         6.80        3.0
6      380         7.50        4.0
7      430         4.50        3.0
8      470         6.40        3.7
9      450         7.00        3.5
10     490         5.00        4.0
11     340         7.20        3.5
12     300         7.90        3.2
13     440         5.90        4.0
14     450         5.00        3.5
15     300         7.00        2.7
229
230
231
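Using multiple regression analysis, we hope to answer the following three questions:
• Description: Can we find a linear combination that concisely describes the relationship between a set of predictor variables (X) and a criterion variable (Y)? If so, how strong is the relationship?
• Inference: Is the overall relationship statistically significant? Which predictor variables are most important in explaining the variation in the criterion variable?
• Prediction: How well does the linear combination of predictor variables predict the criterion variable?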
232
Pie Sales Example
Coefficients(a)
Model 1       Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)    306.526            114.254                          2.683    .020
price         -24.975            10.832       -.461               -2.306   .040
advertising   74.131             25.967       .570                2.855    .014
a. Dependent variable: piesales
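The unstandardized coefficients in this output can be reproduced from the 15 weeks of data by ordinary least squares. The sketch below solves the normal equations (X'X)b = X'y with a small Gaussian elimination routine so that it needs no external library. Both helper functions are illustrative assumptions; this is not how the statistical package itself computes the fit, and a numerical library would normally be preferred.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):
        x[i] = (m[i][size] - sum(m[i][j] * x[j] for j in range(i + 1, size))) / m[i][i]
    return x


def least_squares(rows, y):
    """Fit y = b0 + b1*x1 + ... + bm*xm by solving the normal equations."""
    design = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Data from the pie sales table (15 weeks).
sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]

b0, b1, b2 = least_squares(list(zip(price, advertising)), sales)
print(f"sales = {b0:.3f} + ({b1:.3f}) * price + {b2:.3f} * advertising")
```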
233
Example: Programmer Salary Survey
A software firm collected data for a sample of 20
computer programmers. We want to determine if
salary was related to the years of experience and the
score on the firm’s programmer aptitude test.
The years of experience, score on the aptitude test, and corresponding annual salary ($1,000s) for a sample of 20 programmers are shown on the next slide.
234
Example: Programmer Salary Survey
Exper.   Score   Salary      Exper.   Score   Salary
4        78      24          9        88      38
7        100     43          2        73      26.6
1        86      23.7        10       75      36.2
5        82      34.3        5        81      31.6
8        86      35.8        6        74      29
10       84      38          8        87      34
0        75      22.2        4        79      30.1
1        80      23.1        6        94      33.9
6        83      30          3        70      28.2
6        91      33          3        89      30
235
236
237
Example: Programmer Salary Survey
Coefficients(a)
Model 1      Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)   3.174              6.156                            .516    .613
exper        1.404              .199         .741                7.070   .000
score        .251               .077         .340                3.243   .005
a. Dependent variable: salary
238
MBA Program Admission Policy
• The dean of a large university wants to raise the
admission standards to the popular MBA program.
• He plans to develop a method that can predict an
applicant’s performance in the program.
• He believes a student’s success can be predicted by:
– Undergraduate GPA
– Graduate Management Admission Test (GMAT) score
– Number of years of work experience
239
MBA Program Admission Policy
• A random sample of students who completed the MBA was selected.
MBA GPA   UnderGPA   GMAT   Work
8.43      10.89      584    9
6.58      10.38      483    7
8.15      10.39      484    4
8.88      10.73      646    6
.         .          .      .
.         .          .      .
• Develop a plan to decide which applicant to admit.
240
241
242
MBA Program Admission Policy
Coefficients(a)
Model 1      Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)   .466               1.506                            .310    .758
UnderGPA     .063               .120         .042                .524    .602
GMAT         .011               .001         .650                8.159   .000
Work         .093               .031         .238                2.996   .004
a. Dependent variable: MBAGPA
MBA GPA = 0.466 + 0.063×UnderGPA + 0.011×GMAT + 0.093×Work
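As a small illustration of how the fitted equation might be used for screening applicants, the sketch below evaluates it for the first student in the sample (UnderGPA 10.89, GMAT 584, 9 years of work). The function name `predict_mba_gpa` is a hypothetical helper, and the coefficients are the rounded values from the table, so predictions are approximate.

```python
# Rounded coefficients from the regression output above.
INTERCEPT = 0.466
B_UNDER_GPA = 0.063
B_GMAT = 0.011
B_WORK = 0.093


def predict_mba_gpa(under_gpa, gmat, work_years):
    """Predicted MBA GPA from the fitted regression equation."""
    return INTERCEPT + B_UNDER_GPA * under_gpa + B_GMAT * gmat + B_WORK * work_years


predicted = predict_mba_gpa(under_gpa=10.89, gmat=584, work_years=9)
print(f"predicted MBA GPA = {predicted:.2f} (observed value in the sample: 8.43)")
```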
243
Multiple Regression Decision Process
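Stage 1: Objectives of the research problem
Stage 2: Research design
Stage 3: Regression assumptions
Stage 4: Estimating the regression model
Stage 5: Assessing explanatory power
Stage 6: Validating the regression results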
244
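Objectives of the Research Problem
Multiple regression is a widely used multivariate technique. The problems that can be studied with it fall into two broad classes: explanation and prediction. The two classes are not mutually exclusive; researchers can apply multiple regression to an explanatory or a predictive problem alone, or address both at once.
The purpose of multiple regression is to establish the relationship between one criterion variable and a set of predictor variables. The researcher must first decide which variable is the criterion variable and which are the predictor variables. The choice of criterion variable is usually dictated by the research problem; the choice of predictors also depends on the research problem, but should preferably have a theoretical basis, so that irrelevant or unsuitable predictors are not included in the regression model.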
245
La Quinta Motor Inns Example
• Where to locate a new motor inn?
– La Quinta Motor Inns is planning an expansion.
– Management wishes to predict which sites are likely to be
profitable.
– Several areas where predictors of profitability can be identified
are:
• Competition
• Market awareness
• Demand generators
• Demographics
• Physical quality
246
La Quinta Motor Inns Example
[Diagram: predictors of Profitability, measured by Operating Margin]
– Competition: Rooms (number of hotel/motel rooms within 3 miles of the site)
– Market awareness: Nearest (distance to the nearest La Quinta inn)
– Customers: Office space; College enrollment
– Community: Income (median household income)
– Physical: Disttwn (distance to downtown)
247
Research Design
Issues to consider:
• Sample size,
• Unique elements of the dependence relationship – can use dummy variables as independents.
248
Sample Size Considerations
• Simple regression can be effective with a sample size of 20, but multiple regression requires a minimum sample of 50 and preferably 100 observations for most research situations.
• The minimum ratio of observations to variables is 5 to 1, but the preferred ratio is 15 or 20 to 1, and this should increase when stepwise estimation is used.
• Maximizing the degrees of freedom improves generalizability and addresses both model parsimony and sample size concerns.
249
La Quinta Motor Inns Example
• Data were collected from 100 randomly selected inns that belong to La Quinta, and the suggested model was run on the following data:
Margin   Number   Nearest   Office Space   Enrollment   Income   Distance
55.5     3203     4.2       549            8            37       2.7
33.8     2810     2.8       496            17.5         35       14.4
49       2890     2.4       254            20           35       2.6
31.9     3422     3.3       434            15.5         38       12.1
57.4     2687     0.9       678            15.5         42       6.9
49       3759     2.9       635            19           33       10.8
250
Variable Transformations
• Nonmetric variables can only be included in a
regression analysis by creating dummy variables.
• Dummy variables can only be interpreted in
relation to their reference category.
• Adding an additional polynomial term represents
another inflection point in the curvilinear
relationship.
• Quadratic and cubic polynomials are generally
sufficient to represent most curvilinear
relationships.
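Checking the Regression Assumptions
After the estimated regression model is obtained, the next step is to check whether the criterion variable, the predictor variables, and the overall regression relationship satisfy the assumptions of multiple regression. If serious violations are found, corrective action should be taken and the model re-estimated. The four basic assumptions of multiple regression are:
1. linearity,
2. homoscedasticity (equal variances),
3. independence, and
4. normality.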
252
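Checking the Regression Assumptions
Multiple regression has four basic assumptions; a fitted model must satisfy them to count as a valid and appropriate model. The four assumptions are:
1. A linear relationship between the criterion variable and the predictor variables.
2. Equal variance of the error terms.
3. Independence of the error terms.
4. Normality of the error term distribution.
Whether a multiple regression model satisfies these assumptions can be checked by examining the shape of the residual scatterplot.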
253
Residuals Plots
• Histogram of standardized residuals – enables you to determine if the errors
are normally distributed.
• Normal probability plot – enables you to determine if the errors are normally
distributed. It compares the observed (sample) standardized residuals against
the expected standardized residuals from a normal distribution.
• Scatterplot of residuals – can be used to test regression assumptions. It
compares the standardized predicted values of the dependent variable against
the standardized residuals from the regression equation. If the plot exhibits a
random pattern, this indicates no identifiable violations of the assumptions
underlying regression analysis.
254
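(1) Linearity
The criterion variable (Y) and the predictor variables (X) should be linearly related.
This can be checked from the pattern of the residuals. Plot the residuals
($e_i = Y_i - \hat{Y}_i$, the difference between the observed and estimated values) on
the vertical axis against the estimated values ($\hat{Y}$) on the horizontal axis. A
curved pattern indicates a nonlinear relationship between Y and X, in which case a data
transformation can be used to make the relationship linear. A multiple regression model
has two or more predictor variables, so the residuals reflect the combined effect of all
predictors and cannot separate out the effect of any single one. To check linearity,
examine the residual scatterplot: if it shows a curved pattern, as in the figure, a
curvilinear relationship may be present.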
255
256
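(2) Equal variance of the error terms
The second assumption of the multiple regression model is that the error terms have
equal variance. A violation of this assumption is called heteroscedasticity.
To check whether the error variance is constant, examine the residual scatterplot or
apply a simple statistical test. If the scatterplot is triangular or diamond-shaped, as
in the figure, heteroscedasticity may be present. When it is, data transformations can
again be used to remedy it.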
257
258
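(3) Independence of the error terms
Another basic assumption of multiple regression is that each predicted value is
independent, that is, unrelated to any other predicted value. Independence of the error
terms can also be judged from the pattern of the residual plot, as shown in the figure.
Data transformations, such as first differences in a time series model, the addition of
indicator variables, or specially formulated regression models, can be used to handle
violations of this assumption.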
259
260
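(4) Normality of the error-term distribution
The multiple regression model assumes normality of the error terms, which in practice
reflects the normality of the predictor and criterion variables. The simplest check is
to examine a histogram of the residuals, as shown in the figure. If the histogram
approximates a normal distribution, the assumption is usually satisfied. Although
simple, this method is not suitable for small samples, where the shape of the histogram
is not meaningful. In that case, use a normal probability plot, which compares the
standardized residuals with the normal distribution. When normality is violated, a
number of data transformations are available to deal with it.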
261
262
Estimating the Regression Model
In Stage 4, the researcher must accomplish
three basic tasks:
1. Select a method for specifying the
regression model to be estimated,
2. Assess the statistical significance of the
overall model in predicting the
dependent variable, and
3. Determine whether any of the
observations exert an undue influence
on the results.
263
Variable Selection Approaches:
• Confirmatory (Simultaneous)
• Sequential Search Methods:
– Stepwise (variables already in the regression equation can be removed at a later step).
– Forward Inclusion & Backward Elimination.
– Hierarchical.
• Combinatorial (All-Possible-Subsets)
264
Regression Analysis Terms
• Explained variance = R2 (coefficient of determination).
• Unexplained variance = residuals (error).
• Adjusted R-Square = reduces the R2 by taking into account the sample
size and the number of independent variables in the regression model
(it becomes smaller as we have fewer observations per independent
variable).
• Standard Error of the Estimate (SEE) = a measure of the accuracy of
the regression predictions. It estimates the variation of the dependent
variable values around the regression line. It should get smaller as we
add more independent variables, if they predict well.
265
Regression Analysis Terms
Model Summary
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .681(a)  .464      .445               .78794
a. Predictors: (Constant), Work, UnderGPA, GMAT
ANOVA(b)
Model 1     Sum of Squares  df  Mean Square  F       Sig.
Regression  45.597          3   15.199       24.481  .000(a)
Residual    52.772          85  .621
Total       98.369          88
a. Predictors: (Constant), Work, UnderGPA, GMAT
b. Dependent variable: MBAGPA
Coefficients(a)
Model 1     B (unstandardized)  Std. Error  Beta (standardized)  t      Sig.
(Constant)  .466                1.506                            .310   .758
UnderGPA    .063                .120        .042                 .524   .602
GMAT        .011                .001        .650                 8.159  .000
Work        .093                .031        .238                 2.996  .004
a. Dependent variable: MBAGPA
266
Regression Analysis Terms Continued . . .
• Total Sum of Squares (SST) = total amount of variation that exists to be
explained by the independent variables. SST = the sum of SSE and SSR.
• Sum of Squared Errors (SSE) = the variation in the dependent variable not
accounted for by the regression model = residual. The objective is to obtain
the smallest possible sum of squared errors as a measure of prediction
accuracy.
• Sum of Squares Regression (SSR) = the amount of improvement in
explanation of the dependent variable attributable to the independent
variables.
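The decomposition SST = SSR + SSE can be checked against the MBA GPA ANOVA table shown earlier. The short sketch below is only an illustration: the variable names are made up here, and the sums of squares are copied from that table.

```python
# Sums of squares copied from the MBA GPA ANOVA table in this transcript.
ssr = 45.597  # regression (explained) sum of squares
sse = 52.772  # residual (unexplained) sum of squares

# Total variation is the sum of the explained and unexplained parts.
sst = ssr + sse

# Share of the total variation explained by the model (R-square).
r_squared = ssr / sst

print(f"SST = {sst:.3f}")
print(f"R-square = {r_squared:.4f}")
```

The resulting R-square can be compared with the value reported in the model summary.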
267
Assessing Multicollinearity:
The researcher’s task is to:
• Assess the degree of multicollinearity,
• Determine its impact on the results, and
• Apply the necessary remedies if needed.
268
Multicollinearity Diagnostics:
• Variance Inflation Factor (VIF) – measures how much the variance
of the regression coefficients is inflated by multicollinearity
problems. If VIF equals 1, there is no correlation between the
independent measures. A VIF somewhat above 1 is an indication of some
association between predictor variables, but generally not enough
to cause problems. A maximum acceptable VIF value would be 10;
anything higher would indicate a problem with multicollinearity.
• Tolerance – the amount of variance in an independent variable that
is not explained by the other independent variables. If the other
variables explain a lot of the variance of a particular independent
variable, we have a problem with multicollinearity. Thus, small
values for tolerance indicate problems of multicollinearity. The
minimum cutoff value for tolerance is typically .10. That is, the
tolerance value must be smaller than .10 to indicate a problem of
multicollinearity.
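The two diagnostics are linked: tolerance is 1 minus the R-square obtained when one independent variable is regressed on the others, and VIF is the reciprocal of tolerance. A minimal sketch, assuming that R-square is already available (the function name `tolerance_and_vif` is hypothetical):

```python
def tolerance_and_vif(r_squared_j: float) -> tuple[float, float]:
    """Return (tolerance, VIF) for one independent variable.

    r_squared_j is the R-square from regressing that variable on the
    other independent variables; it must be strictly below 1.
    """
    tolerance = 1.0 - r_squared_j
    return tolerance, 1.0 / tolerance


# No correlation with the other predictors, moderate, and the usual cutoff.
for r2 in (0.0, 0.5, 0.9):
    tol, vif = tolerance_and_vif(r2)
    print(f"R-square = {r2:.2f}  tolerance = {tol:.2f}  VIF = {vif:.1f}")
```

This makes the two cutoffs equivalent: a tolerance of .10 corresponds to a VIF of 10.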
269
Assessing Explanatory Power
• Coefficient of Determination.
• Regression Coefficients.
• Variables Entered.
• Multicollinearity?
270
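Validating the Regression Results
Once the best regression model has been identified, the final step is to validate the
results so that the model can be taken to represent the population. The best approach
is to draw a new sample from the same population. The original model can then be
validated in two ways. One is to use the original model to predict the values in the
new sample and compute the predictive fit. The other is to estimate a separate
regression model from the new sample and compare the original and new models on
characteristics such as the important variables included; the sign, size, and relative
importance of the variables; and predictive accuracy.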
271
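Validating the Regression Results
Often, cost, time pressure, or other constraints prevent researchers from collecting
new data. In that case, the sample can be divided into an estimation subsample and a
validation subsample: the regression model is first estimated from the estimation
subsample and then tested, or validated, on the validation subsample.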
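The split-sample idea can be sketched in a few lines. This is an illustration under assumptions, not a prescribed procedure: the function name `split_sample`, the 30% validation share, and the fixed seed are all choices made here for the example.

```python
import random


def split_sample(rows, validation_share=0.3, seed=0):
    """Randomly divide rows into (estimation, validation) subsamples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_share))
    # The first n_validation shuffled rows are held out for validation.
    return shuffled[n_validation:], shuffled[:n_validation]


# Example with 100 observation identifiers.
estimation, validation = split_sample(range(100))
print(len(estimation), len(validation))
```

The model would then be estimated on `estimation` only and its predictions compared with the observed values in `validation`.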
272
Example. Where to locate a new motor inn?
– La Quinta Motor Inns is planning an expansion.
– Management wishes to predict which sites are likely to be
profitable.
– Several areas where predictors of profitability can be identified
are:
• Competition
• Market awareness
• Demand generators
• Demographics
• Physical quality
273
Example
Profitability is measured by the Operating Margin. Candidate predictors, by area:
• Competition – Rooms: number of hotel/motel rooms within 3 miles of the site.
• Market awareness – Nearest: distance to the nearest La Quinta inn.
• Customers – Office space; College enrollment.
• Community – Income: median household income.
• Physical – Disttwn: distance to downtown.
274
Example
• Data were collected from 100 randomly selected inns that
belong to La Quinta and used to fit the following suggested
model:
$\text{Margin} = \beta_0 + \beta_1\,\text{Rooms} + \beta_2\,\text{Nearest} + \beta_3\,\text{Office} + \beta_4\,\text{College} + \beta_5\,\text{Income} + \beta_6\,\text{Disttwn} + \varepsilon$
Margin  Number  Nearest  Office Space  Enrollment  Income  Distance
55.5    3203    4.2      549           8           37      2.7
33.8    2810    2.8      496           17.5        35      14.4
49      2890    2.4      254           20          35      2.6
31.9    3422    3.3      434           15.5        38      12.1
57.4    2687    0.9      678           15.5        42      6.9
49      3759    2.9      635           19          33      10.8
275
Model Diagnostics
276
Model Diagnostics
277
Regression Analysis
Margin = 38.139 - 0.008 Number + 1.646 Nearest + 0.020 Office Space
+ 0.212 Enrollment + 0.413 Income - 0.225 Distance
278
Model Assessment
• The model is assessed using three tools:
– The standard error of estimate
– The coefficient of determination
– The F-test of the analysis of variance
• The standard error of estimate is also used in
computing the other two tools.
279
Standard Error of Estimate
• The standard deviation of the error is estimated
by the Standard Error of Estimate:
$s_e = \sqrt{\dfrac{SSE}{n-k-1}}$
• The magnitude of $s_e$ is judged by comparing it to $\bar{y}$.
280
Standard Error of Estimate
• From the printout, $s_e = 5.5121$
• Calculating the mean value of y, we have
$\bar{y} = 45.739$
• It seems $s_e$ is not particularly small.
• Question:
Can we conclude the model does not fit the data
well?
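The standard error of estimate can be reproduced from the quantities in the ANOVA table for this example. The sketch below is illustrative; the variable names are made up here, and SSE, n, k, and the mean of y are copied from the transcript.

```python
import math

sse = 2825.6    # sum of squared errors from the ANOVA table
n = 100         # number of inns in the sample
k = 6           # number of independent variables
y_bar = 45.739  # mean operating margin

# Standard error of estimate: square root of SSE over its degrees of freedom.
s_e = math.sqrt(sse / (n - k - 1))

# Judging its size relative to the mean of y.
ratio = s_e / y_bar

print(f"s_e = {s_e:.4f}")
print(f"s_e / y_bar = {ratio:.3f}")
```

The ratio is only a rough yardstick, which is why the slide goes on to the coefficient of determination and the F-test.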
281
Coefficient of Determination
• The definition is
$R^2 = 1 - \dfrac{SSE}{\sum (y_i - \bar{y})^2}$
• From the printout, $R^2 = 0.5251$
• 52.51% of the variation in operating margin is explained by the
six independent variables. 47.49% remains unexplained.
• When adjusted for degrees of freedom,
Adjusted $R^2 = 1 - \dfrac{SSE/(n-k-1)}{SS(Total)/(n-1)} = 49.4\%$
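Both measures follow directly from the sums of squares in the ANOVA table for the La Quinta model. A minimal sketch, with variable names chosen here for illustration and the inputs copied from the transcript:

```python
sse = 2825.6       # sum of squared errors
ss_total = 5949.5  # total sum of squares, sum of (y_i - y_bar)^2
n = 100            # sample size
k = 6              # number of independent variables

# Coefficient of determination.
r_squared = 1 - sse / ss_total

# Adjusted for degrees of freedom: each sum of squares is divided by its df.
adjusted_r_squared = 1 - (sse / (n - k - 1)) / (ss_total / (n - 1))

print(f"R-square          = {r_squared:.4f}")
print(f"Adjusted R-square = {adjusted_r_squared:.4f}")
```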
Testing the Validity of the Model
• We pose the question:
Is there at least one independent variable linearly related
to the dependent variable?
• To answer the question we test the hypothesis
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
$H_1$: At least one $\beta_i$ is not equal to zero.
• If at least one bi is not equal to zero, the model has
some validity.
283
Testing the Validity of the La Quinta Inns
Regression Model
• The hypotheses are tested by an ANOVA procedure, with F = MSR/MSE.
ANOVA
            df          SS            MS                        F      Significance F
Regression  k = 6       SSR = 3123.8  MSR = SSR/k = 520.6       17.14  0.0000
Residual    n–k–1 = 93  SSE = 2825.6  MSE = SSE/(n–k–1) = 30.4
Total       n–1 = 99    5949.5
284
Testing the Validity of the La Quinta Inns
Regression Model
[Variation in y] = SSR + SSE.
Large F results from a large SSR. Then, much of the
variation in y is explained by the regression model;
the model is useful, and thus, the null hypothesis
should be rejected. Therefore, the rejection region
is based on
$F = \dfrac{SSR/k}{SSE/(n-k-1)}$
Rejection region: $F > F_{\alpha,\,k,\,n-k-1}$
285
Testing the Validity of the La Quinta Inns
Regression Model
ANOVA
            df  SS      MS     F      Significance F
Regression  6   3123.8  520.6  17.14  0.0000
Residual    93  2825.6  30.4
Total       99  5949.5
$F_{\alpha,\,k,\,n-k-1} = F_{0.05,\,6,\,100-6-1} = 2.17$
F = 17.136 > 2.17
Also, the p-value (Significance F) = 0.0000
Reject the null hypothesis.
Conclusion: There is sufficient evidence to reject
the null hypothesis in favor of the alternative hypothesis.
At least one of the $\beta_i$ is not equal to zero. Thus, at least
one independent variable is linearly related to y.
This linear regression model is valid.
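The F statistic can be recomputed from the ANOVA table. In the sketch below the variable names are illustrative, and the critical value 2.17 is simply the one quoted on the slide rather than something computed here.

```python
ssr = 3123.8   # regression sum of squares
sse = 2825.6   # residual sum of squares
n, k = 100, 6  # sample size and number of independent variables

msr = ssr / k            # mean square regression
mse = sse / (n - k - 1)  # mean square error
f_statistic = msr / mse

f_critical = 2.17  # F(0.05; 6, 93) as quoted on the slide
reject_h0 = f_statistic > f_critical

print(f"MSR = {msr:.1f}, MSE = {mse:.1f}, F = {f_statistic:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```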
286
Interpreting the Coefficients
• b0 = 38.139. This is the intercept, the value of y when all
the variables take the value zero. Since the data ranges
of the independent variables do not cover the value
zero, do not interpret the intercept.
• b1 = –0.008. In this model, for each additional room
within 3 miles of the La Quinta inn, the operating margin
decreases on average by .008% (assuming the other
variables are held constant).
287
Interpreting the Coefficients
• b2 = 1.646. In this model, for each additional mile that the
nearest competitor is to a La Quinta inn, the operating
margin increases on average by 1.646% when the other
variables are held constant.
• b3 = 0.020. For each additional 1000 sq-ft of office space,
the operating margin will increase on average by .02%
when the other variables are held constant.
• b4 = 0.212. For each additional thousand students the
operating margin increases on average by .212% when the
other variables are held constant.
288
Interpreting the Coefficients
• b5 = 0.413. For each additional $1,000 of median
household income, the operating margin increases
on average by .413% when the other variables
are held constant.
• b6 = -0.225. For each additional mile to the
downtown center, the operating margin decreases
on average by .225% when the other variables are
held constant.
289
Testing the Coefficients
• The hypotheses for each $\beta_i$ are
$H_0: \beta_i = 0$
$H_1: \beta_i \neq 0$
• Test statistic
$t = \dfrac{b_i - \beta_i}{s_{b_i}}$, with d.f. = n - k - 1
• Excel printout
              Coefficients  Standard Error  t Stat  P-value
Intercept     38.14         6.99            5.45    0.0000
Number        -0.0076       0.0013          -6.07   0.0000
Nearest       1.65          0.63            2.60    0.0108
Office Space  0.020         0.0034          5.80    0.0000
Enrollment    0.21          0.13            1.59    0.1159
Income        0.41          0.14            2.96    0.0039
Distance      -0.23         0.18            -1.26   0.2107
290
Using the Linear Regression Equation
• The model can be used for making predictions by
– Producing a prediction interval estimate for a particular value of
y, for given values of xi.
– Producing a confidence interval estimate for the expected value
of y, for given values of xi.
• The model can be used to learn about relationships
between the independent variables xi and the dependent
variable y, by interpreting the coefficients bi.
291
La Quinta Inns, Predictions
• Predict the average operating margin of an inn at a site
with the following characteristics:
– 3815 rooms within 3 miles,
– Closest competitor .9 miles away,
– 476,000 sq-ft of office space,
– 24,500 college students,
– $35,000 median household income,
– 11.2 miles distance to downtown center.
MARGIN = 38.139 - 0.008(3815) + 1.646(.9) + 0.020(476)
+ 0.212(24.5) + 0.413(35) - 0.225(11.2) = 37.1%
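The point prediction can be re-computed in a few lines. The sketch below is illustrative only: the dictionary and function names are hypothetical, `equation_coefs` holds the coefficients as printed in the regression equation, and `printout_coefs` holds the coefficients from the Excel printout. With the equation's coefficients the sum comes to about 35.7, while the printout's coefficients give about 37.1, so the 37.1% figure was presumably (an assumption, not stated on the slide) obtained with less heavily rounded coefficients, in particular -0.0076 rather than -0.008 for Number.

```python
# Site characteristics, in the units used by the model.
site = {
    "Number": 3815,       # rooms within 3 miles
    "Nearest": 0.9,       # miles to the closest competitor
    "Office Space": 476,  # thousands of sq-ft
    "Enrollment": 24.5,   # thousands of students
    "Income": 35,         # median household income, $1,000s
    "Distance": 11.2,     # miles to downtown
}

# Coefficients as printed in the regression equation.
equation_coefs = {"Intercept": 38.139, "Number": -0.008, "Nearest": 1.646,
                  "Office Space": 0.020, "Enrollment": 0.212,
                  "Income": 0.413, "Distance": -0.225}

# Coefficients as printed in the Excel printout.
printout_coefs = {"Intercept": 38.14, "Number": -0.0076, "Nearest": 1.65,
                  "Office Space": 0.020, "Enrollment": 0.21,
                  "Income": 0.41, "Distance": -0.23}


def predict(coefs, values):
    """Intercept plus the sum of coefficient times value for each variable."""
    return coefs["Intercept"] + sum(coefs[name] * x for name, x in values.items())


print(f"Equation coefficients:  {predict(equation_coefs, site):.2f}")
print(f"Printout coefficients:  {predict(printout_coefs, site):.2f}")
```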
292
MBA Program Admission Policy
• The dean of a large university wants to raise the
admission standards to the popular MBA program.
• She plans to develop a method that can predict an
applicant’s performance in the program.
• She believes a student’s success can be predicted by:
– Undergraduate GPA
– Graduate Management Admission Test (GMAT) score
– Number of years of work experience
293
MBA Program Admission Policy
• A random sample of students who completed
the MBA was selected.
MBA GPA  UnderGPA  GMAT  Work
8.43     10.89     584   9
6.58     10.38     483   7
8.15     10.39     484   4
8.88     10.73     646   6
.        .         .     .
.        .         .     .
• Develop a plan to decide which applicants to admit.
294
MBA Program Admission Policy
• Solution
– The model to estimate is:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
y = MBA GPA
x1 = undergraduate GPA [UnderGPA]
x2 = GMAT score [GMAT]
x3 = years of work experience [Work]
– The estimated model:
MBA GPA = b0 + b1 UnderGPA + b2 GMAT + b3 Work
295
Regression Diagnostics
• The conditions required for the model assessment to
apply must be checked.
– Is the error variable normally distributed?
Draw a histogram of the residuals.
– Is the error variance constant?
Plot the residuals versus $\hat{y}$.
– Are the errors independent?
Plot the residuals versus the time periods.
– Is multicollinearity (intercorrelation) a problem?
296
Model Diagnostics
297
Model Diagnostics
298
Model Diagnostics
Coefficients(a)
Model 1     B (unstandardized)  Std. Error  Beta (standardized)  t      Sig.  Tolerance  VIF
(Constant)  .466                1.506                            .310   .758
UnderGPA    .063                .120        .042                 .524   .602  .998       1.002
GMAT        .011                .001        .650                 8.159  .000  .996       1.004
Work        .093                .031        .238                 2.996  .004  .998       1.002
a. Dependent variable: MBAGPA
299
MBA Program Admission Policy –
Model Assessment
SUMMARY OUTPUT
Regression Statistics
Multiple R         0.6808
R Square           0.4635
Adjusted R Square  0.4446
Standard Error     0.788
Observations       89
ANOVA
            df  SS     MS     F      Significance F
Regression  3   45.60  15.20  24.48  0.0000
Residual    85  52.77  0.62
Total       88  98.37
           Coefficients  Standard Error  t Stat  P-value
Intercept  0.466         1.506           0.31    0.7576
UnderGPA   0.063         0.120           0.52    0.6017
GMAT       0.011         0.001           8.16    0.0000
Work       0.093         0.031           3.00    0.0036
• 46.35% of the variation in MBA GPA is explained by the model.
• The model is valid (p-value = 0.0000…).
• GMAT score and years of work experience are linearly related to MBA GPA.
• There is insufficient evidence of a linear relationship between undergraduate
GPA and MBA GPA.
300
Example: Programmer Salary Survey
A software firm collected data for a sample of 20
computer programmers. A suggestion was made
that regression analysis could be used to determine
if salary was related to the years of experience and
the score on the firm’s programmer aptitude test.
The years of experience, score on the aptitude test,
and corresponding annual salary ($1,000s) for a
sample of 20 programmers are shown on the next
slide.
301
Example: Programmer Salary Survey
Exper.  Score  Salary    Exper.  Score  Salary
4       78     24        9       88     38
7       100    43        2       73     26.6
1       86     23.7      10      75     36.2
5       82     34.3      5       81     31.6
8       86     35.8      6       74     29
10      84     38        8       87     34
0       75     22.2      4       79     30.1
1       80     23.1      6       94     33.9
6       83     30        3       70     28.2
6       91     33        3       89     30
302
Example: Programmer Salary Survey
• Multiple Regression Model
Suppose we believe that salary (y) is related to the years of
experience (x1) and the score on the programmer aptitude test
(x2) by the following regression model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where
y = annual salary ($1,000s)
x1 = years of experience
x2 = score on programmer aptitude test
303
Example: Programmer Salary Survey
• Multiple Regression Equation
Using the assumption $E(\varepsilon) = 0$, we obtain
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
• Estimated Regression Equation
$b_0, b_1, b_2$ are the least squares estimates of $\beta_0, \beta_1, \beta_2$.
Thus
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$
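To make the least squares step concrete, the sketch below fits the estimated regression equation to the 20 programmer observations listed in this example by solving the normal equations $(X'X)b = X'y$. It is a plain-Python illustration, not the computer package used in the slides; the function names `solve_linear_system` and `least_squares` are made up here.

```python
exper = [4, 7, 1, 5, 8, 10, 0, 1, 6, 6, 9, 2, 10, 5, 6, 8, 4, 6, 3, 3]
score = [78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
         88, 73, 75, 81, 74, 87, 79, 94, 70, 89]
salary = [24, 43, 23.7, 34.3, 35.8, 38, 22.2, 23.1, 30, 33,
          38, 26.6, 36.2, 31.6, 29, 34, 30.1, 33.9, 28.2, 30]


def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(predictor_rows, y):
    """Return [b0, b1, ...] minimizing the sum of squared residuals."""
    design = [[1.0] + list(row) for row in predictor_rows]  # intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


coefficients = least_squares(list(zip(exper, score)), salary)
residuals = [yi - (coefficients[0] + coefficients[1] * x1 + coefficients[2] * x2)
             for x1, x2, yi in zip(exper, score, salary)]
print("b0, b1, b2 =", [round(b, 4) for b in coefficients])
```

The printed estimates can be compared with the coefficients in the computer output for this example.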
304
Example: Programmer Salary Survey
• Solving for the Estimates of $\beta_0, \beta_1, \beta_2$
Input Data → Computer Package for Solving Multiple Regression Problems → Least Squares Output
Input Data
x1  x2   y
4   78   24
7   100  43
.   .    .
.   .    .
3   89   30
Least Squares Output
b0 =
b1 =
b2 =
R2 =
etc.
305
Model Diagnostics
306
Model Diagnostics
307
Example: Programmer Salary Survey
• Computer Output
The regression is
Salary = 3.17 + 1.40 Exper + 0.251 Score
Predictor  Coef    Stdev   t-ratio  p
Constant   3.174   6.156   .52      .613
Exper      1.4039  .1986   7.07     .000
Score      .25089  .07735  3.24     .005
s = 2.419   R-sq = 83.4%   R-sq(adj) = 81.5%
308
Example: Programmer Salary Survey
• Computer Output (continued)
Analysis of Variance
SOURCE      DF  SS      MS      F      P
Regression  2   500.33  250.16  42.76  0.000
Error       17  99.46   5.85
Total       19  599.79
309
Example: Programmer Salary Survey
• F Test
– Hypotheses
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: One or both of the parameters
is not equal to zero.
– Rejection Rule
For $\alpha$ = .05 and d.f. = 2, 17: F.05 = 3.59
Reject H0 if F > 3.59.
– Test Statistic
F = MSR/MSE = 250.16/5.85 = 42.76
– Conclusion
We can reject H0.
310
Example: Programmer Salary Survey
• t Test for Significance of Individual Parameters
– Hypotheses
$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$
– Rejection Rule
For $\alpha$ = .05 and d.f. = 17, t.025 = 2.11
Reject H0 if |t| > 2.11
– Test Statistics
$\dfrac{b_1}{s_{b_1}} = \dfrac{1.4039}{.1986} = 7.07$
$\dfrac{b_2}{s_{b_2}} = \dfrac{.25089}{.07735} = 3.24$
– Conclusions
Reject $H_0: \beta_1 = 0$
Reject $H_0: \beta_2 = 0$
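The two t ratios are just each coefficient divided by its standard deviation, compared in absolute value with the critical value. A short sketch using the numbers from the computer output (the dictionary layout is an illustrative choice, and 2.11 is the critical value quoted in the rejection rule):

```python
# (coefficient, standard deviation) pairs from the computer output.
estimates = {"Exper": (1.4039, 0.1986), "Score": (0.25089, 0.07735)}
t_critical = 2.11  # t(.025) with 17 degrees of freedom, as quoted above

t_ratios = {name: coef / stdev for name, (coef, stdev) in estimates.items()}

for name, t in t_ratios.items():
    decision = "reject H0" if abs(t) > t_critical else "do not reject H0"
    print(f"{name}: t = {t:.2f} -> {decision}")
```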
311
Qualitative Independent Variables
• In many situations we must work with qualitative independent
variables such as gender (male, female), method of payment
(cash, check, credit card), etc.
• For example, x2 might represent gender where x2 = 0
indicates male and x2 = 1 indicates female.
• In this case, x2 is called a dummy or indicator variable.
• If a qualitative variable has k levels, k - 1 dummy variables
are required, with each dummy variable being coded as 0 or
1.
• For example, a variable with levels A, B, and C would be
represented by x1 and x2 values of (0, 0),
(1, 0), and (0,1), respectively.
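The k - 1 coding rule can be sketched as a small helper. The function name `dummy_code` is hypothetical, and treating the first listed level as the reference category is an assumption made for this illustration.

```python
def dummy_code(value, levels):
    """Code a qualitative value as k - 1 indicator (0/1) variables.

    levels[0] is the reference category and is coded as all zeros.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return tuple(1 if value == level else 0 for level in levels[1:])


levels = ["A", "B", "C"]
for value in levels:
    print(value, dummy_code(value, levels))
```

Each coefficient on a dummy variable is then interpreted relative to the reference category.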
312
Example: Programmer Salary Survey (B)
As an extension of the problem involving the
computer programmer salary survey, suppose that
management also believes that the annual salary is
related to whether or not the individual has a
graduate degree in computer science or information
systems.
The years of experience, the score on the
programmer aptitude test, whether or not the
individual has a relevant graduate degree, and the
annual salary ($1,000) for each of the sampled 20
programmers are shown on the next slide.
313
Example: Programmer Salary Survey (B)
Exp.  Score  Degr.  Salary    Exp.  Score  Degr.  Salary
4     78     No     24        9     88     Yes    38
7     100    Yes    43        2     73     No     26.6
1     86     No     23.7      10    75     Yes    36.2
5     82     Yes    34.3      5     81     No     31.6
8     86     Yes    35.8      6     74     No     29
10    84     Yes    38        8     87     Yes    34
0     75     No     22.2      4     79     No     30.1
1     80     No     23.1      6     94     Yes    33.9
6     83     No     30        3     70     No     28.2
6     91     Yes    33        3     89     No     30
314
Example: Programmer Salary Survey (B)
• Multiple Regression Equation
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
• Estimated Regression Equation
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$
where
y = annual salary ($1,000)
x1 = years of experience
x2 = score on programmer aptitude test
x3 = 0 if individual does not have a grad. degree,
1 if individual does have a grad. degree
Note: x3 is referred to as a dummy variable.
315
Model Diagnostics
316
Model Diagnostics
317
Example: Programmer Salary Survey (B)
• Computer Output
The regression is
Salary = 7.95 + 1.15 Exp + 0.197 Score + 2.28 Deg
Predictor  Coef    Stdev  t-ratio  p
Constant   7.945   7.381  1.08     .298
Exp        1.1476  .2976  3.86     .001
Score      .19694  .0899  2.19     .044
Deg        2.280   1.987  1.15     .268
s = 2.396   R-sq = 84.7%   R-sq(adj) = 81.8%
318
Example: Programmer Salary Survey (B)
• Computer Output (continued)
Analysis of Variance
SOURCE      DF  SS      MS      F      P
Regression  3   507.90  169.30  29.48  0.000
Error       16  91.89   5.74
Total       19  599.79
319
Diagnostics: Multicollinearity
• Example: Predicting house price
– A real estate agent believes that a house's selling price can be
predicted using the house size, number of bedrooms, and lot
size.
– A random sample of 100 houses was drawn and data recorded.
Price   Bedrooms  H Size  Lot Size
124100  3         1290    3900
218300  4         2080    6600
117800  3         1250    3750
.       .         .       .
.       .         .       .
– Analyze the relationship among the four variables.
320
Model Diagnostics
321
Model Diagnostics
322
Diagnostics: Multicollinearity
• The proposed model is
$\text{PRICE} = \beta_0 + \beta_1\,\text{BEDROOMS} + \beta_2\,\text{H-SIZE} + \beta_3\,\text{LOTSIZE} + \varepsilon$
SUMMARY OUTPUT
Regression Statistics
Multiple R         0.7483
R Square           0.5600
Adjusted R Square  0.5462
Standard Error     25023
Observations       100
ANOVA
            df  SS            MS           F      Significance F
Regression  3   76501718347   25500572782  40.73  0.0000
Residual    96  60109046053   626135896
Total       99  136610764400
            Coefficients  Standard Error  t Stat  P-value
Intercept   37718         14177           2.66    0.0091
Bedrooms    2306          6994            0.33    0.7423
House Size  74.30         52.98           1.40    0.1640
Lot Size    -4.36         17.02           -0.26   0.7982
The model is valid, but no variable is significantly related
to the selling price?!
323
Diagnostics: Multicollinearity
• Multicollinearity is found to be a problem.
          Price   Bedrooms  H Size  Lot Size
Price     1
Bedrooms  0.6454  1
H Size    0.7478  0.8465    1
Lot Size  0.7409  0.8374    0.9936  1
• Multicollinearity causes two kinds of difficulties:
– The t statistics appear to be too small.
– The b coefficients cannot be interpreted as “slopes”.
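One way to quantify the problem from the correlation matrix alone is to use the fact that the VIFs are the diagonal elements of the inverse of the correlation matrix of the independent variables. The sketch below applies this to the three predictor correlations shown above. It is a plain-Python illustration (the `invert` helper is written here, not taken from any package), and because the correlations are rounded to four decimals the results are only approximate.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


names = ["Bedrooms", "H Size", "Lot Size"]
# Correlations among the independent variables (Price is excluded).
predictor_corr = [
    [1.0,    0.8465, 0.8374],
    [0.8465, 1.0,    0.9936],
    [0.8374, 0.9936, 1.0],
]

inverse = invert(predictor_corr)
vifs = [inverse[i][i] for i in range(len(names))]
for name, vif in zip(names, vifs):
    print(f"{name}: VIF = {vif:.1f}, tolerance = {1 / vif:.3f}")
```

House size and lot size come out far above the cutoff of 10, which is consistent with their 0.9936 correlation.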
324
Model Diagnostics
Coefficients(a)
Model 1     B (unstandardized)  Std. Error  Beta (standardized)  t      Sig.  Tolerance  VIF
(Constant)  37717.595           14176.742                        2.661  .009
Bedrooms    2306.081            6994.192    .042                 .330   .742  .282       3.540
HouseSize   74.297              52.979      .865                 1.402  .164  .012       83.067
Lotsize     -4.364              17.024      -.154                -.256  .798  .013       78.841
a. Dependent variable: Price
325
Durbin - Watson Test:
Are the Errors Autocorrelated?
• This test detects first order autocorrelation
between consecutive residuals in a time series.
• If autocorrelation exists, the error variables are
not independent.
$d = \dfrac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$, where $e_i$ is the residual at time i.
The range of d is $0 \le d \le 4$.
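The statistic is straightforward to compute from a list of residuals ordered in time. A minimal sketch (the function name `durbin_watson` and the toy residual series are illustrative, not data from the slides):

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Consecutive residuals that are identical push d toward 0;
# residuals that flip sign every period push d toward 4.
print(durbin_watson([1, 1, 1, 1]))    # similar neighbours
print(durbin_watson([1, -1, 1, -1]))  # alternating neighbours
```

Values near 2 correspond to no first order autocorrelation.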
326
Positive First Order Autocorrelation
[Figure: residuals plotted against time]
Positive first order autocorrelation occurs when
consecutive residuals tend to be similar. Then,
the value of d is small (less than 2).
327
Negative First Order Autocorrelation
[Figure: residuals plotted against time]
Negative first order autocorrelation occurs when
consecutive residuals tend to markedly differ.
Then, the value of d is large (greater than 2).
328
One tail test for Positive First Order
Autocorrelation
• If d < dL, there is enough evidence to show that
positive first order correlation exists.
• If d > dU, there is not enough evidence to show that
positive first order correlation exists.
• If d is between dL and dU, the test is inconclusive.
[Figure: number line for d. Below dL: positive first order correlation exists.
Between dL and dU: inconclusive test. Above dU: positive first order correlation
does not exist.]
329
One Tail Test for Negative First Order
Autocorrelation
• If d > 4-dL, negative first order correlation exists.
• If d < 4-dU, negative first order correlation does
not exist.
• If d falls between 4-dU and 4-dL, the test is
inconclusive.
[Figure: number line for d. Below 4-dU: negative first order correlation does not
exist. Between 4-dU and 4-dL: inconclusive test. Above 4-dL: negative first order
correlation exists.]
330
Two-Tail Test for First Order
Autocorrelation
• If d < dL or d > 4-dL, first order autocorrelation exists.
• If d falls between dL and dU or between 4-dU and
4-dL, the test is inconclusive.
• If d falls between dU and 4-dU, there is no evidence
for first order autocorrelation.
[Figure: number line for d from 0 to 4. From 0 to dL: first order correlation exists.
From dL to dU: inconclusive test. From dU to 4-dU (around 2): first order correlation
does not exist. From 4-dU to 4-dL: inconclusive test. From 4-dL to 4: first order
correlation exists.]
331
Testing the Existence of Autocorrelation, Example
• Example
– How does the weather affect the sales of lift tickets in a ski
resort?
– Data on ticket sales for the past 20 years, along with the total
snowfall and the average temperature during Christmas week in
each year, were collected.
– The model hypothesized was
$\text{TICKETS} = \beta_0 + \beta_1\,\text{SNOWFALL} + \beta_2\,\text{TEMPERATURE} + \varepsilon$
– Regression analysis yielded the following results:
332
The Regression Equation –
Assessment (I)
The model seems to be very poor:
• R-square = 0.1200
• It is not valid (Signif. F = 0.3373)
• No variable is linearly related to Sales
SUMMARY OUTPUT
Regression Statistics
Multiple R         0.3465
R Square           0.1200
Adjusted R Square  0.0165
Standard Error     1712
Observations       20
ANOVA
            df  SS        MS       F     Signif. F
Regression  2   6793798   3396899  1.16  0.3373
Residual    17  49807214  2929836
Total       19  56601012
           Coefficients  Standard Error  t Stat  P-value
Intercept  8308.0        903.73          9.19    0.0000
Snowfall   74.59         51.57           1.45    0.1663
Tempture   -8.75         19.70           -0.44   0.6625
333
Diagnostics: The Error Distribution
[Figure: histogram of the errors, with bins from -2.5 to 2.5]
The errors may be
normally distributed
334
Diagnostics: Heteroscedasticity
[Figure: residuals vs. predicted y]
It appears there is no problem of heteroscedasticity
(the error variance seems to be constant).
335
Diagnostics: First Order Autocorrelation
[Figure: residuals over time]
The errors are not independent!!
336
Diagnostics: First Order Autocorrelation
Durbin-Watson Statistic
The residuals: -2793.99, -1723.23, -2342.03, -956.955, -1963.73, ...
d = 0.5931
Test for positive first order autocorrelation:
n=20, k=2. From the Durbin-Watson
table we have: dL=1.10, dU=1.54.
The statistic d=0.5931
Conclusion: Because d<dL, there is
sufficient evidence to infer that positive
first order autocorrelation exists.
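The decision rule for the one-tail test can be written as a small function and applied to the values in this example. The function name and the wording of the returned labels are illustrative choices; dL and dU still have to be looked up in a Durbin-Watson table for the given n and k.

```python
def positive_autocorrelation_test(d, d_lower, d_upper):
    """One-tail Durbin-Watson test for positive first order autocorrelation."""
    if d < d_lower:
        return "positive first order autocorrelation exists"
    if d > d_upper:
        return "no evidence of positive first order autocorrelation"
    return "test is inconclusive"


# Values from the ski resort example: n = 20, k = 2.
result = positive_autocorrelation_test(d=0.5931, d_lower=1.10, d_upper=1.54)
print(result)
```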
337
The Modified Model: Time Included
The modified regression model
$\text{TICKETS} = \beta_0 + \beta_1\,\text{SNOWFALL} + \beta_2\,\text{TEMPERATURE} + \beta_3\,\text{TIME} + \varepsilon$
• All the required conditions are met for this model.
• The fit of this model is high: $R^2 = 0.7410$.
• The model is valid. Significance F = .0001.
• SNOWFALL and TIME are linearly related to ticket sales.
• TEMPERATURE is not linearly related to ticket sales.
338
References
• Statistics (統計學), 謝邦昌: Ch. 12 – Ch. 17