The Central Limit Theorem (CLT)
Given a large random sample of size n from a population with mean μ and standard deviation σ, the sample mean X̄ is approximately normally distributed with mean and standard deviation given by
μ_X̄ = μ and σ_X̄ = σ/√n
Note: 1. n > 30 is usually large enough for the CLT to apply.
2. If the population from which we sample is normal, then X̄ is exactly normally distributed with the mean and standard deviation above for any sample size.
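For readers who want to see the CLT at work outside Minitab, here is a minimal simulation sketch in Python (NumPy assumed available; the exponential population and the sample size are illustrative choices, not from the notes):

import numpy as np

rng = np.random.default_rng(1)

# A deliberately non-normal population: exponential with mean 2 (assumed for illustration).
population_mean, n, reps = 2.0, 40, 10_000

# Draw many samples of size n and record each sample mean.
sample_means = rng.exponential(scale=population_mean, size=(reps, n)).mean(axis=1)

# By the CLT the means should be roughly normal with mean mu and sd sigma/sqrt(n).
print("mean of X-bar:", sample_means.mean())        # close to 2
print("sd of X-bar:  ", sample_means.std(ddof=1))   # close to 2/sqrt(40), about 0.316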
Empirical Rule for X
Consider a sample of size n from a population with mean  and standard deviation
. Suppose X is normal ( or approximately normal), with  =  and  = /n
X
X
(This would be the case if the population is normal or if the sample size is large).
Find the probability that X will be within (a) 2 of  (b) 3
of  .
X
X
(a) P( X will be within 2
of  )
X
=
(2)
(b) P( X will be within 3 of  )
X
=
In general the statement “X will be within k of  “ means that X lies between
X
-k
X
and
 +k 
X
If X is normal ( or approximately normal), then
P( X will be within k of  ) = P(-k < Z <k)
X
(3)
Z Confidence Interval
Suppose we are given the following:
Normal Population: Scores on a standardized test.
Population Mean: μ (unknown)
Population S.D.: σ = 1.5
To estimate μ we will take an srs of size n = 25 and use X̄ as our estimator. Recall that since the population is normal, X̄ is normally distributed with μ_X̄ = μ and σ_X̄ = σ/√n = 1.5/5 = .3.
We would like to be able to express this estimate in the form X̄ ± E or (X̄ − E, X̄ + E). Here E is some error which determines the accuracy of our estimate. Let's take E = 2σ_X̄ for now.
Thus we have
For any given sample this interval may or may not contain the true mean μ. It would be useful to know what the probability is that this interval covers μ.
If the interval covers the true mean μ, then μ is somewhere in the interval, so that X̄ is in fact within 2σ_X̄ (= 0.6) of μ.
Thus P[(X̄ − 2σ_X̄, X̄ + 2σ_X̄) covers μ]
= P(X̄ is within 2σ_X̄ of μ)
=
=
To make the probability above a nice number, .95, we should replace 2 by 1.96. Thus we can say:
"For 95% of all samples of size n = 25, the interval (X̄ − 1.96σ_X̄, X̄ + 1.96σ_X̄) will cover the true value of μ."
Or,
"For 95% of all samples of size n = 25, X̄ will be within 1.96σ_X̄ of the true population mean μ."
The 95% value is called the LEVEL OF CONFIDENCE. This tells us the probability the interval will cover μ.
The 1.96σ_X̄ = .588 is called the margin of error. This tells us how accurate X̄ is (i.e. how close X̄ will be to μ for 95% of all samples).
The interval (X̄ − 1.96σ_X̄, X̄ + 1.96σ_X̄) is called a 95% Z-CONFIDENCE INTERVAL.
The simulation below will illustrate how confidence intervals work.
MTB > random 25 c1-c40;
SUBC> norm 10 1.5.
MTB > zint 95 1.5 c1-c40.
[The first two command lines select 40 random samples, each of size n = 25, from a normal distribution with μ = 10 and σ = 1.5. The third command line forms the 95% Z-confidence interval for each sample.]
Confidence Intervals (The assumed sigma = 1.5)

Variable   N    Mean   StDev   SE Mean          95.0% CI
C1        25  10.459   1.661     0.300  ( 9.871, 11.047)
C2        25   9.826   1.486     0.300  ( 9.238, 10.414)
C3        25  10.388   1.600     0.300  ( 9.800, 10.976)
C4        25   9.741   1.297     0.300  ( 9.153, 10.329)
C5        25  10.441   1.766     0.300  ( 9.853, 11.029)
C6        25  10.331   1.637     0.300  ( 9.743, 10.919)
C7        25   8.941   1.264     0.300  ( 8.353,  9.529)
C8        25  10.205   1.627     0.300  ( 9.617, 10.793)
C9        25  10.163   1.560     0.300  ( 9.575, 10.751)
C10       25  10.009   1.619     0.300  ( 9.421, 10.597)
C11       25  10.455   1.787     0.300  ( 9.867, 11.043)
C12       25  10.365   1.220     0.300  ( 9.777, 10.953)
C13       25  10.626   1.475     0.300  (10.038, 11.214)
C14       25  10.090   1.677     0.300  ( 9.502, 10.678)
C15       25  10.339   1.103     0.300  ( 9.751, 10.927)
C16       25  10.208   1.480     0.300  ( 9.620, 10.796)
C17       25  10.356   1.508     0.300  ( 9.768, 10.944)
C18       25   9.943   1.388     0.300  ( 9.355, 10.531)
C19       25  10.015   1.318     0.300  ( 9.427, 10.603)
C20       25   9.924   1.473     0.300  ( 9.336, 10.512)
C21       25  10.037   1.271     0.300  ( 9.449, 10.625)
C22       25   9.490   1.345     0.300  ( 8.902, 10.078)
C23       25   9.972   1.484     0.300  ( 9.384, 10.560)
C24       25  10.330   1.644     0.300  ( 9.742, 10.918)
C25       25   9.635   1.609     0.300  ( 9.047, 10.223)
C26       25   9.292   1.558     0.300  ( 8.704,  9.880)
C27       25  10.053   1.072     0.300  ( 9.465, 10.641)
C28       25   9.484   1.726     0.300  ( 8.896, 10.072)
C29       25  10.666   1.402     0.300  (10.078, 11.254)
C30       25   9.896   1.640     0.300  ( 9.308, 10.484)
C31       25   9.942   1.583     0.300  ( 9.354, 10.530)
C32       25  10.100   1.657     0.300  ( 9.512, 10.688)
C33       25   9.483   1.496     0.300  ( 8.895, 10.071)
C34       25   9.691   1.623     0.300  ( 9.103, 10.279)
C35       25  10.390   1.369     0.300  ( 9.802, 10.978)
C36       25  10.569   1.178     0.300  ( 9.981, 11.157)
C37       25   9.813   1.326     0.300  ( 9.225, 10.401)
C38       25   9.905   1.489     0.300  ( 9.317, 10.493)
C39       25  10.442   1.405     0.300  ( 9.854, 11.030)
C40       25   9.945   1.919     0.300  ( 9.357, 10.533)
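The same simulation can be reproduced without Minitab. The sketch below (Python with NumPy, assumed available) draws 40 samples of size 25 from N(10, 1.5), forms the 95% z-interval X̄ ± 1.96 σ/√n for each, and counts how many cover μ = 10:

import numpy as np

rng = np.random.default_rng(2024)
mu, sigma, n, samples = 10.0, 1.5, 25, 40

covered = 0
for _ in range(samples):
    x = rng.normal(mu, sigma, size=n)
    margin = 1.96 * sigma / np.sqrt(n)        # 1.96 * 0.3 = 0.588
    lo, hi = x.mean() - margin, x.mean() + margin
    covered += (lo < mu < hi)

print(f"{covered} of {samples} intervals cover mu = {mu}")   # typically about 38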
QUESTIONS
1. (a) In theory, how many of the above intervals would you expect to cover the true population mean μ (= 10)?
(b) In fact, how many actually do?
2. Suppose you selected 40 samples of size n = 25 from a real population (where typically the population mean and standard deviation are unknown).
(a) Could you form a 95% Z-confidence interval for each sample? Explain.
(b) If we knew σ and formed forty 95% Z-confidence intervals, then we would expect 38 intervals to cover the population mean μ. Could you tell which intervals will cover μ? Explain.
Note: (i) 100(1-)% Z-confidence interval of  is given by
X  Z/2  ; where
X
 =  /n
X
(ii) For 95% Z –confidence interval ,  = .05. hence 95% Z-confidence interval of 
is
X  1.96 
;
X
where  =  /n
X
(iii) 99% Z-confidence interval of  is
X  2.576 
;
X
where  =  /n
X
(iv) 90% Z-Confidence Interval of  is
X  1.645 
;
X
where  =  /n
X
(8)
Sample Size for Desired Margin of Error
Problem: Suppose you wish to estimate a population mean μ with a specified margin of error m and level of confidence 100(1−α)%. What sample size should be used?
Solution: We know that m = z_{α/2} σ/√n. Solving this equation for n gives
n = [z_{α/2} σ/m]²
Note: In practice, we will round the sample size up to the next whole number.
Example 6.5; Page 425: Tim Kelley of Example 6.2 has decided that he wants his estimate of his monthly weight accurate to within 2 or 3 pounds with 95% confidence. How many measurements must he take to achieve these margins of error?
For this example it is known that σ ≈ 3.
(i) For 95% confidence and a margin of error of 2 pounds we have
n = [z_{α/2} σ/m]² = [(1.96)(3)/2]² = 8.6 ≈ 9 (always round up to the next whole number)
(ii) For 95% confidence and a margin of error of 3 pounds we have
n = [z_{α/2} σ/m]² = [(1.96)(3)/3]² = 3.8 ≈ 4.
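A small sketch of the same arithmetic in Python (standard library only), reproducing the two margins of error from Example 6.5:

import math

def sample_size(sigma, m, z=1.96):
    # n = (z*sigma/m)^2, rounded up to the next whole number.
    return math.ceil((z * sigma / m) ** 2)

print(sample_size(sigma=3, m=2))   # 9   (8.64 rounded up)
print(sample_size(sigma=3, m=3))   # 4   (3.84 rounded up)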
Problem 6.16; page 432: You are planning a survey of starting salaries for recent
liberal arts major graduates from your college. From a pilot study you estimate that
the standard deviation is about $8000. What sample size do you need to have a
margin of error equal to $ 500 with 95% confidence?
STATISTICAL INFERENCE
Let us begin with a review of some basic definitions.
POPULATION: The set of all measurements or objects of interest in a particular
study.
If the entire population were available for analysis we would know everything about
it. However, in practice one cannot know the entire population because it is either
too expensive, or simply impossible or impractical to examine each member. Thus a
sample from the population is used to obtain information about the population.
SAMPLE: A subset of the population.
The sample picked should be “representative” of the population from which it
comes and should avoid any bias which might skew our view of the population. One
way to achieve this is to use a SIMPLE RANDOM SAMPLE (srs) i.e. a sample
chosen in such a way that each member of the population has an equal chance of
being chosen.
INFERENTIAL STATISTICS: deals with procedures which use the sample to draw
conclusions about the population (from which it was drawn). The procedures of
interest to us are CONFIDENCE INTERVALS and HYPOTHESIS TESTS.
In particular we will be interested in drawing conclusions about certain
characteristics of the population. Such characteristics are known as POPULATION PARAMETERS. Examples of such characteristics are a POPULATION MEAN (denoted by the Greek letter μ) and a POPULATION PROPORTION (denoted by the letter p).
EXAMPLE: Consider the population of weights ( in kg) of all newborn babies in
Canada for a particular year. In this case, the POPULATION MEAN  is the
average weight of all newborns in the population. An investigator may want to use a
simple random sample of these weights to determine if there is sufficient evidence to
answer questions like:
Is  > 3.2 kg? or Is  < 3.2 kg? or Is   3.2 kg?
EXAMPLE: Consider the population of all lakes in Nova Scotia. A biologist may be
interested in the following POPULATION PROPORTION:
p = the proportion of all lakes in Nova Scotia that are seriously affected by acid rain.
She may want to use a simple random sample of lakes from this population to
determine if there is sufficient evidence to answer questions like:
Is p > .7? or, Is p < .7? or, Is p ≠ .7?
When drawing conclusions about a population using information from a sample it is
important to realize that one can NEVER be absolutely certain the conclusion is
correct. This is because a sample, though it may be “representative” of the
population, only contains part of all the information contained in the population.
HYPOTHESIS TESTING
Example: A graduate student claims that over 70% of the lakes in Nova Scotia have
been seriously affected by acid rain. To justify this claim she proposes the following
`test`.
“ Choose a simple random sample of 15 lakes in Nova Scotia. If 11 or more of the
sampled lakes are seriously affected by acid rain, the claim is justified.”
Formally, we set up this test as follows.
First notice that the population of interest to this graduate student is the set of all
lakes in Nova Scotia. The parameter of interest in her investigation is
p=the true proportion of all lakes in Nova Scotia affected by the acid rain [p=the
unknown population proportion].
NULL HYPOTHESIS: What we want to reject; the viewpoint opposite to Ha.
ALTERNATIVE HYPOTHESIS: The research hypothesis; what we want to prove.
H0:
Ha:
TEST STATISTIC ( evidence from the sample used to make a decision)
X=
Distribution of X :
Now in conducting this test we should make use of the fact that large values of X would be consistent with the _______________________ hypothesis that p > .7.
How large an X? Let's pick some number c and decide that if X ≥ c we conclude that _____________. Thus if X < c we must conclude that ___________. The value c is called a CRITICAL VALUE. In this example, the graduate student has decided to use c = 11. Her method for making a decision can be described as follows.
REJECTION OR CRITICAL REGION (rule for making a decision)
Now suppose she conducts her study and observes X ≥ 11. Then she would claim to have shown that Ha: p > .7 is true. If you had to use her study to make a policy decision, the first question you should ask is:
"What is the probability that her claim is wrong? That is, what is the probability of getting X ≥ 11 when in fact H0: p ≤ .7 is true?"
Let's find out by doing the calculations below.

Suppose that H0: p ≤ .7 is true    Probability of a wrong decision, P(Reject H0 | H0 is true)
p = .5                             P(X ≥ 11 | p = .5) = 1 − P(X ≤ 10) =
p = .6                             P(X ≥ 11 | p = .6) = 1 − P(X ≤ 10) =
p = .7                             P(X ≥ 11 | p = .7) = 1 − P(X ≤ 10) =

The error of rejecting H0 when in fact H0 is true is called a TYPE I ERROR. Notice that in this example the largest probability of making a Type I error is _____________ and that it occurs when the value of p is _____________ (that is, on the boundary between H0 and Ha). The largest probability of making a Type I error is called the LEVEL OF SIGNIFICANCE or TYPE I ERROR RATE of the test and is denoted by the Greek letter α.
Conversely, suppose that the graduate student observed X < 11 (i.e. X ≤ 10), thus leading to the claim that H0: p ≤ .7 is true. In this case you should ask:
"What is the probability that her claim is wrong? That is, what is the probability of getting X < 11 when in fact Ha: p > .7 is true?"
Let's find out by doing the calculations below.

Suppose that Ha: p > .7 is true    Probability of a wrong decision, P(Accept H0 | Ha true)    Probability of a correct decision, P(Reject H0 | Ha true)
p = .8                             P(X < 11 | p = .8) = P(X ≤ 10) =                            P(X ≥ 11 | p = .8) = 1 − P(X ≤ 10) =
p = .9                             P(X < 11 | p = .9) = P(X ≤ 10) =                            P(X ≥ 11 | p = .9) = 1 − P(X ≤ 10) =

The error of accepting H0 when in fact Ha is true is called a TYPE II ERROR. For a particular value of p, say p1, in the alternative (i.e. p1 > .7), the probability of making a Type II error is called the TYPE II ERROR RATE evaluated at p = p1. This probability is denoted by β(p1). Thus,
β(p1) = P(Accept H0 | p = p1 in Ha)
Also, for a particular value of p, say p1, in the alternative (i.e. p1 > .7) we can calculate the probability of a correct decision (see the last column of the table above). The probability of making a correct decision, that is, rejecting H0 when in fact Ha is true, is called the POWER OF THE TEST AGAINST THE ALTERNATIVE p1 in Ha and is denoted K(p1). Thus
K(p1) = P(Reject H0 | p = p1 in Ha)
Notice that K(p1) and β(p1) are related by K(p1) = 1 − β(p1). If in fact Ha is true, power is a measure of a test's ability to detect this. For example, if in fact p were actually .8 (.9), this test will detect this with probability ___________ (__________).
A good test, that is one in whose results we can be confident, will be one in which the probabilities of the Type I and Type II errors are small.
The ideas discussed above refer to the ERROR STRUCTURE of a test. A summary is provided below.

                                    ACTUAL SITUATION
DECISION                            H0 is True          Ha is True (H0 is false)
Accept H0 (Do not reject H0)        Correct Decision    Type II Error
Reject H0 (Accept Ha)               Type I Error        Correct Decision
QUESTION: For the student’s test above, state in words the consequence of
making a
(a) Type I Error:
(b) Type II Error:
ERROR RATES AND POWER OF A TEST

TYPE I ERROR: Reject H0 when H0 is true.
P(Type I Error) = P(Reject H0 | H0 true). The largest possible probability of a Type I error is denoted by α and is called the LEVEL OF SIGNIFICANCE or TYPE I ERROR RATE of the test. In calculating α = P(Reject H0 | H0 true), use the value of p right on the boundary between H0 and Ha.

TYPE II ERROR: Accept H0 when Ha is true.
β(p1) = P(Type II Error) = P(Accept H0 | p = p1 in Ha)

POWER AGAINST THE ALTERNATIVE p1:
K(p1) = P(Reject H0 | p = p1 in Ha) = 1 − β(p1)
In the case that Ha is true, power is a measure of the sensitivity of the test, i.e. the ability of the test to detect that Ha is true.
Changing the Rejection Region
Question: If we use the same sample size, how can we modify this test in order to reduce the Type I error rate α?
Suppose we take c = 14, so we reject H0 if X ≥ 14. What is α?
In this case what will happen to the Type II error rate β(p1) and the power K(p1)?
NOTE: Ideally, we would like α and β(p) to be zero and K(p) to be 1; but for fixed n, decreasing α causes β(p) to increase and K(p) to decrease.
NOTE: The only way to decrease both α and β(p) is to increase the sample size.
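The error rates and power above come straight from cumulative binomial probabilities. A sketch of the calculation (Python with scipy.stats, assumed available) for the lake test with n = 15 and the two critical values discussed, c = 11 and c = 14:

from scipy.stats import binom

n = 15

def alpha(c, p0=0.7):
    # Type I error rate: P(X >= c | p = p0), with p0 on the H0/Ha boundary.
    return 1 - binom.cdf(c - 1, n, p0)

def power(c, p1):
    # K(p1) = P(X >= c | p = p1); beta(p1) = 1 - power.
    return 1 - binom.cdf(c - 1, n, p1)

for c in (11, 14):
    print(f"c = {c}: alpha = {alpha(c):.4f}, "
          f"power at p=0.8 = {power(c, 0.8):.4f}, "
          f"power at p=0.9 = {power(c, 0.9):.4f}")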
The P-value
Consider the test: H0: p  .70, Ha: p > .7, n=30; Reject H0 if X  26.
Suppose we conduct the test and observe X to be x0 = 28. According to the rejection
region we would reject H0 . We would in fact have rejected H0 even if our critical
value had been 28. But with a critical value of 28, the type I error rate would be
smaller.
The P-value is the smallest type I error rate at which one can reject H0 on the basis
of the observed outcome x0 . It is obtained by replacing the critical value ‘c’ by x0 in
the calculation of the type I error rate.
P-value = P (X  x0  H0 is true)
For example, consider the cases where x0 is 28 and x0 is 24.
Type I error rate 
P(X26 p =.7)
P-value when x0 = 28
P ( X  28  p =.7)
P-value when x0 = 24
P(X  24  p =.7)
= 1 – P (X  25  p = .7)
= 1 – P (X  27  p =.7)
= 1 – P ( X  23  p =.7)
= 1 - .9698
= 1- .9979
= 1 - .8405
=.0302
=.0021
= .1595
Notice
If x0 is in the rejection region the p-value   .
If x0 is not in the rejection region then the p-value is >  .
Thus it is clear that we can conduct our test at  = .03 without using a rejection
region. We just have to calculate the P-value and use the following rule.
If the P-value   then reject H0 .
If the P-value >  then do not reject H0.
(17)
Summary: Hypothesis Testing
Concept                         Left-Tailed Test                     Right-Tailed Test
Hypotheses                      H0: p ≥ p0, Ha: p < p0               H0: p ≤ p0, Ha: p > p0
Critical Region                 Reject H0 if X ≤ c                   Reject H0 if X ≥ c
Type I Error Rate α
  = P(Reject H0 | H0 true)      P(X ≤ c | p = p0)                    P(X ≥ c | p = p0)
Type II Error Rate β(p1)
  = P(Accept H0 | p = p1 in Ha) P(X > c | p = p1)                    P(X < c | p = p1)
Power K(p1)
  = P(Reject H0 | p = p1 in Ha) P(X ≤ c | p = p1), or 1 − β(p1)      P(X ≥ c | p = p1), or 1 − β(p1)
P-Value                         P(X ≤ x0 | p = p0)                   P(X ≥ x0 | p = p0)
P-value Decision Rule           Reject H0 if the P-value ≤ α
Note: A similar theory also applies to a two-tailed test, i.e., a test of
H0: p = p0, Ha: p ≠ p0
While we will conduct such tests in our applications, we will not discuss the theory here.
An analogy for statistical hypotheses
In practice we use α = .01 or α = .05. Thus to reject H0 we need strong evidence. In our judicial system, we use the phrase "innocent until proven guilty beyond a reasonable doubt." We may define the null and alternative hypotheses as follows:
H0: the defendant is innocent
Ha: the defendant is guilty.
To prove the defendant is guilty we need strong evidence.
CUMULATIVE BINOMIAL PROBABILITIES: P(X ≤ x)

n = 15
 x    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
 0  .2059  .0352  .0047  .0005  .0000  .0000  .0000  .0000  .0000
 1  .5490  .1671  .0353  .0052  .0005  .0000  .0000  .0000  .0000
 2  .8159  .3980  .1268  .0271  .0037  .0003  .0000  .0000  .0000
 3  .9444  .6482  .2969  .0905  .0176  .0019  .0001  .0000  .0000
 4  .9873  .8358  .5155  .2173  .0592  .0093  .0007  .0000  .0000
 5  .9978  .9389  .7216  .4032  .1509  .0338  .0037  .0001  .0000
 6  .9997  .9819  .8689  .6098  .3036  .0950  .0152  .0008  .0000
 7  1.000  .9958  .9500  .7869  .5000  .2131  .0500  .0042  .0000
 8  1.000  .9992  .9848  .9050  .6964  .3902  .1311  .0181  .0003
 9  1.000  .9999  .9963  .9662  .8491  .5968  .2784  .0611  .0022
10  1.000  1.000  .9993  .9907  .9408  .7827  .4845  .1642  .0127
11  1.000  1.000  .9999  .9981  .9824  .9095  .7031  .3518  .0556
12  1.000  1.000  1.000  .9997  .9963  .9729  .8732  .6020  .1841
13  1.000  1.000  1.000  1.000  .9995  .9948  .9647  .8329  .4510
14  1.000  1.000  1.000  1.000  1.000  .9995  .9953  .9648  .7941
15  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

n = 20
 x    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
 0  .1216  .0115  .0008  .0000  .0000  .0000  .0000  .0000  .0000
 1  .3917  .0692  .0076  .0005  .0000  .0000  .0000  .0000  .0000
 2  .6769  .2061  .0355  .0036  .0002  .0000  .0000  .0000  .0000
 3  .8670  .4114  .1071  .0160  .0013  .0000  .0000  .0000  .0000
 4  .9568  .6296  .2375  .0510  .0059  .0003  .0000  .0000  .0000
 5  .9887  .8042  .4164  .1256  .0207  .0016  .0000  .0000  .0000
 6  .9976  .9133  .6080  .2500  .0577  .0065  .0003  .0000  .0000
 7  .9996  .9679  .7723  .4159  .1316  .0210  .0013  .0000  .0000
 8  .9999  .9900  .8867  .5956  .2517  .0565  .0051  .0001  .0000
 9  1.000  .9974  .9520  .7553  .4119  .1275  .0171  .0006  .0000
10  1.000  .9994  .9829  .8725  .5881  .2447  .0480  .0026  .0000
11  1.000  .9999  .9949  .9435  .7483  .4044  .1133  .0100  .0001
12  1.000  1.000  .9987  .9790  .8684  .5841  .2277  .0321  .0004
13  1.000  1.000  .9997  .9935  .9423  .7500  .3920  .0867  .0024
14  1.000  1.000  1.000  .9984  .9793  .8744  .5836  .1958  .0113
15  1.000  1.000  1.000  .9997  .9941  .9490  .7625  .3704  .0432
16  1.000  1.000  1.000  1.000  .9987  .9840  .8929  .5886  .1330
17  1.000  1.000  1.000  1.000  .9998  .9964  .9645  .7939  .3231
18  1.000  1.000  1.000  1.000  1.000  .9995  .9924  .9308  .6083
19  1.000  1.000  1.000  1.000  1.000  1.000  .9992  .9885  .8784
20  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

n = 30
 x    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
 0  .0424  .0012  .0000  .0000  .0000  .0000  .0000  .0000  .0000
 1  .1837  .0105  .0003  .0000  .0000  .0000  .0000  .0000  .0000
 2  .4114  .0442  .0021  .0000  .0000  .0000  .0000  .0000  .0000
 3  .6474  .1227  .0093  .0003  .0000  .0000  .0000  .0000  .0000
 4  .8245  .2552  .0302  .0015  .0000  .0000  .0000  .0000  .0000
 5  .9628  .4275  .0766  .0057  .0002  .0000  .0000  .0000  .0000
 6  .9742  .6070  .1595  .0172  .0007  .0000  .0000  .0000  .0000
 7  .9922  .7608  .2814  .0435  .0026  .0000  .0000  .0000  .0000
 8  .9980  .8713  .4315  .0940  .0081  .0002  .0000  .0000  .0000
 9  .9995  .9389  .5888  .1763  .0214  .0009  .0000  .0000  .0000
10  .9999  .9744  .7304  .2915  .0494  .0029  .0000  .0000  .0000
11  1.000  .9905  .8407  .4311  .1002  .0083  .0002  .0000  .0000
12  1.000  .9969  .9155  .5785  .1808  .0212  .0006  .0000  .0000
13  1.000  .9991  .9599  .7145  .2923  .0481  .0021  .0000  .0000
14  1.000  .9998  .9831  .8246  .4278  .0971  .0064  .0001  .0000
15  1.000  .9999  .9936  .9029  .5722  .1754  .0169  .0002  .0000
16  1.000  1.000  .9979  .9519  .7077  .2855  .0401  .0009  .0000
17  1.000  1.000  .9994  .9788  .8192  .4215  .0845  .0031  .0000
18  1.000  1.000  .9998  .9917  .8998  .5689  .1593  .0095  .0000
19  1.000  1.000  1.000  .9971  .9506  .7085  .2696  .0256  .0001
20  1.000  1.000  1.000  .9991  .9786  .8237  .4112  .0611  .0005
21  1.000  1.000  1.000  .9998  .9919  .9060  .5685  .1287  .0020
22  1.000  1.000  1.000  1.000  .9974  .9565  .7186  .2392  .0078
23  1.000  1.000  1.000  1.000  .9993  .9828  .8405  .3930  .0258
24  1.000  1.000  1.000  1.000  .9998  .9943  .9234  .5725  .0732
25  1.000  1.000  1.000  1.000  1.000  .9985  .9698  .7448  .1755
26  1.000  1.000  1.000  1.000  1.000  .9997  .9907  .8773  .3526
27  1.000  1.000  1.000  1.000  1.000  1.000  .9979  .9558  .5886
28  1.000  1.000  1.000  1.000  1.000  1.000  .9997  .9895  .8163
z Test for a Population Mean
To test the hypothesis H0 :  = 0 based on an SRS of size n from a population with
unknown mean  and known standard deviation  , compute the test statistic
_
z =
( x  0 )
/ n
In terms of a standard normal random variable Z, the P-value for a test of H0
against
Ha :  > 0 is P(Z  z)
Ha :  < 0 is P(Z  z)
Ha :   0 is 2P(Z  z)
These P-values are exact if the population distribution is normal and are
approximately correct for large n in other cases (Page 445-Text Book).
Confidence Intervals and Two-Sided Tests: A level  two-sided significance test
rejects a hypothesis H0: =0 exactly when the value 0 falls outside a level 1-
confidence interval for .
(20)
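A sketch of this computation in Python (scipy.stats assumed available), shown with the SSHA numbers from the problem below; the z value of about 3.01 already appears in the worksheet, so only the tail probability is new here:

from math import sqrt
from scipy.stats import norm

def z_test(xbar, mu0, sigma, n, alternative="greater"):
    # One-sample z statistic and P-value for H0: mu = mu0 (sigma known).
    z = (xbar - mu0) / (sigma / sqrt(n))
    if alternative == "greater":
        p = norm.sf(z)              # P(Z >= z)
    elif alternative == "less":
        p = norm.cdf(z)             # P(Z <= z)
    else:
        p = 2 * norm.sf(abs(z))     # two-sided
    return z, p

# Problem 6.45: n = 20, x-bar = 135.2, sigma = 30, H0: mu = 115 vs Ha: mu > 115.
z, p = z_test(135.2, 115, 30, 20, "greater")
print(round(z, 2), round(p, 4))     # z about 3.01, P-value about 0.0013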
To illustrate the test we consider the following problem.
Problem 6.45 (Page 455): The Survey of Study Habits and Attitudes (SSHA) is a
psychological test that measures the motivation, attitude toward school, and study
habits of students. Scores range from 0 to 200. The mean score for U.S. college
students is about 115, and the standard deviation is about 30. A teacher who
suspects that older students have better attitudes toward school gives the SSHA to
20 students who are at least 30 years of age. Their mean score is x̄ = 135.2.
(a) Assuming that σ = 30 for the population of older students, carry out a test of
H0: μ = 115, Ha: μ > 115.
Report the P-value of your test, and state your conclusion clearly.
(b) Your test in (a) required two important assumptions in addition to the assumption that the value of σ is known. What are they? Which of these assumptions is most important to the validity of your conclusion in (a)?
Solution: Given: n = , x̄ = , σ = . Assume α = .05.
Ha: μ > 115; therefore this is a right-sided test.
(a) (i)
(ii) The test statistic in this case is z =
(iii) p-value = P(Z ≥ 3.01) =
(iv) Decision:
(v) Concluding Sentence:
(b) Assumptions: (i)
(ii)
Example 6.16; Page 450: Bottles of a popular cola drink are supposed to contain 300
milliliters (ml) of cola. There is some variation from bottle to bottle because the
filling machinery is not perfectly precise. The distribution of the contents is normal
with standard deviation  = 3 ml. A student who suspects that the bottler is
underfilling measures the contents of six bottles. The results are
299.4 297.7 301.0 298.9 300.2 297.0
Is this convincing evidence that the mean contents of cola bottles is less than the
advertised 300 ml? The hypotheses are
H0 :  = 300,
Ha :  < 300
(22)
Problem 6.46; Page 456: The mean yield of corn in the United States is about 120 bushels per acre. A survey of 40 farmers this year gives a sample mean yield of x̄ = 123.8 bushels per acre. We want to know whether this is good evidence that the national mean this year is not 120 bushels per acre. Assume that the farmers surveyed are an srs from the population of all commercial growers and that the standard deviation of the yield in this population is σ = 10 bushels per acre. Give the P-value for the test of
H0: μ = 120, Ha: μ ≠ 120.
Are you convinced that the population mean is not 120 bushels per acre? Is your conclusion correct if the distribution of corn yields is somewhat non-normal? Why?
Problem 6.87; Page 481: You have an srs of size n = 16 from a normal distribution with σ = 1. You wish to test
H0: μ = 0, Ha: μ > 0.
You decide to reject H0 if x̄ > 0 and to accept H0 otherwise.
(a) Find the probability of a Type I error, that is, the probability that your test rejects H0 when in fact μ = 0.
(b) Find the probability of a Type II error when μ = 0.2. This is the probability that your test accepts H0 when in fact μ = 0.2.
(c) Find the probability of a Type II error when μ = 0.6.
Problem 6.84; Page 480: Example 6.16 discusses a test about the mean contents of
cola bottles. The hypotheses are
H0 :  = 300
Ha :  < 300 .
The sample size is n =6, and the population is assumed to have a normal distribution
with  = 3. A 5% significance test rejects H0 if z  -1.645, where the test statistic z is
z=
x  300
.
3
6
Power calculations help us see how large a shortfall in the bottle contents the test
can be expected to detect.
(a) Find the power of this test against the alternative  = 298.
(b) Find the power of this test against the alternative  = 294.
(c) Is the power against  = 296 higher or lower than the value you
found in (b) ? Explain why this result makes sense?
(25)
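One way to approach the power calculation: the test rejects when x̄ ≤ 300 − 1.645·(3/√6), and the power against an alternative mean is the probability of that event when X̄ is centred at the alternative. A sketch under those assumptions (Python, scipy.stats assumed):

from math import sqrt
from scipy.stats import norm

mu0, sigma, n, z_crit = 300, 3, 6, -1.645
se = sigma / sqrt(n)                      # about 1.2247
xbar_cutoff = mu0 + z_crit * se           # reject H0 when x-bar <= this value

def power(mu_alt):
    # P(reject H0 | true mean = mu_alt)
    return norm.cdf((xbar_cutoff - mu_alt) / se)

print(round(power(298), 3))   # part (a), roughly 0.50
print(round(power(294), 3))   # part (b), close to 1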
The t-distribution
The t-distribution depends on a single parameter, called its degrees of freedom (df). If sampling is done from a normal distribution whose mean is μ and standard deviation is σ, then
Z = (X̄ − μ) / (σ/√n)
follows the standard normal distribution. Since σ in practice is usually unknown, we replace it by its estimate s. The random variable
T = (X̄ − μ) / (S/√n)
follows the t-distribution with n − 1 degrees of freedom.
Sketch of the t-distribution: In comparison with the standard normal distribution, the t-distribution has more area in the tails, while the standard normal distribution has more area in the middle. The t-curve approaches the Z-curve as the df becomes large.
T-Interval: Confidence Interval for the Mean μ of a Normal Population (σ unknown)
If a random sample X1, X2, ..., Xn is chosen from a normal distribution, then the 100(1−α)% confidence interval for μ is
X̄ ± t_{α/2} SE
where:
df for t is n − 1,
SE = s/√n = standard error of X̄ (the estimated sd of X̄),
X̄ = , s² = , s =
Margin of Error: E = t_{α/2} SE = t_{α/2} s/√n
Level of Confidence (Reliability): 100(1−α)%
Notes: 1. For all n, t_{α/2} > z_{α/2}.
2. For df = ∞, t_{α/2} = z_{α/2}; these are the entries at the bottom of the t-table.
3. For large n (n > 30), the normality assumption may be ignored because of the Central Limit Theorem.
4. The estimate of μ, X̄, is the midpoint of the CI and the margin of error is one half the width of the CI. Thus, if the interval runs from L to U,
X̄ = (L + U)/2 and E = (U − L)/2
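A sketch of the t-interval computation in Python (scipy.stats assumed available); the example call uses the birth-weight summary from the example that follows:

from math import sqrt
from scipy.stats import t

def t_interval(xbar, s, n, conf=0.95):
    # X-bar +/- t_{alpha/2, n-1} * s/sqrt(n)
    se = s / sqrt(n)
    tcrit = t.ppf(1 - (1 - conf) / 2, df=n - 1)
    e = tcrit * se                      # margin of error
    return xbar - e, xbar + e, e

# Birth-weight example: n = 100, x-bar = 3.21, s = 0.71, 90% confidence.
lo, hi, e = t_interval(3.21, 0.71, 100, conf=0.90)
print(round(lo, 3), round(hi, 3), round(e, 3))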
Example: In a health study the birth weights of a random sample of 100 newborns from mothers with a low socioeconomic status in a large US city were recorded. The
sample yielded a mean of 3.21 kg with a standard deviation of 0.71 kg.
(a) Find a 90% confidence interval for the true mean birth weight of newborns
from mothers with a low socioeconomic status.
(b) Interpret the confidence interval.
Solution: Here we wish to estimate
μ = mean birth weight of all newborns from mothers with a low socioeconomic status in this US city.
Given: n = , x̄ = [estimate of μ], s = [estimate of σ]
Since n > 30, it is not necessary that the population be normal (due to the CLT).
For a 90% CI, t_{α/2} = , df = n − 1 = 99
x̄ ± t_{α/2} s/√n = = or,
(c) x̄ = _________ estimates the true population mean μ with margin of error E = ____________ and level of confidence (Reliability) ____________.
The level of confidence gives the proportion of intervals found this way that would cover μ.
Note: The interpretation of a confidence interval as given in the example above
is the popular interpretation often heard on television or reported in
newspapers. A mathematically precise interpretation of the confidence interval
for this example would be “ Prior to sampling there was a .90 probability that
the confidence interval to be formed would contain the true population mean μ."
Example: For the data in the example above, find a 95% confidence interval for
the true mean birth weight of newborns from mothers with a low socioeconomic
status.
Solution: Recall, n = 100, x̄ = 3.21, s = 0.71.
For a 95% CI, t_{α/2} = , df = n − 1 = 99
x̄ ± t_{α/2} s/√n = = or,
Interpretation:
x̄ = _________ estimates the true population mean μ with margin of error E = ____________ and level of confidence (Reliability) ____________.
Example: For the data in the example above, find a 99% confidence interval for the
true mean birth weight of newborns from mothers with a low socioeconomic status.
Solution: Recall, n = 100, x̄ = 3.21, s = 0.71.
For a 99% CI, t_{α/2} = , df = n − 1 = 99
x̄ ± t_{α/2} s/√n = = or,
Interpretation:
x̄ = _________ estimates the true population mean μ with margin of error E = ____________ and level of confidence (Reliability) ____________.
Question: Considering these three examples, if the level of confidence is increased and all other things remain the same, the width of the confidence interval will _______________.
Example: A study was conducted to determine the effect of acid rain on the lake
water in an industrial region of the country. The data below gives the pH levels
from a random sample of 10 lakes from this region. ( It was assumed that the
sample came from a normal distribution). Minitab was used to find a 95%
confidence interval for the mean pH level for all lakes in this region.
C1: 6.6 7.1 7.3 6.7 6.8 6.2 6.5 5.9 6.9 6.3

MTB > tint 95 c1

One-Sample T: C1

Variable   N    Mean   StDev   SE Mean        95.0% CI
C1        10   6.630   0.424     0.134  (6.326, 6.934)
From the Minitab output answer the following:
(a) What is the 95% confidence interval for μ?
(b) What is the estimate of μ and the estimated standard deviation of this estimate?
(c) What is the margin of error E and level of confidence (reliability) for the estimate of μ?
The One-Sample t Test
Suppose that an SRS of size n is drawn from a population having unknown mean μ. To test the hypothesis H0: μ = μ0 based on an SRS of size n, compute the one-sample t statistic
t = (x̄ − μ0) / (s/√n)
In terms of a random variable T having the t(n−1) distribution, the P-value for a test of H0 against
Ha: μ > μ0 is P(T ≥ t)
Ha: μ < μ0 is P(T ≤ t)
Ha: μ ≠ μ0 is 2P(T ≥ |t|)
These P-values are exact if the population distribution is normal and are approximately correct for large n in other cases (Page 496, textbook).
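A sketch of the one-sample t test from summary statistics (Python, scipy.stats assumed available); the example call uses the IQ numbers from the example that follows:

from math import sqrt
from scipy.stats import t

def one_sample_t(xbar, mu0, s, n, alternative="greater"):
    # t statistic and P-value for H0: mu = mu0.
    tstat = (xbar - mu0) / (s / sqrt(n))
    df = n - 1
    if alternative == "greater":
        p = t.sf(tstat, df)
    elif alternative == "less":
        p = t.cdf(tstat, df)
    else:
        p = 2 * t.sf(abs(tstat), df)
    return tstat, p

# IQ example: n = 120, x-bar = 103.21, s = 16.18, H0: mu = 100 vs Ha: mu > 100.
print(one_sample_t(103.21, 100, 16.18, 120, "greater"))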
To illustrate the test we consider the following example.
Example: A random sample of 120 high school graduates was given an IQ test. The sample mean IQ was 103.21 with a standard deviation of 16.18. Test at α = .10 whether there is sufficient evidence to conclude that the mean of the population from which the sample comes exceeds 100.
Solution: Given: n = 120, x̄ = 103.21, s = 16.18; α = .10
(i) Ha: μ > 100
(ii) t =
(iii) p-value =
(iv) Decision:
(v) Concluding Sentence:
Example: A psychological test, used to assess an individual’s ability to appraise
other people, was given to a random sample of 12 supervisors in a large corporation.
Their scores are given below.
64 97 73 71 68 74 60 78 60 74 73 75
Is there sufficient evidence at α = .05 to conclude that the mean score for the population of supervisors is below 75?
Solution: Given: n = , x̄ = , s = , α = .05
(i) Ha: μ < 75; therefore, this is a left-sided test.
(ii) t =
(iii) p-value =
(iv) Decision:
(v) Concluding Sentence:
Example: A manufacturing process is supposed to produce ball bearings for use in
industry with a diameter of 2cm. A random sample of 40 ball bearings was chosen
and their diameters were measured. The mean and standard deviation of this random sample are given below:
n =40 , x = 1.9991, s = .0089.
Test the hypothesis Ha :   2 at  = .05.
(i) Ha :   2; therefore, this is a two sided test.
(ii) t =
(iii)p-value =
(iv) Decision:
(v) Concluding Sentence:
(35)
PAIRED SAMPLE DESIGN
Example: A consumer group wishes to compare two brands of tire, brand A and
brand B with respect to tire wear.
- One of each brand of tire was randomly assigned to the rear wheels of 8 cars.
- The cars were then driven a specified number of miles
- The amount of wear on each tire was recorded.
Amount of wear

Car             1     2     3     4     5     6     7     8
A (xi)        9.4  11.8   9.1   8.3   9.0  10.6   9.0   9.1
B (yi)        9.3  13.0   9.7   8.8   9.0  10.4  10.0  10.1
di = xi − yi  0.1  −1.2  −0.6  −0.5   0.0   0.2  −1.0  −1.0
In this design the brands of tire were paired ( matched or blocked) according to car,
driver and route driven. Thus the differences ‘di’ in tire wear are most likely due to
differences in brand and not in these other variables.
Matched pairs t procedures
Suppose, for example, using our tire data we wish to test at α = .05 if "on average" brand B wears more than brand A. The underlying assumption for the t-test for matched pairs is that the di's form an srs from a normal distribution.

Car             1     2     3     4     5     6     7     8
di = xi − yi  0.1  −1.2  −0.6  −0.5   0.0   0.2  −1.0  −1.0
di²
(i) H0: μ = 0, Ha: μ < 0; α = .05.
(ii) t = (d̄ − 0) / (s/√n) =
(iii) p-value =
(iv) Decision:
(v) There is sufficient evidence at α = .05 to conclude that tire B wears more than tire A.
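A sketch of the matched-pairs calculation on the tire data (Python with NumPy and scipy.stats, assumed available); the differences are taken as brand A minus brand B, as in the table above:

import numpy as np
from scipy.stats import t

a = np.array([9.4, 11.8, 9.1, 8.3, 9.0, 10.6, 9.0, 9.1])    # brand A wear
b = np.array([9.3, 13.0, 9.7, 8.8, 9.0, 10.4, 10.0, 10.1])  # brand B wear
d = a - b                                                    # paired differences

n = len(d)
tstat = d.mean() / (d.std(ddof=1) / np.sqrt(n))   # (d-bar - 0) / (s_d / sqrt(n))
pvalue = t.cdf(tstat, df=n - 1)                   # left-sided test: Ha: mu_d < 0

print(round(d.mean(), 3), round(tstat, 3), round(pvalue, 4))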
PROBLEM 7.42 ; PAGE 522:The table below gives the pretest and posttest score on
the MLA listening test in Spanish for 20 high school Spanish teachers who attended
an intensive summer course in Spanish. The setting is identical to the one described
in Example 7.7.
Subject    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
Pretest   30  28  31  26  20  30  34  15  28  20  30  29  31  29  34  20  26  25  31  29
Posttest  29  30  32  30  16  25  31  18  33  25  32  28  34  32  32  27  28  29  32  32
(a) We hope to show that attending the institute improves listening skills. State an appropriate H0 and Ha. Be sure to identify the parameters appearing in the hypotheses.
(b) Make a graphical check for outliers or strong skewness in the data that you will use in your statistical test, and report your conclusions on the validity of the test.
(c) Carry out a test. Can you reject H0 at the 5% significance level? At the 1% significance level?
(d) Give a 90% confidence interval for the mean increase in listening score due to attending the summer institute.
MINITAB OUTPUT
—————
12/11/03 11:08:46 AM
————————————————————
Welcome to Minitab, press F1 for help.
MTB > Retrieve "D:\PCDataSets\MINITAB\Ch07\Ex07_042.mtp";
SUBC>
Portable.
Retrieving worksheet from file: D:\PCDataSets\MINITAB\Ch07\Ex07_042.mtp
# Worksheet was saved on Wed Apr 24 2002
Results for: Ex07_042.mtp
MTB > let c4= c3-c2
MTB > prin c1-c4
Data Display

Row  Student  Pretest  Posttest  Posttest-Pretest
  1        1       30        29                -1
  2        2       28        30                 2
  3        3       31        32                 1
  4        4       26        30                 4
  5        5       20        16                -4
  6        6       30        25                -5
  7        7       34        31                -3
  8        8       15        18                 3
  9        9       28        33                 5
 10       10       20        25                 5
 11       11       30        32                 2
 12       12       29        28                -1
 13       13       31        34                 3
 14       14       29        32                 3
 15       15       34        32                -2
 16       16       20        27                 7
 17       17       26        28                 2
 18       18       25        29                 4
 19       19       31        32                 1
 20       20       29        32                 3
MTB > gstd
* NOTE * Character graphs are obsolete.
* NOTE * Standard Graphics are enabled. Professional Graphics are disabled.
Use the GPRO command to enable Professional Graphics.
MTB > hist c4

Histogram of Posttest-Pretest   N = 20

Midpoint  Count
      -5      1  *
      -4      1  *
      -3      1  *
      -2      1  *
      -1      2  **
       0      0
       1      2  **
       2      3  ***
       3      4  ****
       4      2  **
       5      2  **
       6      0
       7      1  *
MTB > stem c4;
SUBC> trim.

Stem-and-Leaf Display: Posttest-Pretest

Stem-and-leaf of Posttest-Pretest   N = 20
Leaf Unit = 1.0

  2   -0  54
  4   -0  32
  6   -0  11
  8    0  11
 (7)   0  2223333
  5    0  4455
  1    0  7

MTB > boxp c4

[Character boxplot of Posttest-Pretest, plotted on an axis from about -5.0 to 7.5]
MTB > ttes 0 c4;
SUBC> alte 1.

One-Sample T: Posttest-Pretest

Test of mu = 0 vs mu > 0

Variable       N   Mean  StDev  SE Mean
Posttest-Pre  20  1.450  3.203    0.716

Variable      95.0% Lower Bound     T      P
Posttest-Pre             0.211   2.02  0.029

MTB > tint 90 c4

One-Sample T: Posttest-Pretest

Variable       N   Mean  StDev  SE Mean        90.0% CI
Posttest-Pre  20  1.450  3.203    0.716  (0.211, 2.689)
The Two-Sample t Significance Test
Suppose that an srs of size n1 is drawn from a normal population with unknown mean μ1 and that an independent srs of size n2 is drawn from another normal population with unknown mean μ2. To test H0: μ1 = μ2, compute the two-sample t statistic
t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
and use P-values or critical values for the t(k) distribution, where the degrees of freedom k are either approximated by software or are the smaller of n1 − 1 and n2 − 1. The P-values for a test of H0 against
(i) Ha: μ1 > μ2 is P(T ≥ t)
(ii) Ha: μ1 < μ2 is P(T ≤ t)
(iii) Ha: μ1 ≠ μ2 is 2P(T ≥ |t|)
A 100(1−α)% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} √(s1²/n1 + s2²/n2)

Software approximation for the degrees of freedom
For the two-sample t significance test, statistical software uses the following formula to obtain the degrees of freedom:
df = (s1²/n1 + s2²/n2)² / [ (1/(n1−1))(s1²/n1)² + (1/(n2−1))(s2²/n2)² ]
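A sketch of the two-sample t statistic and the software df approximation from summary statistics (Python, scipy.stats assumed available); the example call uses the birth-weight summaries from Problem 7.70 below:

from math import sqrt
from scipy.stats import t

def two_sample_t(x1, s1, n1, x2, s2, n2):
    # Unpooled (Welch) two-sample t statistic and software df approximation.
    v1, v2 = s1**2 / n1, s2**2 / n2
    tstat = (x1 - x2) / sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return tstat, df

# Problem 7.70: positive-test group vs "other" group, Ha: mu1 < mu2.
tstat, df = two_sample_t(2733, 599, 134, 3118, 672, 5974)
print(round(tstat, 2), round(df, 1), t.cdf(tstat, df))   # left-sided P-value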
Problem 7.70; Page 548: Does cocaine use by pregnant women cause their babies to
have low birth weight? To study this question, birth weights of babies of women
who tested positive for cocaine/crack during a drug-screening test were compared
with the birth weights for women who either tested negative or were not tested, a
group we call “other”. Here are the summary statistics. The birth weights are
measured in grams.
Group            n     x̄    s
Positive test  134  2733  599
Other         5974  3118  672
(a) Formulate appropriate hypotheses and carry out the test of significance for
these data.
(b) Give a 95% confidence interval for the mean difference in birth weights.
(c) Discuss the limitations of the study design. What do you believe can be
concluded from this study?
(a) We assume α = 0.05. Let μ1 be the mean of the Positive group and μ2 be the mean of the Other group.
(i) H0: μ1 = μ2; Ha: μ1 < μ2.
(ii) t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
(b) The 95% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} √(s1²/n1 + s2²/n2) =
(c)
Example 7.14; Page 530: An educator believes that new reading activities in the
classroom will help elementary school pupils improve some aspects of their reading
ability. She arranges for a third-grade class of 21 students to take part in these
activities for an eight week period. A control classroom of 23 third-graders follows
the same curriculum without the activities. At the end of the eight weeks, all
students are given a Degree of Reading Power (DRP) test, which measures the
aspects of reading ability that the treatment is designed to improve. The data appear
in Table 7.3
TABLE 7.3 DRP scores for third-graders
Treatment Group
24 61 59 46 43 44 52 43
58 67 62 57 71 49 54 43
53 57 49 56 33
Control Group
42 33 46 37 43 41 10 42
55 19 17 55 26 54 60 28
62 20 53 48 37 85 42
The summary statistics are

Group       n     x̄      s
Treatment  21  51.48  11.01
Control    23  41.52  17.15

Because we hope to show that the treatment (Group 1) is better than the control (Group 2), the hypotheses are
(i) H0: μ1 = μ2; Ha: μ1 > μ2. Assume α = 0.05.
(ii) t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
MINITAB OUTPUT
————— 12/16/03 10:27:45 AM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > twos c1 c2;
SUBC> alte 1.

Two-Sample T-Test and CI: Treatment, Control

Two-sample T for Treatment vs Control

           N  Mean  StDev  SE Mean
Treatmen  21  51.5   11.0      2.4
Control   23  41.5   17.1      3.6

Difference = mu Treatment - mu Control
Estimate for difference: 9.95
95% lower bound for difference: 2.69
T-Test of difference = 0 (vs >): T-Value = 2.31  P-Value = 0.013  DF = 37

MTB > twos 95 c1 c2

Two-Sample T-Test and CI: Treatment, Control

Two-sample T for Treatment vs Control

           N  Mean  StDev  SE Mean
Treatmen  21  51.5   11.0      2.4
Control   23  41.5   17.1      3.6

Difference = mu Treatment - mu Control
Estimate for difference: 9.95
95% CI for difference: (1.23, 18.68)
T-Test of difference = 0 (vs not =): T-Value = 2.31  P-Value = 0.027  DF = 37

MTB > twos 90 c1 c2

Two-Sample T-Test and CI: Treatment, Control

Two-sample T for Treatment vs Control

           N  Mean  StDev  SE Mean
Treatmen  21  51.5   11.0      2.4
Control   23  41.5   17.1      3.6

Difference = mu Treatment - mu Control
Estimate for difference: 9.95
90% CI for difference: (2.69, 17.22)
T-Test of difference = 0 (vs not =): T-Value = 2.31  P-Value = 0.027  DF = 37
Example 7.16; Page 533: The Chapin Insight Test is a psychological test designed to
measure how accurately the subject appraises other people. The possible scores on the
test range from 0 to 41. During the development of the Chapin test, it was given to
several different groups of people. Here are the results for male and female college
students majoring in the liberal arts:
Group  Sex      n     x̄     s
1      Male    133  25.34  5.05
2      Female  162  24.94  5.44

Do these data support the contention that female and male students differ in average social insight?
We assume α = 0.05.
(i) H0: μ1 = μ2; Ha: μ1 ≠ μ2.
(ii) t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
The Pooled Two-Sample t Procedures
Suppose that an srs of size n1 is drawn from a normal population with unknown mean μ1 and that an independent srs of size n2 is drawn from another normal population with unknown mean μ2. Suppose also that the two populations have the same standard deviation. A 100(1−α)% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} s_p √(1/n1 + 1/n2)
The degrees of freedom for the t density curve is n1 + n2 − 2, and the pooled variance is
s_p² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
To test the hypothesis H0: μ1 = μ2, compute the two-sample t statistic
t = (x̄1 − x̄2) / (s_p √(1/n1 + 1/n2))
In terms of a random variable T having the t(n1 + n2 − 2) distribution, the P-value for a test of H0 against
(i) Ha: μ1 > μ2 is P(T ≥ t)
(ii) Ha: μ1 < μ2 is P(T ≤ t)
(iii) Ha: μ1 ≠ μ2 is 2P(T ≥ |t|)
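A sketch of the pooled procedure from summary statistics (Python, scipy.stats assumed available); the example call uses the alloy numbers from the example that follows:

from math import sqrt
from scipy.stats import t

def pooled_t(x1, s1sq, n1, x2, s2sq, n2):
    # Pooled two-sample t statistic and its degrees of freedom.
    sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
    se = sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    return (x1 - x2) / se, n1 + n2 - 2

# Alloy example: n1 = 15, x1-bar = 83.40, s1^2 = 7.4; n2 = 20, x2-bar = 79.85, s2^2 = 7.187.
tstat, df = pooled_t(83.40, 7.4, 15, 79.85, 7.187, 20)
print(round(tstat, 2), df, round(t.sf(tstat, df), 4))   # one-sided P-value for Ha: mu1 > mu2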
Example: In an experiment to compare the breaking strengths of two alloys “X”
and “Y”, random samples of beams of each alloy were chosen and a strength test
applied. The data is listed below.
Alloy X: 81 87 81 85 83 83 83 86 88 78 83 80 83 86 84
Alloy Y: 76 81 81 80 85 84 79 82 74 82 79 78 81 81 80 79 78 76 79 82
Is there sufficient evidence that on average the breaking strength of alloy X exceeds
that of alloy Y? Assume the samples come from normal distributions with equal variances (assume α = 0.05).
n1 = 15, x̄1 = 83.40, s1² = 7.4
n2 = 20, x̄2 = 79.85, s2² = 7.187
s_p² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2) =
s_p √(1/n1 + 1/n2) =
(i) H0: μ1 = μ2, Ha: μ1 > μ2 (α = 0.05)
(ii) t = (x̄1 − x̄2) / (s_p √(1/n1 + 1/n2)) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
Note: We obtain a 99% confidence interval for μ1 − μ2.
df = 33 ≈ 30. t_{α/2} = t_{0.005} =
Therefore, the 99% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) ± t_{α/2} s_p √(1/n1 + 1/n2)
Minitab Output
————— 12/16/03 10:38:27 AM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > twos c1 c2;
SUBC> alte 1;
SUBC> pooled.

Two-Sample T-Test and CI: Alloy X, Alloy Y

Two-sample T for Alloy X vs Alloy Y

          N   Mean  StDev  SE Mean
Alloy X  15  83.40   2.72     0.70
Alloy Y  20  79.85   2.68     0.60

Difference = mu Alloy X - mu Alloy Y
Estimate for difference: 3.550
95% lower bound for difference: 1.991
T-Test of difference = 0 (vs >): T-Value = 3.85  P-Value = 0.000  DF = 33
Both use Pooled StDev = 2.70

MTB > twos 99 c1 c2;
SUBC> pooled.

Two-Sample T-Test and CI: Alloy X, Alloy Y

Two-sample T for Alloy X vs Alloy Y

          N   Mean  StDev  SE Mean
Alloy X  15  83.40   2.72     0.70
Alloy Y  20  79.85   2.68     0.60

Difference = mu Alloy X - mu Alloy Y
Estimate for difference: 3.550
99% CI for difference: (1.032, 6.068)
T-Test of difference = 0 (vs not =): T-Value = 3.85  P-Value = 0.001  DF = 33
Both use Pooled StDev = 2.70
Example: A cigarette manufacturer claims that “on average” his cigarettes (
brand A) have lower tar content than those of his nearest competitor (brand B) .
An experiment comparing the tar content of two brands (in mg) yielded the
following results.
                 Brand A   Brand B
Sample Size           20        20
Sample Mean         9.57     10.16
Sample St. Dev.     1.01      1.11
Is there sufficient evidence at  = 0.05 to support the manufacturer’s claim?

Brand A :
Brand B:
x1 
n1 =
x2 
n2 =
(n1  1) s12  (n2  1) s 22
=
s 
n1  n2  2
2
p
sp
1
1
=

n1 n2
(i) H0 : 1 = 2, Ha: 1 < 2 (  =0.05)
(iv)
x1  x 2
t=
(v)
1
1

n1 n2
P-value =
(vi)
Decision:
sp
(v) Concluding Sentence:
(52)
s12 
s 22 
Note: For the above problem we now test the hypothesis that the two brands have different tar content.
(i) H0: μ1 = μ2, Ha: μ1 ≠ μ2 (α = 0.05)
(ii) t = (x̄1 − x̄2) / (s_p √(1/n1 + 1/n2)) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
NORMAL APPROXIMATION FOR COUNTS AND PROPORTIONS
An srs of size n is drawn from a population having population proportion p of successes. Let X be the number of successes in the sample and p̂ = X/n be the sample proportion of successes. If n is large, then
(i) X is approximately N(np, √(np(1 − p))).
(ii) p̂ is approximately N(p, √(p(1 − p)/n)). [This result is on Page 376 in the textbook.]

100(1−α)% CONFIDENCE INTERVAL FOR p
For large n, p̂ is approximately N(p, √(p(1 − p)/n)). Therefore,
P( −z_{α/2} < (p̂ − p)/√(p(1 − p)/n) < z_{α/2} ) = 1 − α
A simple calculation shows that the above equation is equivalent to the following:
P( p̂ − z_{α/2} √(p(1 − p)/n) < p < p̂ + z_{α/2} √(p(1 − p)/n) ) = 1 − α
The standard deviation of p̂ is given by
σ_p̂ = √(p(1 − p)/n)
Since p in practice is unknown, we replace it by its estimate p̂ and define the standard error of the sample proportion as follows:
SE_p̂ = √(p̂(1 − p̂)/n)
An approximate 100(1-)% confidence interval for p is given by
p̂  z/2 SE p̂
This is the traditional confidence interval of p. Unfortunately, modern computer
studies reveal that confidence intervals based on this approach can be quite
inaccurate, even for large samples. Therefore, we will use a simple adjustment that
works very well in practice. An estimate of p is defined by
~p = X  2 .
n4
We call it the Wilson estimate. This estimate was first suggested by Edwin Bidwell
Wilson in 1927. It can be shown that the distribution of ~p is close to the normal
p(1  p)
. To get a confidence
n4
interval, we estimate p by ~p in this standard deviation to get the standard error of
~p . Here is the final result.
distribution with mean p and standard deviation
An approximate 100(1-)% confidence interval for p is
~p  z/2 SE ~p
X 2
where, ~p =
, and
n4
SE ~p =
~
p (1  ~
p)
n4
The margin of error is m= z/2 SE ~p . In practice we will use this confidence interval
when the sample size is at least n =5 and the confidence level is 90%, 95% or 99%.
(55)
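A sketch of the Wilson ("plus four") interval in Python (scipy.stats assumed available); the example call uses the binge-drinking counts from Example 8.1 below:

from math import sqrt
from scipy.stats import norm

def wilson_interval(x, n, conf=0.95):
    # Plus-four (Wilson) estimate and confidence interval for a proportion.
    p_tilde = (x + 2) / (n + 4)
    se = sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    z = norm.ppf(1 - (1 - conf) / 2)
    m = z * se                                   # margin of error
    return p_tilde, p_tilde - m, p_tilde + m

# Example 8.1: 3314 frequent binge drinkers out of 17096 students.
print(wilson_interval(3314, 17096))              # estimate about 0.194, margin about 0.006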
Example 8.1; Page 574: Alcohol abuse has been described by college presidents as
the number one problem on campus and it is an important cause of death in young
adults. How common is it? A survey of 17,096 students in U.S. four-year colleges
collected information on drinking behaviour and alcohol related problems. The
researchers defined “frequent binge drinking” as having five or more drinks in a
row three or more times in the past two weeks. According to this definition, 3314
students were classified as frequent binge drinkers. The Wilson estimate of the proportion of drinkers is
p̃ = (3314 + 2)/(17096 + 4) ≈ 0.194
SE_p̃ = √(p̃(1 − p̃)/(n + 4)) =
Therefore, the 95% confidence interval for p is
p̃ ± z_{α/2} SE_p̃ =
Interpretation: p̃ ≈ 0.194 estimates the true proportion p with margin of error 0.006 and level of confidence (reliability) 95%.
Sample Size for Desired Margin of Error
The 100(1−α)% confidence interval for p is p̃ ± z_{α/2} SE_p̃. Therefore, the margin of error is
m = z_{α/2} SE_p̃ = z_{α/2} √(p̃(1 − p̃)/(n + 4)).
A simple calculation gives
n + 4 = (z_{α/2}/m)² p̃(1 − p̃).
The value of p̃ is not known until we gather the data. Therefore, we must guess a value to use in the calculations. We call the guessed value p*. The sample size formula is then given as follows:
n + 4 = (z_{α/2}/m)² p*(1 − p*)
If a guessed value is not available, then we may use p* = 1/2. The margin of error will be less than or equal to m if p* is chosen to be 0.5. The sample size formula in this case is given by
n + 4 = (z_{α/2}/(2m))².
Problem 8.25, Page 586: Land’s Beginning is a company that sells its merchandise
through the mail. It is considering buying a list of addresses from a magazine. The
magazine claims that at least 25% of its subscribers have high incomes ( they define
this to be household income in excess of $100,000). Land’s Beginning would like to
estimate the proportion of high-income people on the list. Checking income is very
difficult and expensive but another company offers this service. Land’s Beginning
will pay to find incomes for an srs of people on the magazine’s list. They would like
the margin of error of the 95% confidence interval for the proportion to be 0.05 or
less. Use the guessed value p* = 0.25 to find the required sample size.
Large-Sample Significance test for a Population Proportion
Draw an SRS of size n from a large population with unknown proportion p of successes. To test the hypothesis H0: p = p0, compute the z statistic
z = (p̂ − p0) / √(p0(1 − p0)/n)
In terms of a standard normal random variable Z, the approximate P-value for a test of H0 against
Ha: p > p0 is P(Z ≥ z)
Ha: p < p0 is P(Z ≤ z)
Ha: p ≠ p0 is 2P(Z ≥ |z|)
In practice we will use this test if np0 > 10 and n(1 − p0) > 10. This test is given on Page 575 in the textbook.
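A sketch of the large-sample z test for a proportion (Python, scipy.stats assumed available); the example call uses the residence example below (22 of 30 students, H0: p = .6 vs Ha: p > .6):

from math import sqrt
from scipy.stats import norm

def prop_z_test(x, n, p0, alternative="greater"):
    # z statistic and P-value for H0: p = p0.
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    if alternative == "greater":
        p = norm.sf(z)
    elif alternative == "less":
        p = norm.cdf(z)
    else:
        p = 2 * norm.sf(abs(z))
    return z, p

# Residence example: 22 of 30 students live on campus.
print(prop_z_test(22, 30, 0.6, "greater"))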
Problem 8.13; Page 584: In each of the following cases state whether or not the
normal approximation to the binomial should be used for a significance test on the
population proportion:
(a) n =10, H0: p =0.6
(b) n =100, H0: p =0.4
(c) n = 2000, H0: p =0.996
(d) n = 500, H0 ; p =0.25.

Example: In a random sample of 30 students from a university it was found that
22 lived in residence. Test at α = .05 if there is sufficient evidence to conclude
that over 60% of the student body lives on campus.
Here, Population = the student body at a university.
p = the true proportion of the student body that lives on campus.
(i) H0: p ≤ .6, Ha: p > .6
n = 30, p0 = .6; Check: np0 = , n(1 − p0) =
(ii) z = (p̂ − p0) / √(p0(1 − p0)/n) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
Example: The article "Statistical Evidence of Discrimination" (J. Amer. Stat. Assoc. (1982): 773-83) discussed the court case 'Swain vs. Alabama' (1965), in which it was alleged that there was discrimination against blacks in grand jury selection. Census data suggested that 25% of those eligible for grand jury selection were black, yet a random sample of 1050 called to appear for possible duty yielded only 177 blacks. Test at α = .01 if these data strongly support a case for discrimination.
Here, Population = all individuals in Alabama who are eligible for grand jury duty.
p = the true proportion of blacks who will be selected for grand jury duty from this population.
(i) H0: p , Ha: p
n = 1050, p0 = . ; Check: np0 = , n(1 − p0) =
(ii) z = (p̂ − p0) / √(p0(1 − p0)/n) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
MINITAB OUTPUT
————— 12/22/03 5:17:36 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > POne 1050 177;
SUBC> Test 0.25;
SUBC> Alternative -1;
SUBC> UseZ.

Test and CI for One Proportion

Test of p = 0.25 vs p < 0.25

Sample    X     N  Sample p  95.0% Upper Bound  Z-Value  P-Value
1       177  1050  0.168571           0.187575    -6.09    0.000

MTB > POne 1050 177;
SUBC> Test 0.25;
SUBC> UseZ.

Test and CI for One Proportion

Test of p = 0.25 vs p not = 0.25

Sample    X     N  Sample p              95.0% CI  Z-Value  P-Value
1       177  1050  0.168571  (0.145927, 0.191216)    -6.09    0.000

MTB > POne 1050 177;
SUBC> Confidence 99.0;
SUBC> Test 0.25;
SUBC> UseZ.

Test and CI for One Proportion

Test of p = 0.25 vs p not = 0.25

Sample    X     N  Sample p              99.0% CI  Z-Value  P-Value
1       177  1050  0.168571  (0.138812, 0.198331)    -6.09    0.000
Example : People often claim that the probability a newborn child will be a male
differs significantly from 0.50 . In a sample of 812 births recorded at Hotel Dieu
Hospital in Windsor, Ontario, 403 were found to be male. Test at α = .05 if there is
sufficient evidence to support this claim.
Here: Population = all infants born at Hotel Dieu Hospital in Windsor, Ontario.
p = the true probability a newborn will be a male.
(i) H0: p , Ha: p
n = 812, p0 = . ; Check: np0 = , n(1 − p0) =
(ii) z = (p̂ − p0) / √(p0(1 − p0)/n) =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
Confidence Intervals for Comparing Two Proportions
Choose an srs of size n1 from a large population having proportion p1 of successes and an independent srs of size n2 from another population having proportion p2 of successes. An approximate 100(1−α)% confidence interval for p1 − p2 is
(p̃1 − p̃2) ± z_{α/2} SE_D̃
where
p̃1 = (X1 + 1)/(n1 + 2), p̃2 = (X2 + 1)/(n2 + 2)
are the Wilson estimates of the population proportions and the standard error of the difference is
SE_D̃ = √( p̃1(1 − p̃1)/(n1 + 2) + p̃2(1 − p̃2)/(n2 + 2) )
The margin of error is
m = z_{α/2} SE_D̃
Use this method when both sample sizes are at least 10 and the confidence level is 90%, 95%, or 99%.
Example 8.8; Page 589: In the binge-drinking study, data were also summarized by
gender
Population     n      X
1 (men)      7180   1630
2 (women)    9916   1684
Total       17096   3314

We obtain a 95% confidence interval for p1 − p2.
Wilson estimates: p̃1 ≈ 0.227, p̃2 ≈ 0.170
SE_D̃ = √( p̃1(1 − p̃1)/(n1 + 2) + p̃2(1 − p̃2)/(n2 + 2) )
     = √( (0.227)(0.773)/(7180 + 2) + (0.170)(0.830)/(9916 + 2) ) = 0.00622
The 95% confidence interval for p1 − p2 is
Significance Tests for Comparing Two Proportions
To test the hypothesis H0: p1 = p2, compute the z statistic
z = (p̂1 − p̂2) / SE_Dp
where the pooled standard error is
SE_Dp = √( p̂(1 − p̂)(1/n1 + 1/n2) )
and where
p̂ = (X1 + X2)/(n1 + n2)
In terms of a standard normal random variable Z, the P-value for a test of H0 against
Ha: p1 > p2 is P(Z ≥ z)
Ha: p1 < p2 is P(Z ≤ z)
Ha: p1 ≠ p2 is 2P(Z ≥ |z|)
This z test is based on the normal approximation to the binomial distribution. As a general rule, we will use it when the number of successes and the number of failures in each of the samples is at least 5.
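A sketch of the pooled two-proportion z test (Python, scipy.stats assumed available); the example call uses the Timolol counts from the example that follows:

from math import sqrt
from scipy.stats import norm

def two_prop_z(x1, n1, x2, n2):
    # Pooled z statistic for H0: p1 = p2.
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)                        # pooled estimate
    se = sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))   # pooled standard error
    return (p1_hat - p2_hat) / se

# Timolol example: 59 of 210 angina-free on the drug, 25 of 190 on the placebo.
z = two_prop_z(59, 210, 25, 190)
print(round(z, 2), round(norm.sf(z), 4))   # one-sided P-value for Ha: p1 > p2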
Example : In a medical study of the effectiveness of the drug Timolol in preventing
angina attacks, 400 patients were randomly allocated to receive a daily dosage of the
drug Timolol or a placebo for 28 weeks. The number of angina-free patients was
then recorded for each group.
                              Drug      Placebo
Sample Size                   210 (n1)  190 (n2)
No. of angina-free patients    59 (x1)   25 (x2)

Test at α = .01 if a significantly higher proportion of patients become angina free with the drug Timolol than with the placebo.
Let p1 = the true proportion of patients who become angina free with the drug Timolol.
p2 = the true proportion of patients who become angina free with the placebo.
(i) H0: p1 = p2, Ha: p1 > p2 (α = .01)
(ii) p̂1 = , p̂2 = , p̂ = (X1 + X2)/(n1 + n2) =
SE_Dp = √( p̂(1 − p̂)(1/n1 + 1/n2) ) =
z = (p̂1 − p̂2) / SE_Dp =
(iii) P-value =
(iv) Decision:
(v) Concluding Sentence:
MINITAB OUTPUT

————— 12/19/03 3:09:30 PM ————————————————————
Welcome to Minitab, press F1 for help.

MTB > PTwo 210 59 190 25;
SUBC>   Confidence 95;
SUBC>   Alternative 1;
SUBC>   Pooled.

Test and CI for Two Proportions

Sample    X    N   Sample p
1        59  210   0.280952
2        25  190   0.131579

Estimate for p(1) - p(2): 0.149373
95% lower bound for p(1) - p(2): 0.0843364
Test for p(1) - p(2) = 0 (vs > 0): Z = 3.66   P-Value = 0.000

MTB > PTwo 210 59 190 25;
SUBC>   Confidence 99.

Test and CI for Two Proportions

Sample    X    N   Sample p
1        59  210   0.280952
2        25  190   0.131579

Estimate for p(1) - p(2): 0.149373
99% CI for p(1) - p(2): (0.0475258, 0.251221)
Test for p(1) - p(2) = 0 (vs not = 0): Z = 3.78   P-Value = 0.000

MTB > PTwo 210 59 190 25;
SUBC>   Confidence 95.

Test and CI for Two Proportions

Sample    X    N   Sample p
1        59  210   0.280952
2        25  190   0.131579

Estimate for p(1) - p(2): 0.149373
95% CI for p(1) - p(2): (0.0718770, 0.226870)
Test for p(1) - p(2) = 0 (vs not = 0): Z = 3.78   P-Value = 0.000
(69)
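The pooled z test in the first Minitab run above can be reproduced with a few lines of Python (an added sketch, not part of the original notes; it assumes NumPy and SciPy are available):

import numpy as np
from scipy.stats import norm

# Timolol example: number of angina-free patients and sample sizes
x1, n1 = 59, 210    # drug
x2, n2 = 25, 190    # placebo

p1_hat = x1 / n1
p2_hat = x2 / n2
p_hat  = (x1 + x2) / (n1 + n2)           # pooled estimate used under H0: p1 = p2

se_pooled = np.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se_pooled        # about 3.66, as in the Minitab output
p_value = norm.sf(z)                     # one-sided P(Z >= z) for Ha: p1 > p2

print(round(z, 2), round(p_value, 4))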
Example: Two telephone polls conducted by a news organization in September of
1993 and 1994 asked a random sample of American adults the following question:
"Do you favour a proposal to stop providing government health benefits and public
education to illegal immigrants and their children?" (Source: Time Magazine,
October 1994)
                      September 1993     September 1994
Sample Size               800 (n1)           800 (n2)
Number in favour          376 (x1)           440 (x2)

Test at α = .01 if the proportion of American adults in favour of this proposal has
increased from September 1993 to September 1994.

Solution:
(i)   H0 : p1 = p2
      Ha : p1 < p2      (α = .01)
(ii)  p̂1 =               p̂2 =

      p̂ = (X1 + X2)/(n1 + n2) =

      SE_Dp = √[ p̂(1 - p̂)(1/n1 + 1/n2) ] =

      z = (p̂1 - p̂2) / SE_Dp =
(iii) P-value =
(iv)  Decision:
(v)   Concluding Sentence:
(70)
Example 8.9; Page 592: Are men and women college students equally likely to be
frequent binge drinkers? We examine the survey data in Example 8.8 (Page 589) to
answer this question. Here is the summary data
Population        n        X
1 (men)        7180     1630
2 (women)      9916     1684
Total         17096     3314

Solution:
(i)   H0 : p1 = p2
      Ha : p1 ≠ p2      (α = .01)
(ii)  p̂1 = 0.227         p̂2 = 0.170

      p̂ = (X1 + X2)/(n1 + n2) = 0.194

      SE_Dp = √[ p̂(1 - p̂)(1/n1 + 1/n2) ] =

      z = (p̂1 - p̂2) / SE_Dp =
(iii) P-value =
(iv)  Decision:
(v)   Concluding Sentence:
(71)
Data Analysis for Two-way Tables
Example 9.1, Page 612: Here is a summary from Example 8.8 (page 589), where we
compared frequent binge drinking of men and women by examining the proportions
of each gender who engage in this activity.
Population        n        X
1 (men)        7,180    1,630
2 (women)      9,916    1,684
Total         17,096    3,314
In this chapter we use a different summary of the data. Rather than recording just
the count of binge drinkers, we record counts for all outcomes in a two-way table.
Example 9.2; Page 612: Here is the two-way table classifying students by gender
and whether or not they are binge drinkers:

Two-way table for frequent binge drinking and gender

                                  Gender
Frequent binge drinker       Men       Women       Total
Yes                        1,630       1,684       3,314
No                         5,550       8,232      13,782
Total                      7,180       9,916      17,096
Example 9.3; Page 613: For the above example, the joint distribution of binge
drinking and gender is as follows:
Joint distribution of frequent binge drinking and gender

                                  Gender
Frequent binge drinker       Men       Women       Total
Yes
No
Total
Because this is a distribution, the sum of the proportions should be 1. For this
example the sum is 1.001. The difference is due to roundoff error.
(72)
Marginal distributions:
Example 9.4; Page 614:
Marginal distribution of gender
                  Men      Women
Proportion

Often we prefer to use percents rather than proportions. Here is the marginal
distribution of gender described with percents:

Marginal distribution of gender
                  Men      Women
Percent

Example 9.5, Page 614: The following table gives the marginal distribution of
frequent binge drinking:

Marginal distribution of frequent binge drinking
                  Yes      No
Percent
(73)
Conditional distributions
Example 9.7; Page 616: For women, the conditional distribution of frequent binge
drinking in terms of percents is

Conditional distribution of frequent binge drinking for women
                  Yes      No
Percent

Conditional distribution of frequent binge drinking for men
                  Yes      No
Percent
Comparing the conditional distributions reveals the nature of the association
between gender and frequent binge drinking. In this set of data the men are more
likely to be frequent binge drinkers than the women.
The following bar graphs compare the women and men percents.
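(Added illustration, not part of the original notes; it assumes NumPy is available.) The joint, marginal and conditional distributions discussed in Examples 9.3 to 9.7 can all be computed directly from the two-way table:

import numpy as np

# Two-way table from Example 9.2: rows = frequent binge drinker (Yes, No), columns = (men, women)
counts = np.array([[1630, 1684],
                   [5550, 8232]])
n = counts.sum()                                   # 17,096 students in all

joint             = counts / n                     # each cell divided by the grand total
marginal_gender   = counts.sum(axis=0) / n         # column totals / n  (men, women)
marginal_binge    = counts.sum(axis=1) / n         # row totals / n     (yes, no)
cond_given_gender = counts / counts.sum(axis=0)    # each column divided by its own total

print(joint.round(3))                 # men-yes cell is about 0.095
print(marginal_gender.round(3))       # about 0.420 (men) and 0.580 (women)
print(cond_given_gender.round(3))     # yes row: about 0.227 for men, 0.170 for women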
(74)
Problem 9.1; Page 637: The Census Bureau collects data on years of school
completed by Americans of different ages. The following table gives the years of
education for three different age groups. People under the age of 25 are not included
because many have not yet completed their education. Note that the unit of measure
for each entry in the table is thousands of persons.

Years of school completed, by age (thousands of persons)

                                              Age Group
Education                          25 to 34    35 to 54    55 and over      Total
Did not complete high school          5,325       9,152         16,035     30,512
Completed high school                14,061      24,070         18,320     56,451
College, 1 to 3 years                11,659      19,926          9,662     41,247
College, 4 or more years             10,342      19,878          8,005     38,225
Total                                41,338      73,028         52,022    166,438
(a) Give the joint distribution of education and age for this table.
(b) What is the marginal distribution of age?
(c) What is the marginal distribution of education?

Solution:
(a) The joint distribution in percents is given as follows:

                                              Age Group
Education                          25 to 34    35 to 54    55 and over
Did not complete high school
Completed high school
College, 1 to 3 years
College, 4 or more years

(b) The marginal distribution of age in percents is

              25-34      35-54      55 and over

(c) The marginal distribution of education in percents is

              Did not complete high school
              Completed high school
              College, 1 to 3 years
              College, 4 or more years
              Total
(75)
Problem 9.2; Page 638: Refer to the previous exercise. Find the conditional
distribution of education for each of the three age categories. Make a bar graph for
each distribution and summarize their differences and similarities.

Solution: The conditional distributions in percents are given as follows:

                                              Age Group
Education                          25 to 34    35 to 54    55 and over
Did not complete high school
Completed high school
College, 1 to 3 years
College, 4 or more years

Bar Graphs:
(76)
Problem 9.3; Page 638: Refer to the previous exercise. Compute the conditional
distribution of age for each of the four education categories. Summarize the
distributions graphically and write a short paragraph describing the distributions
and how they differ.

Solution: The conditional distributions in percents are given as follows:

                                              Age Group
Education                          25 to 34    35 to 54    55 and over
Did not complete high school
Completed high school
College, 1 to 3 years
College, 4 or more years
Bar Graphs:
Simpson's Paradox: An association or comparison that holds for all of several
groups can be reversed when the data are combined to form a single group. This
reversal is called Simpson's Paradox.
(77)
χ² (Chi-Square) Test for Independence

Assumption: An srs of size n is chosen from a population. Two categorical variables
are measured for each individual and the sample data is classified in an r×c
contingency table.

H0 : Row and column categories are independent.
Ha : Row and column categories are dependent.

TEST STATISTIC:

χ² = Σ (Oij - Eij)² / Eij ;        Eij = (ri)(cj)/n

χ²0 is the observed value of χ².
The statistic follows a chi-square distribution with df = (r-1)(c-1).

P-VALUE
p-value = P(χ² ≥ χ²0)

NOTE: The chi-square distribution is a right-skewed distribution and has a single
parameter called its degrees of freedom.
The right-tail probabilities of the chi-square distribution are given in Table F.
(78)
Example: In a study on gender bias, a simple random sample of 300 new employees
in the banking industry were classified according to “Gender” and as to “whether
or not they received a promotion after an appropriate period of time”
                           GENDER
PROMOTION             MALE           FEMALE          TOTAL
YES                140 (O11)        48 (O12)       188 (r1)
NO                  70 (O21)        42 (O22)       112 (r2)
TOTAL              210 (c1)         90 (c2)        300 (n)

We wish to test
H0 : Gender and promotion are independent.
Ha : gender and promotion are dependent.      (α = .05)

We will calculate the expected frequencies under H0 (i.e. under independence).
To see how the Eij's are obtained, consider E12, which is the expected number of female
employees who were promoted.

E12 = nP(a female is promoted)
    = nP(Y and F)
    = nP(Y)P(F)                    [by independence]
    = n(r1/n)(c2/n)
    = (r1)(c2)/n

In general,  Eij = (ri)(cj)/n
(79)
χ²0 = (140 - 131.6)²/131.6 + (48 - 56.4)²/56.4 + (70 - 78.4)²/78.4 + (42 - 33.6)²/33.6
    = 0.536 + 1.251 + 0.900 + 2.100
    = 4.7873

degrees of freedom (df) = (2-1)(2-1) = 1
Therefore, .025 < p-value < .05.
Decision: p-value < .05, reject H0 .
There is sufficient evidence at  = .05 to conclude that gender and promotion are
not independent.
Note: The chi-square approximation is adequate for practical use when the average
expected cell count is 5 or greater and all individual expected counts are 1 or
greater, except in the case of 2×2 contingency tables. All four expected counts in a
2×2 table should be 5 or greater.
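The same test can be cross-checked in Python (an added sketch, not part of the original notes; it assumes SciPy is installed). Note that correction=False is needed, because by default SciPy applies Yates' continuity correction to 2×2 tables, whereas the notes use the plain chi-square statistic:

import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows = promoted (yes, no), columns = (male, female)
observed = np.array([[140, 48],
                     [ 70, 42]])

chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

print(expected)                 # 131.6, 56.4, 78.4, 33.6 -- the Eij found above
print(round(chi2, 4))           # about 4.7873
print(dof, round(p_value, 4))   # df = 1, p-value about 0.029 (between .025 and .05)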
(80)
EXAMPLE: A random sample of 200 individuals from a community was cross-classified
according to eye and hair colour.

                             HAIR COLOUR
EYE COLOUR        Black    Brunette    Blond    Red
Brown                25          41        9      3
Blue                  4          38        3     34
Other                 3          22       10      8

(a) Test at α = .05 if eye colour and hair colour are related.
(b) If H0 is rejected in (a), determine which cell contributes most to the decision.
    Interpret this cell.
Minitab Output
EXAMPLE: Consider the eye and hair colour example. Its Minitab output is given
below.
MTB > chis c1-c4
Chi-Square Test: BLACK, BRUNETTE, BLOND, RED

Expected counts are printed below observed counts

          BLACK   BRUNETTE   BLOND     RED   Total
    1        25         41       9       3      78
          12.48      39.39    8.58   17.55
    2         4         38       3      34      79
          12.64      39.90    8.69   17.77
    3         3         22      10       8      43
           6.88      21.72    4.73    9.68
Total        32        101      22      45     200

Chi-Sq = 12.560 + 0.066 + 0.021 + 12.063 +
          5.906 + 0.090 + 3.726 + 14.810 +
          2.188 + 0.004 + 5.872 +  0.290  = 57.594
DF = 6, P-Value = 0.000
1 cells with expected counts less than 5.0

(i)  H0 :
     Ha :
(81)
Continued:
(82)
Chi-Square Test for Comparing Several Populations
Assumption: Select independent srs’s from each of c populations of size n1 , n2 , . . .
nc . Classify each individual in a sample according to a categorical response variable
with r possible values. There are c different probability distributions, one for each
population.
H0 : The distributions of the response variable are the same in all c populations.
Ha : These c distributions are not all the same.

TEST STATISTIC:

χ² = Σ (Oij - Eij)² / Eij ;        Eij = (ri)(cj)/n

χ²0 is the observed value of χ².
The statistic follows a chi-square distribution with df = (r-1)(c-1).

P-VALUE
p-value = P(χ² ≥ χ²0)
(83)
Example: Suppose an investigator wishes to determine if there is a difference in the
proportion of male and female adults in Canada who favour gun control. He selects
independent simple random samples from each population and obtains the
following results:
                           GENDER
OPINION               MALE           FEMALE          TOTAL
YES                 60 (O11)       105 (O12)       165 (r1)
NO                  40 (O21)        45 (O22)        85 (r2)
TOTAL              100 (c1)        150 (c2)        250 (n)
We apply the chi-square test for homogeneity.
H0 : The two populations are homogeneous with respect to opinion on gun control.
Ha : The two populations are not homogeneous with respect to opinion on gun control.

We find the expected frequencies when H0 is true.
To see how the Eij are calculated, let's look at E12, which is the expected number of
females who favour gun control.

E12 = (sample size from the female population) × (estimated common proportion in favour)
    ≈ c2 (r1/n)
    = (r1)(c2)/n

Notice that E12 = (r1)(c2)/n.  In general, Eij = (ri)(cj)/n.
(84)
χ²0 =
There is insufficient evidence at  = .05 to conclude that the male and female
populations are not homogeneous with respect to opinion on gun control.
(85)
Note: The above problem can also be done using the test on two proportions.
                 MALE    FEMALE
YES                60       105
NO                 40        45
TOTAL             100       150

H0 : p1 = p2        Ha : p1 ≠ p2

p̂1 =           p̂2 =           p̂ =

SE_Dp = √[ p̂(1 - p̂)(1/n1 + 1/n2) ] =

z = (p̂1 - p̂2)/SE_Dp = (.6 - .7)/.0612 = -1.63

p-value = 2 P(Z ≥ 1.63) =

p-value > .05, do not reject H0.
There is insufficient evidence at α = .05 to conclude that the proportions of males and
females in favour of gun control are not the same.
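(Added aside, not part of the original notes; it assumes NumPy and SciPy.) For a 2×2 table the uncorrected chi-square statistic equals the square of the pooled z statistic, so the homogeneity test and the two-proportion test above give the same p-value:

import numpy as np
from scipy.stats import chi2_contingency, norm

# Gun-control example: rows = opinion (yes, no), columns = (male, female)
observed = np.array([[60, 105],
                     [40,  45]])
chi2, p_chi, dof, expected = chi2_contingency(observed, correction=False)

# Pooled two-proportion z test on the same data
x1, n1, x2, n2 = 60, 100, 105, 150
p1, p2 = x1 / n1, x2 / n2
p = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / np.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
p_z = 2 * norm.sf(abs(z))

print(round(z, 2), round(chi2, 2))       # about -1.63 and 2.67; chi2 equals z**2 (up to rounding)
print(round(p_z, 3), round(p_chi, 3))    # both p-values are about 0.10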
(86)
EXAMPLE: Students often claim that the distribution of grades varies depending
on the subject. From past records a university registrar chose random samples of
grades from selected first year courses in English, Mathematics and Biology. Care
was taken to exclude any student who may have taken two or more of these courses.
The distribution of grades for each sample is given below.
                ENGLISH    MATHEMATICS    BIOLOGY
GRADE
A                     5              8         13
B                    33             17         30
C                    42             10         24
D                    15              8          7
F                     5              8          6

(a) Test at α = .05 if the distribution of grades differs among the three subjects.
(b) If H0 is rejected in (a), determine which cell contributes most to the decision.
    Interpret this cell.
Minitab Output
MTB > chis c1-c3
Chi-Square Test: ENGLISH, MATHEMATICS, BIOLOGY

Expected counts are printed below observed counts

         ENGLISH   MATHEMAT   BIOLOGY   Total
    1         5          8        13       26
          11.26       5.74      9.00
    2        33         17        30       80
          34.63      17.66     27.71
    3        42         10        24       76
          32.90      16.78     26.32
    4        15          8         7       30
          12.99       6.62     10.39
    5         5          8         6       19
           8.23       4.19      6.58
Total       100         51        80      231

Chi-Sq =  3.477 + 0.890 + 1.773 +
          0.077 + 0.025 + 0.190 +
          2.517 + 2.739 + 0.205 +
          0.312 + 0.286 + 1.106 +
          1.265 + 3.452 + 0.051  = 18.363
DF = 8, P-Value = 0.019
1 cells with expected counts less than 5.0
(87)
Continued:
(i)  H0:
     Ha:
(88)
ONE-WAY ANALYSIS OF VARIANCE (CHAPTER 12)

Formula for sample variance (Review)

If x1, x2, ..., xn is a set of n observations, then the sample variance s² is given by

s² = Σ (xi - x̄)² / (n - 1)
   = [ (x1 - x̄)² + (x2 - x̄)² + ... + (xn - x̄)² ] / (n - 1)
   = SSTO / (n - 1)

SSTO is called the total sum of squares, and an alternative formula for SSTO is

SSTO = Σ (xi - x̄)²  =  Σ xi² - (Σ xi)²/n

The following sums of squares are used in One-way Analysis of Variance:
(i)   SST (or SSTO) = Total Sum of Squares
(ii)  SSG = Group Sum of Squares
(iii) SSE = Error Sum of Squares
To explain these SS’s we consider the following example.
EXAMPLE: A school of education conducted a study to compare four methods of
teaching reading comprehension (method A is the standard method). Twenty-four
grade 5 students of similar ability were randomly assigned to the four methods.
After four months of instruction, a standardized reading test (SRT) was given to
the students. Their scores are given below.

METHOD A: 42  42  43  45
METHOD B: 44  45  46  46  47  48
METHOD C: 46  47  48  48  48  51
METHOD D: 40  41  42  43  44  44  45  45
(89)

Solution: We call the above samples Sample 1, Sample 2, Sample 3 and Sample 4.

Sample 1: 42  42  43  45
n1 = 4,  Σx = 172,  x̄1 = 172/4 = 43,  SSTO1 = 6,  s1² = 6/3 = 2

Sample 2: 44  45  46  46  47  48
n2 = 6,  Σx = 276,  x̄2 = 276/6 = 46,  SSTO2 = 10,  s2² = 10/5 = 2

Sample 3: 46  47  48  48  48  51
n3 = 6,  Σx = 288,  x̄3 = 288/6 = 48,  SSTO3 = 14,  s3² = 14/5 = 2.8

Sample 4: 40  41  42  43  44  44  45  45
n4 = 8,  Σx = 344,  x̄4 = 344/8 = 43,  SSTO4 = 24,  s4² = 24/7 = 3.4286

COMBINED SAMPLE:
n = 24,  Σx = 1080,  x̄ = 1080/24 = 45,  SSTO = 162,  s² = 162/23 = 7.0435
In the above example

SSTO (or SST) = 162
This is called the total sum of squares and it measures the total variation of the sample.

SSG = n1(x̄1 - x̄)² + n2(x̄2 - x̄)² + n3(x̄3 - x̄)² + n4(x̄4 - x̄)²
    = 4(43 - 45)² + 6(46 - 45)² + 6(48 - 45)² + 8(43 - 45)²
    = 108
This is called the Group Sum of Squares and it measures the variation between groups.
(90)
SSE = SSTO1 + SSTO2 + SSTO3 + SSTO4
    = 6 + 10 + 14 + 24
    = 54
This is called the Error Sum of Squares and it measures the variation within groups.

Note: In one-way ANOVA the following result always holds:
SST (or SSTO) = SSG + SSE
In the above example SST = 162, SSG = 108, SSE = 54. Thus, 108 + 54 = 162.
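The sums of squares above can be verified with a short NumPy sketch (an added illustration, not part of the original notes):

import numpy as np

groups = [np.array([42, 42, 43, 45]),                     # Method A
          np.array([44, 45, 46, 46, 47, 48]),             # Method B
          np.array([46, 47, 48, 48, 48, 51]),             # Method C
          np.array([40, 41, 42, 43, 44, 44, 45, 45])]     # Method D

combined = np.concatenate(groups)
grand_mean = combined.mean()                                        # 45

sst = ((combined - grand_mean) ** 2).sum()                          # 162
ssg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)    # 108
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)              # 54

print(sst, ssg, sse)
print(np.isclose(sst, ssg + sse))       # the decomposition SST = SSG + SSE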
DEGREES OF FREEDOM: The above SS's have associated degrees of freedom,
denoted by DFT, DFG and DFE. If the total number of values in the combined sample
is N and the number of groups is I, then

DFT = N - 1,    DFG = I - 1,    DFE = N - I

The following result always holds:  DFT = DFG + DFE

MEAN SQUARES: We will need the following mean squares in one-way ANOVA:

(i)  MSE = SSE/DFE          (ii)  MSG = SSG/DFG
(91)
TESTING HYPOTHESES IN ONE-WAY ANOVA

ASSUMPTION: The data arise as I independent simple random samples from I
normal distributions with equal variances σ².

HYPOTHESES:
H0 : μ1 = μ2 = ... = μI
Ha : not all of the μi are the same.

TEST STATISTIC: is given by the following analysis of variance table.

SOURCE     DEGREES OF FREEDOM    SUM OF SQUARES    MEAN SQUARE          F
GROUPS            I - 1                SSG              MSG        F = MSG/MSE
ERROR             N - I                SSE              MSE
TOTAL             N - 1                SST

The statistic F = MSG/MSE has an F-distribution with df = (I-1, N-I).

P-VALUE
p-value = P(F ≥ F0), where F0 is the observed value of the F statistic.
The upper-tail probabilities of the F distribution are given in Table E.
(92)
EXAMPLE: The analysis of variance table for the previous example is

SOURCE     DEGREES OF FREEDOM    SUM OF SQUARES    MEAN SQUARE             F
GROUPS             3               SSG = 108         MSG = 36     F = 36/2.7 = 13.33
ERROR             20               SSE = 54          MSE = 2.7
TOTAL             23               SST = 162

df = (3, 20)

F-CURVE WITH df = (3, 20):  P(F ≥ 8.10) = .001
Therefore, p-value = P(F ≥ 13.33) < .001
Decision: Reject H0.
There is sufficient evidence at α = .05 to conclude that the mean SRT scores differ
for students taught by the different teaching methods.
(93)
Minitab Output

MTB > stac c1-c4 c5;
SUBC> subs c6.
MTB > onew c5 c6

One-way ANOVA: C5 versus C6

Analysis of Variance for C5
Source     DF        SS       MS        F        P
C6          3    108.00    36.00    13.33    0.000
Error      20     54.00     2.70
Total      23    162.00

Level    N      Mean    StDev
1        4    43.000    1.414
2        6    46.000    1.414
3        6    48.000    1.673
4        8    43.000    1.852
Pooled StDev = 1.643

[Individual 95% CIs for each level mean, based on the pooled StDev, are plotted on a
scale from 42.5 to 50.0]
(94)
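The same one-way ANOVA can be reproduced outside Minitab. The Python sketch below (an added illustration, not part of the original notes; it assumes SciPy is installed) runs the test on the four method samples:

from scipy.stats import f_oneway

method_a = [42, 42, 43, 45]
method_b = [44, 45, 46, 46, 47, 48]
method_c = [46, 47, 48, 48, 48, 51]
method_d = [40, 41, 42, 43, 44, 44, 45, 45]

# One-way ANOVA of H0: all four method means are equal
result = f_oneway(method_a, method_b, method_c, method_d)
print(round(result.statistic, 2))   # about 13.33, i.e. MSG/MSE = 36/2.7
print(round(result.pvalue, 4))      # well below .001, matching P = 0.000 above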
EXAMPLE: Below is a partially completed One-Way ANOVA table.

SOURCE     DF     SS     MS     F
GROUPS      -    130     65
ERROR       -    580     29
TOTAL       -      -

(a) Fill in the missing values.
(b) State H0 and Ha.
(c) Conduct a test of H0 and Ha at α = .05.

Solution:

The coefficient of determination: We define the coefficient of determination as

R² = SSG/SST

For the above problem the value of R² or R-square is 0.183. This result says that
the FIT part of the model (that is, differences among the means of the groups) accounts
for 18.3% of the total variation in the data.
(95)
EXAMPLE: Below is a partially completed One-Way ANOVA table.

SOURCE          DF      SS     MS     F
TREATMENTS       -       -      -     -
ERROR            -     600     25
TOTAL           27    1500

(a) Fill in the missing entries in the table.
(b) State H0 and Ha.
(c) Conduct a test of H0 and Ha at α = .01.
(96)
PROBLEM 12.23(b); PAGE 787: Many studies have suggested that there is a link between
exercise and healthy bones. Exercise stresses the bones and this causes them to get
stronger. One study examined the effect of jumping on the bone density of growing
rats. There were three treatments: a control with no jumping, a low-jump condition
(the jump height was 30 centimeters), and a high-jump condition (60 centimeters).
After 8 weeks of 10 jumps per day, 5 days per week, the bone density of the rats
(expressed in mg/cm3) was measured. Here are the data:

GROUP                      BONE DENSITY (mg/cm3)
CONTROL     611  621  614  593  593  653  600  554  603  569
LOW JUMP    635  605  638  594  599  632  631  588  607  596
HIGH JUMP   650  622  626  626  631  622  643  674  643  650
Run the analysis of variance. Report the F statistic with its degrees of freedom and
the p-value. What do you conclude?
Results for: Ex12_023.mtp
MTB > onew c4 c5

One-way ANOVA: Bone Density versus Treatment

Analysis of Variance for Bone Density
Source        DF       SS      MS       F       P
Treatment      2     7434    3717    7.98   0.002
Error         27    12580     466
Total         29    20013

Level    N      Mean    StDev
1       10    601.10    27.36
2       10    612.50    19.33
3       10    638.70    16.59
Pooled StDev = 21.58

[Individual 95% CIs for each treatment mean, based on the pooled StDev, are plotted on
a scale from 600 to 640]
(97)
EXAMPLE: The data below gives the strength measurements for a type of concrete
when cured at three different temperatures. The water/cement ratio was held
constant throughout the experiment.
Temperature A 46.3 43.7 51.2 49.6 48.8
Temperature B 48.6 52.3 50.9 53.6 55.7
Temperature C 45.1 46.7 41.8 40.4 42.6
To compare the average strength at the various temperatures, a One-Way ANOVA
was conducted using Minitab. Use the following output to answer the questions
below.
MTB > stac c1-c3 c4;
SUBC> subs c5.
MTB > onew c4 c5
One-way ANOVA: C4 versus C5

Analysis of Variance for C4
Source      DF        SS       MS        F        P
C5           2    198.10    99.05    13.27    0.001
Error       12     89.60     7.47
Total       14    287.70

Level    N      Mean    StDev
1        5    47.920    2.949
2        5    52.220    2.686
3        5    43.320    2.547
Pooled StDev = 2.733

[Individual 95% CIs for each level mean, based on the pooled StDev, are plotted on a
scale from 44.0 to 52.0]

(a) State H0 and Ha.
(b) State the p-value and use it to conduct the test. (α = .05)
(c) Find the MSE and the pooled estimate of σ. How are these two numbers related?
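As a hint for part (c), the Python sketch below (an added illustration, not part of the original notes; it assumes NumPy and SciPy) reproduces the F test and shows that the pooled StDev reported by Minitab is the square root of the MSE:

import numpy as np
from scipy.stats import f_oneway

temp_a = [46.3, 43.7, 51.2, 49.6, 48.8]
temp_b = [48.6, 52.3, 50.9, 53.6, 55.7]
temp_c = [45.1, 46.7, 41.8, 40.4, 42.6]

res = f_oneway(temp_a, temp_b, temp_c)
print(round(res.statistic, 2), round(res.pvalue, 3))    # about 13.27 and 0.001

# MSE is the pooled sample variance; the pooled estimate of sigma is its square root
groups = [np.array(g) for g in (temp_a, temp_b, temp_c)]
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
dfe = sum(len(g) - 1 for g in groups)                   # 12
mse = sse / dfe                                         # about 7.47
print(round(mse, 2), round(np.sqrt(mse), 3))            # 7.47 and 2.733 (= Pooled StDev)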
(98)
Example: We use one-way ANOVA to compare the average strength at the various
temperatures for the following three random samples:
Temperature A: 46.3  47.3
Temperature B: 48.6  52.3
Temperature C: 45.1  46.7

Solution:
SSTO1 =
SSTO2 =
SSTO3 =
SSE =
SSG =
SST=
(99)
Continued:
ANOVA TABLE
Concluding Sentence:
(100)