Design & Analysis of Experiments 8E 2012 Montgomery
STT 511-STT411:
DESIGN OF EXPERIMENTS AND
ANALYSIS OF VARIANCE
Dr. Cuixian Chen
Chapter 2
Chapter 2: Some Basic Statistical Concepts
Review of STT215: Chapter 3
3.1 Design Of Experiments
(Outline of a randomized design)
Completely randomized experimental designs: Individuals are randomly
assigned to groups, then the groups are randomly assigned to treatments.
Example 3.13, page 179
What are the effects of repeated exposure to an advertising message (digital
camera)? The answer may depend on the length of the ad and on how often it is
repeated. Outline the design of this experiment with the following information.
Subjects: 150 Undergraduate students.
Two Factors: length of the commercial (30 seconds and 90 seconds – 2 levels) and
repeat times (1, 3, or 5 times – 3 levels)
Response variables: their recall of the ad, their attitude toward the camera, and
their intention to purchase it. (see page 187 for the diagram.)
HWQ: 3.18, 3.30(b), 3.32
3.1 Design Of Experiments (Block designs)
In a block, or stratified, design, subjects are divided into groups, or blocks,
prior to experiments to test hypotheses about differences between the groups.
The blocking, or stratification, here is by gender (blocking factor).
EX3.19
Ex: 3.17 (p182), 3.18
HWQ: 3.47(a,b), 3.126.
3.1 Design Of Experiments (Matched pairs designs)
Matched pairs: Choose pairs of subjects that are closely matched (e.g., same sex,
height, weight, age, and race). Within each pair, randomly assign who will receive
which treatment.
It is also possible to just use a single person, and give the two treatments to this
person over time in random order. In this case, the “matched pair” is just the same
person at different points in time.
The most closely matched pair studies use identical twins.
HWQ 3.120
STT511-411: Chapter 2 – Some Basic Statistical Concepts
Design of Engineering Experiments
Chapter 2 – Some Basic Statistical Concepts

Describing sample data
  Random samples
  Sample mean, variance, standard deviation
  Populations versus samples
  Population mean, variance, standard deviation
  Estimating parameters
Simple comparative experiments
  The hypothesis testing framework
  The two-sample t-test
  Checking assumptions, validity
Review of one sample inference in Stt215

Estimation of Parameters
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ estimates the population mean $\mu$.
$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ estimates the variance $\sigma^2$.

Sampling Distribution
$Z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$, if σ is known.
$t = \frac{\bar{y} - \mu}{s/\sqrt{n}} \sim t(df = n - 1)$, if σ is unknown.
Normal and T distribution
When n is very large, s is a very good estimate of σ and the
corresponding t distributions are very close to the normal distribution.
The t distributions become wider for smaller sample sizes, reflecting the
lack of precision in estimating σ from s.
How can we use a computer to help us understand
the distributions?

Introducing R.
1. Where to find it? Google “R” and follow the first link.
2. Download and install R from CRAN.
3. Then we have R installed.

How do we use it?
1. Assign a value to a variable x: x=3; or x<-3;
2. A sequence of numbers: 1:5; or 6:3;
3. A vector: x=c(4,5,6); or x=4:6; then x[2] is 5.
4. Loop: for (i in 1:5) {print(i)};
5. Average: mean(x);
6. Sum: sum(x);
Entry level of R in 10 mins.
Normal distribution in R
Normal distribution functions in R:
  dnorm(x, µ, σ), for the density;
  pnorm(x, µ, σ), for the left-tail probability: Pr(X<=x);
  qnorm(per, µ, σ), for the quantile: given Pr(X<=x)=per, find x;
  rnorm(N, µ, σ), for random number generation.
Eg: use R to find the following for N(µ=3, σ=4):
1. P(X<4);
2. P(X>2);
3. P(1<X<4);
4. the 95th percentile.
Note: if X ~ N(µ, σ²), then Z=(X-µ)/σ ~ N(0,1), which is the standard normal
distribution.
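The four exercises above can be answered directly with pnorm and qnorm. The following is a minimal sketch; the variable names (mu, sigma, p1, p2, p3, q95) are ours and do not come from the slides.

```r
# N(mu = 3, sigma = 4); R expects the standard deviation, not the variance
mu <- 3
sigma <- 4

p1  <- pnorm(4, mu, sigma)                        # 1. P(X < 4)
p2  <- 1 - pnorm(2, mu, sigma)                    # 2. P(X > 2)
p3  <- pnorm(4, mu, sigma) - pnorm(1, mu, sigma)  # 3. P(1 < X < 4)
q95 <- qnorm(0.95, mu, sigma)                     # 4. 95th percentile

print(c(p1, p2, p3, q95))
```

Because the distribution is symmetric about µ = 3, answers 1 and 2 coincide.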
T distribution in R
T-distribution functions in R:
  dt(x, df), for the density;
  pt(x, df), for the left-tail probability: Pr(X<=x);
  qt(per, df), for the quantile: given Pr(X<=x)=per, find x;
  rt(N, df), for random number generation.
Eg: use R to find the following for t(df=6):
1. P(X<4);
2. P(X>2);
3. P(1<X<4);
4. the 95th percentile.
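The same four exercises for the t distribution with 6 degrees of freedom might be worked as follows. This is a sketch under our own variable names (df, p1, p2, p3, q95).

```r
df <- 6

p1  <- pt(4, df)              # 1. P(T < 4)
p2  <- 1 - pt(2, df)          # 2. P(T > 2)
p3  <- pt(4, df) - pt(1, df)  # 3. P(1 < T < 4)
q95 <- qt(0.95, df)           # 4. 95th percentile

print(c(p1, p2, p3, q95))
```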
One sample confidence interval and
hypothesis testing
Review: Confidence levels
Confidence intervals contain the population mean µ in C% of samples.
Different areas under the curve give different confidence levels C.
z* is related to the chosen confidence level C: C is the area under the standard
normal curve between −z* and z*.
The one-sample Z-confidence interval is thus: $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
Example: For an 80% confidence level C, 80% of the normal curve’s
area is contained in the interval between −z* and z*.
Review: 5 Steps for Hypothesis testing
1. State H0 and Ha.
2. State the level of significance (usually α is 5%).
3. Calculate the test statistic, ASSUMING THE NULL HYPOTHESIS IS TRUE.
4. Find the P-value, that is, the probability (assuming H0 is true) that the
   test statistic would take a value as extreme as or more extreme than
   the one actually observed (in the direction of Ha).
5. Draw a conclusion:
   If P-value ≤ α, then we reject H0 (enough evidence).
   If P-value > α, then we do not reject H0 (not enough evidence).
Note: The two possible conclusions are rejecting or not rejecting H0.
P-value in one-sided and two-sided tests
One-sided (one-tailed) test
Two-sided (two-tailed) test
To calculate the P-value for a two-sided test, use the symmetry of the normal
curve: find the P-value for the one-sided test in the direction of the observed statistic, and double it.
Review: Find P-value
The P-value is the area under the sampling distribution for values
at least as extreme, in the direction of Ha, as that of our random
sample.
Use the R functions we just learned to find the p-value:
e.g. H0 : µ = 2.6 hours versus Ha : µ < 2.6 hours gives test statistic
Z=-1.6. Q: Find the p-value.
[Figure: sampling distribution of x̄ defined by H0, centered at µ with standard deviation σ/√n]
Example 1: One-sample Z-test
A test of the null hypothesis H0 : µ = µ0 gives test statistic Z=-1.6
a) What is the P-value if the alternative is Ha : µ > µ0 ?
b) What is the P-value if the alternative is Ha : µ < µ0 ?
c) What is the P-value if the alternative is Ha : µ ≠ µ0 ?
Example 1 (cont.): One-sample Z-test
A test of the null hypothesis H0 : µ = µ0 gives test statistic Z=2.1
a) What is the P-value if the alternative is Ha : µ > µ0 ?
b) What is the P-value if the alternative is Ha : µ < µ0 ?
c) What is the P-value if the alternative is Ha : µ ≠ µ0 ?
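Both parts of Example 1 can be checked with pnorm. The helper z_pvalues below is a hypothetical convenience function of ours, not something from the slides or from base R; it simply returns the three P-values for a given Z.

```r
# P-values for a one-sample Z statistic under the three alternatives
z_pvalues <- function(z) {
  c(greater   = 1 - pnorm(z),        # Ha: mu > mu0 (right tail)
    less      = pnorm(z),            # Ha: mu < mu0 (left tail)
    two.sided = 2 * pnorm(-abs(z)))  # Ha: mu != mu0 (both tails)
}

pv1 <- z_pvalues(-1.6)  # Example 1
pv2 <- z_pvalues(2.1)   # Example 1 (cont.)
print(round(pv1, 4))
print(round(pv2, 4))
```

Note how the answer to (a) for Z = -1.6 is large: the statistic points away from Ha, so there is no evidence for that alternative.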
Example 2: One sample Z-test or
One sample Z-Confidence Interval
The National Center for Health Statistics reports that the mean
systolic blood pressure for males 35 to 44 years of age is 128
with a population SD=15. The medical director of a company
looks at the medical records of 72 company executives in this
age group and finds that the mean systolic blood pressure in this
sample is 126.07.
1) Is this evidence that executives' blood pressures are different
from the national average?
2) Find the 95% confidence interval for the average SBP of all
company executives.
Example 2: One sample Z-test or
One sample Z-Confidence Interval
Answer: Hypotheses: H0 : µ = 128 v.s. Ha : µ ≠ 128, with α = 5%.
Test statistic: with $\bar{x} = 126.07$, $\sigma = 15$, and $n = 72$,
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{126.07 - 128}{15/\sqrt{72}} = -1.09$$
P-value= 2*pnorm(-1.09)= 0.2757131. Conclusions: ….

The 95% confidence interval for the average SBP of all company
executives uses Z*=qnorm(0.975)=1.96:
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) = \left(126.07 - 1.96\frac{15}{\sqrt{72}},\ 126.07 + 1.96\frac{15}{\sqrt{72}}\right) = (122.61, 129.53)$$

The conclusions from the two-sided HT (α=5%) and the CI (95%) are
consistent.
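The Example 2 arithmetic can be reproduced from the summary statistics alone. This is a sketch with our own variable names; because it keeps z unrounded, the P-value differs slightly from the slide's value, which was computed from the rounded z = -1.09.

```r
# Summary statistics from Example 2
xbar  <- 126.07
mu0   <- 128
sigma <- 15     # population SD (known)
n     <- 72

se      <- sigma / sqrt(n)        # standard error of the mean
z       <- (xbar - mu0) / se      # Z statistic
p.value <- 2 * pnorm(-abs(z))     # two-sided P-value

z.star <- qnorm(0.975)            # critical value for 95% confidence
ci     <- xbar + c(-1, 1) * z.star * se

print(c(z = z, p.value = p.value))
print(ci)
```

Since 128 lies inside the interval, the two-sided test at α = 5% does not reject H0, which is the consistency the slide refers to.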
Example 3: One sample t-test or
One sample t-Confidence Interval
The National Center for Health Statistics reports that the mean
systolic blood pressure for males 35 to 44 years of age is 128.
The medical director of a company looks at the medical records
of 72 company executives in this age group and finds that the
mean systolic blood pressure in this sample is 126.07 with sample
SD 15.
1) Is this evidence that executives' blood pressures are different
from the national average?
2) Find the 95% confidence interval for the average SBP of all
company executives.
Example 3: One sample t-test or
One sample t-Confidence Interval
Answer: Hypotheses: H0 : µ = 128 v.s. Ha : µ ≠ 128, with α = 5%.
Test statistic: with $\bar{x} = 126.07$, $s = 15$, and $n = 72$,
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{126.07 - 128}{15/\sqrt{72}} = -1.09; \quad df = 72 - 1 = 71$$
P-value= 2*pt(-1.09, df=71)=0.2793988. Conclusions: ….
The 95% confidence interval for the average SBP of all company
executives uses t*=qt(0.975, 71)=1.99:
$$\left(\bar{x} - t^*_{(n-1)}\frac{s}{\sqrt{n}},\ \bar{x} + t^*_{(n-1)}\frac{s}{\sqrt{n}}\right) = \left(126.07 - 1.99\frac{15}{\sqrt{72}},\ 126.07 + 1.99\frac{15}{\sqrt{72}}\right) = (122.55, 129.59)$$
The conclusions from the two-sided HT (α=5%) and the CI (95%) are consistent.
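The t version of the same calculation only swaps pnorm/qnorm for pt/qt with df = n - 1. A minimal sketch, again with variable names of our own choosing:

```r
# Summary statistics from Example 3
xbar <- 126.07
mu0  <- 128
s    <- 15      # sample SD (sigma unknown)
n    <- 72
df   <- n - 1

se      <- s / sqrt(n)
t0      <- (xbar - mu0) / se
p.value <- 2 * pt(-abs(t0), df)   # two-sided P-value from t(71)

t.star <- qt(0.975, df)           # slightly larger than 1.96
ci     <- xbar + c(-1, 1) * t.star * se

print(c(t0 = t0, df = df, p.value = p.value))
print(ci)
```

The interval is a little wider than the Z interval because t* exceeds z* for any finite df.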
One sample test
(Shall we use Z-test or T-test ??)
Example 4: A new medicine treating cancer was introduced to the
market decades ago, and the company claimed that on average
it will prolong a patient’s life by 5.2 years. Suppose the SD for all
cancer patients is 2.52. In a 10-year study with 64 patients, the
average prolonged lifetime is 4.6 years.
1) Under the normality assumption, do the 10-year study’s data show
a different average prolonged lifetime?
2) Find the 95% confidence interval for the average prolonged
lifetime for all patients.
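One reading of Example 4: the population SD is given, so a Z procedure seems to be the intended tool. The slides do not show a worked answer, so the sketch below is only our attempt; the function one_sample_z is a hypothetical helper, not part of base R.

```r
# One-sample Z-test and Z-CI from summary statistics (sigma known)
one_sample_z <- function(xbar, mu0, sigma, n, conf = 0.95) {
  se     <- sigma / sqrt(n)
  z      <- (xbar - mu0) / se
  z.star <- qnorm(1 - (1 - conf) / 2)
  list(z = z,
       p.value = 2 * pnorm(-abs(z)),            # two-sided alternative
       ci = xbar + c(-1, 1) * z.star * se)
}

ex4 <- one_sample_z(xbar = 4.6, mu0 = 5.2, sigma = 2.52, n = 64)
print(ex4)
```

Under these assumptions the claimed value 5.2 falls just inside the 95% interval, matching a two-sided P-value slightly above 0.05.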
One sample test
(Shall we use Z-test or T-test ??)
Example 5: A new medicine treating cancer was introduced to the
market decades ago, and the company claimed that on average
it will prolong a patient’s life by 5.2 years. In a 10-year study
with 20 patients, the average prolonged lifetime is 4.7 years with
sample SD 2.50.
1) Under the normality assumption, do the 10-year study’s data show
a different average prolonged lifetime?
2) Find the 95% confidence interval for the average prolonged
lifetime for all patients.
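In Example 5 only the sample SD is available, which points to the t procedure with df = 19. Again the slides give no worked answer; the helper one_sample_t below is a hypothetical function of ours, sketched under that reading.

```r
# One-sample t-test and t-CI from summary statistics (sigma unknown)
one_sample_t <- function(xbar, mu0, s, n, conf = 0.95) {
  df     <- n - 1
  se     <- s / sqrt(n)
  t0     <- (xbar - mu0) / se
  t.star <- qt(1 - (1 - conf) / 2, df)
  list(t = t0, df = df,
       p.value = 2 * pt(-abs(t0), df),          # two-sided alternative
       ci = xbar + c(-1, 1) * t.star * se)
}

ex5 <- one_sample_t(xbar = 4.7, mu0 = 5.2, s = 2.50, n = 20)
print(ex5)
```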
Review: Link between confidence level and margin of
error for one-sample z-CI:
The margin of error depends on z*:
$$MOE = z^* \frac{\sigma}{\sqrt{n}}$$
Higher confidence C implies a larger
margin of error m (thus less precision
in our estimates).
A lower confidence level C produces a
smaller margin of error m (thus better
precision in our estimates).
[Figure: standard normal curve with central area C between −z* and z*, and margin of error m on each side]
Example 6: Finding sample size


In a clinical study with a certain number of patients, a new medicine
prolongs life by 4 years on average. Suppose the SD for all
cancer patients is 0.75.
Q: How large a sample of cancer patients would be needed
to estimate the mean within ±0.1 years with 90% confidence?
Z*=1.645; MOE=(1.645)*(0.75)/sqrt(n)=0.1; so n=(1.645*0.75/0.1)^2=152.2139.
Rounding up, we take n=153.
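The sample-size calculation can be wrapped in a small function. The name sample_size and its arguments are our own; the logic is just the MOE formula solved for n and rounded up.

```r
# Smallest n such that z* * sigma / sqrt(n) <= moe
sample_size <- function(sigma, moe, conf) {
  z.star <- qnorm(1 - (1 - conf) / 2)  # e.g. qnorm(0.95) for 90% confidence
  ceiling((z.star * sigma / moe)^2)    # always round up
}

n <- sample_size(sigma = 0.75, moe = 0.1, conf = 0.90)
print(n)
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed the target.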
Confidence intervals to test hypotheses
A level α two-sided significance test rejects H0: µ = µ0 exactly when the
hypothesized value µ0 falls outside a level (1 − α)100% confidence interval for µ.
In a two-sided test, C = 1 – α, where C is the confidence level and α is the
significance level; each tail has area α/2.
One sample test in R
One-sample t-test;
One-sample Z-test;
One-sample Z-CIs and one-sample t-CIs.

## Suppose that these are the house prices of 5
## randomly selected houses from Wilmington
x <- c(20, 25, 28, 33, 37)
mean(x)
var(x)
sd(x)

## We want to test whether the average house price is less than 32
## One-sample t-test by hand in R
## For H0: mu=32, v.s. Ha: mu<32
n <- length(x)
t.val <- (mean(x) - 32) / (sd(x) / sqrt(n))
df.dat <- n - 1
p.value <- pt(t.val, df.dat)   # left-tail area, because Ha is mu<32
print(p.value)

## One-sample test by t.test in R
## For H0: mu=32, v.s. Ha: mu<32
t.test(x, alternative = "less", mu = 32)
## For H0: mu=32, v.s. Ha: mu>32
t.test(x, alternative = "greater", mu = 32)
## For H0: mu=32, v.s. Ha: mu≠32 (two-sided is the default)
t.test(x, mu = 32)

## 95% CI in R
LB <- mean(x) - qt(.975, df.dat) * sd(x) / sqrt(n)
UB <- mean(x) + qt(.975, df.dat) * sd(x) / sqrt(n)
print(c(LB, UB))
One sample test in SAS
title 'One-Sample t Test';
data value;
  input value @@;
  datalines;
20 25 28 33 37
;
run;

/* 3 options for sides: 2 = not equal (default), L = less than, U = greater than */
proc ttest data=value h0=32 sides=2; /* default */
  var value;
run;

proc ttest data=value h0=32 sides=U;
  var value;
run;

proc ttest data=value h0=32 sides=L;
  var value;
run;

proc ttest data=value h0=32;
  var value;
run;
Matched-pair sample (one-sample) confidence
interval and hypothesis testing
Matched pairs t procedures
for dependent sample
Subjects are matched in “pairs” and outcomes are compared within each unit.
Example: Pre-test and post-test studies look at data collected on
the same sample elements before and after some experiment is
performed.
Example: Twin studies often try to sort out the influence of genetic
factors by comparing a variable between sets of twins.
We perform hypothesis testing on the difference in each unit.
Matched pairs (one-sample)
The variable studied becomes Xdifference = (X1 − X2). The null
hypothesis is that there is NO difference between the two paired groups.
H0: µdifference = 0; Ha: µdifference > 0 (or < 0, or ≠ 0)
When stating the alternative, be careful how you are calculating
the difference (after – before or before – after).
Conceptually, this is not different from tests on one population.
Matched Pairs (one-sample)



If we take After – Before, the alternative depends on what we want to show:
  The “After group” has increased over the “Before group”: Ha: µdiff > 0
  The “After group” has decreased: Ha: µdiff < 0
  The two groups are different: Ha: µdiff ≠ 0
Test statistic: $t_{diff} = \frac{\bar{x}_{diff} - \mu_{diff}}{s_{diff}/\sqrt{n}}$
Example 4: Matched Pairs t-test
Many people believe that the moon influences the actions of some individuals. A
study of dementia patients in nursing homes recorded various types of
disruptive behaviors every day for 12 weeks. Days were classified as moon
days and other days. For each patient the average number of disruptive
behaviors was computed for moon days and for other days. The data for 5
subjects whose behavior were classified as aggressive are presented as
below:
Moon days   Other days   Difference
3.33        0.27         3.06
3.67        0.59         3.08
2.67        0.32         2.35
3.33        0.19         3.14
3.33        1.26         2.07
We want to test whether there is any difference in aggressive behavior on
moon days and other days.
Answer to Example 4
Let difference = aggressive behavior on moon days minus aggressive behavior on other days.
H0: µd = 0 versus Ha: µd ≠ 0, with α = 0.05.
t-statistic=12.377, df=5-1=4,
p-value=2.449*10^(-4).
Reject H0 at the 5% level.
There is enough evidence to conclude that there is a difference in
aggressive behavior between moon days and other days.
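The matched-pairs answer can be reproduced in R from the five pairs in the table, both by hand on the differences and with t.test(..., paired = TRUE). This is a sketch using the rounded table values, so the statistic should land close to, but not necessarily exactly on, the value reported above. The vector names moon and other are ours.

```r
moon  <- c(3.33, 3.67, 2.67, 3.33, 3.33)
other <- c(0.27, 0.59, 0.32, 0.19, 1.26)

# By hand: a one-sample t-test on the differences
d       <- moon - other
n       <- length(d)
t0      <- mean(d) / (sd(d) / sqrt(n))
p.value <- 2 * pt(-abs(t0), n - 1)   # two-sided
print(c(t0 = t0, df = n - 1, p.value = p.value))

# Same test with the built-in function
res <- t.test(moon, other, paired = TRUE)
print(res)
```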
Does lack of caffeine increase depression?
Individuals diagnosed as caffeine-dependent are deprived of caffeine-rich foods
and assigned to receive daily pills. Sometimes, the pills contain caffeine and other
times they contain a placebo. Depression was assessed.
Q: Does lack of caffeine increase depression?
There are 2 data points for each subject, but we’ll only look at the
difference. The sample distribution appears appropriate for a t-test.

Subject   Depression with Caffeine   Depression with Placebo
1         5                          16
2         5                          23
3         4                          5
4         3                          7
5         8                          14
6         5                          24
7         0                          6
8         0                          3
9         2                          15
10        11                         12
11        1                          0
Does lack of caffeine increase depression?
For each individual in the sample, we have calculated a difference in depression score
(placebo minus caffeine).

Subject   Depression with Caffeine   Depression with Placebo   Placebo − Caffeine
1         5                          16                        11
2         5                          23                        18
3         4                          5                         1
4         3                          7                         4
5         8                          14                        6
6         5                          24                        19
7         0                          6                         6
8         0                          3                         3
9         2                          15                        13
10        11                         12                        1
11        1                          0                         -1

There were 11 “difference” points, thus df = n − 1 = 10.
We calculate that x̄ = 7.36; s = 6.92.
H0: µdifference = 0; Ha: µdifference > 0
$$t = \frac{\bar{x} - 0}{s/\sqrt{n}} = \frac{7.36}{6.92/\sqrt{11}} = 3.53$$
For df = 10, p-value=0.0027.
(1) Since p-value < 0.05, reject H0.
(2) We have enough evidence to conclude that
caffeine deprivation causes a significant increase in depression.
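The caffeine example can be checked from the table with a one-sided test on the differences. A minimal sketch; the vector names caffeine and placebo are ours.

```r
caffeine <- c(5, 5, 4, 3, 8, 5, 0, 0, 2, 11, 1)
placebo  <- c(16, 23, 5, 7, 14, 24, 6, 3, 15, 12, 0)

d <- placebo - caffeine          # placebo minus caffeine, as on the slide
print(c(mean = mean(d), sd = sd(d)))

# Ha: mean difference > 0, i.e. depression is higher without caffeine
res <- t.test(d, mu = 0, alternative = "greater")
print(res)
```

Equivalently, t.test(placebo, caffeine, paired = TRUE, alternative = "greater") tests the same hypothesis.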
Two independent samples confidence interval
and hypothesis testing
Two (independent) sample scenario
Portland Cement Formulation (page 26)
An engineer is studying the formulation of a Portland cement mortar. He has added a
polymer latex emulsion during mixing to determine if this impacts the curing time and
tension bond strength of the mortar.
The experimenter prepared 10 samples of the original formulation and 10 samples of the
modified formulation.
Q: How many factor(s)? How many levels?
Factor: mortar formulation;
Levels: two different formulations, treated as two treatments or two levels.
Graphical View of the Data
Dot Diagram, Fig. 2.1, pp 26
Q: Visually, do you see any difference between these two samples?
Q: If yes, do you see large, modest or very small difference?
Q: How to compare the difference between these two samples?
The Hypothesis Testing Framework for two
sample t-test
Statistical hypothesis testing is a useful framework for many
experimental situations.
Origins of the methodology date from the early 1900s.
We will use procedures known as the two-sample Z-test and the
two-sample t-test.
Another example: If you have a large sample, a
histogram may be useful
Graphical description of variability: with 200 observations
Noise: called experimental error.
Box Plots, Fig. 2.3, pp. 28
Inferences about the differences in means,
Randomized designs
The Hypothesis Testing Framework:
Sampling from a normal distribution
Statistical hypotheses:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Errors in Hypothesis testing
If the null hypothesis is rejected when it is true, a type I error
has occurred:
α = Pr(Type I error) = Pr(Reject H0 | H0 is true).
α is also called the significance level.
If the null hypothesis is not rejected when it is false, a type II
error has occurred:
β = Pr(Type II error) = Pr(Fail to reject H0 | H0 is false)
Power = 1 − β = Pr(Reject H0 | H0 is false)
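These definitions can be made concrete with a small simulation. The sketch below is our own illustration, not from the slides: the seed, the number of replications, the sample size of 10 per group, and the one-SD shift used for the "H0 false" case are all arbitrary assumptions.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
reps  <- 2000    # number of simulated experiments
n     <- 10      # observations per group
alpha <- 0.05

# H0 true: both groups come from N(0, 1), so every rejection is a type I error
reject.h0.true <- replicate(reps,
  t.test(rnorm(n), rnorm(n), var.equal = TRUE)$p.value <= alpha)

# H0 false: second group mean shifted by one SD, so every rejection is correct
reject.h0.false <- replicate(reps,
  t.test(rnorm(n), rnorm(n, mean = 1), var.equal = TRUE)$p.value <= alpha)

type1.rate <- mean(reject.h0.true)    # should be near alpha
power.est  <- mean(reject.h0.false)   # estimates 1 - beta for this shift
print(c(type1.rate = type1.rate, power = power.est))
```

The estimated power can be compared with power.t.test(n = 10, delta = 1, sd = 1) from the stats package.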

Portland Cement
Summary Statistics (pg. 38)
If we want to test: $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$

Modified Mortar (“New recipe”)   Unmodified Mortar (“Original recipe”)
ȳ1 = 16.76                       ȳ2 = 17.04
S1² = 0.100                      S2² = 0.061
S1 = 0.316                       S2 = 0.248
n1 = 10                          n2 = 10
Portland Cement Example
If we want to test: $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$
We will consider three cases for this example:
Case 1: Assume σ1 and σ2 are known: let σ1 = σ2 = 0.30.
Case 2: Assume σ1 and σ2 are unknown, and σ1 = σ2.
Case 3: Assume σ1 and σ2 are unknown, and σ1 ≠ σ2.
Then Case 1 will give the two-sample Z-test; Case 2 will give the two-sample (pooled) t-test; and Case 3 will give the two-sample t-test with unequal variances.
Two-Sample Z-Test: if σ1 and σ2 are known
Case 1: Assume σ1 and σ2 are known: let σ1 = σ2 = 0.30.
Use the sample means to draw inferences about the population means.
Difference in sample means: $\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$
Standard deviation of the difference in sample means:
$$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}, \quad \text{and} \quad \sigma_{\bar{y}_1 - \bar{y}_2}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \quad \bar{y}_1 \text{ and } \bar{y}_2 \text{ independent}$$
(Really? See STT315.)
This suggests a statistic:
$$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
If the variances were known, we could use the normal distribution as the basis of a test.
Z0 has a N(0,1) distribution if the two population means are equal.
If we knew the two variances how would we use Z0
to test H0?
Case 1: Assume σ1 and σ2 are known: let σ1 = σ2 = 0.30.
Suppose that σ1 = σ2 = 0.30. Then we can calculate
$$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{-0.28}{\sqrt{\frac{0.3^2}{10} + \frac{0.3^2}{10}}} = \frac{-0.28}{0.1342} = -2.09$$
How “unusual” is the value Z0 = -2.09 if the two population means
are equal?
It turns out that 95% of the area under the standard normal curve
(probability) falls between the values −Z0.025 = −1.96 and Z0.025 = 1.96 (that is, the critical value is Z*=1.96).
So the value Z0 = -2.09 is pretty unusual in that it would happen less
than 5% of the time if the population means were equal.
Standard Normal Table (see appendix) for critical values
Critical value: Z*=qnorm(0.975)=1.959964, so Z0.025 = 1.96.
Inferences about the differences in means,
Randomized designs Case 1: Assume σ1 and σ2 are known:
let σ1 = σ2 = 0.30.
So if the variances were known, we would reject the null hypothesis
at the 5% level of significance,
$H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$,
and conclude that the alternative hypothesis is true.
This is called a fixed significance level test, because we compare the value
of the test statistic to a critical value (1.96) that we selected in advance
before running the experiment.
The standard normal distribution is the reference distribution for the test.
Another way to do this that is very popular is to use the P-value approach.
The P-value can be thought of as the observed significance level.
For the Z-test, it is easy to find the P-value.
Normal Table
Case 1: Assume σ1 and σ2 are known:
let σ1 = σ2 = 0.30.
Find the probability above |Z0| = 2.09
from the table.
This is 1 – 0.98169 = 0.01831.
The P-value is twice this probability, or
0.03662.
So we would reject the null hypothesis
at any level of significance that is greater
than or equal to 0.03662.
Typically 0.05 is used as the cutoff.
In R, we use
2*pnorm(-2.09) =0.0366178
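The whole Case 1 calculation, from summary statistics to P-value, might be written as follows. This is a sketch with our own variable names; it keeps Z0 unrounded, so the P-value differs slightly from the one based on -2.09.

```r
# Portland cement summary statistics, Case 1 (sigma1 = sigma2 = 0.30 assumed known)
ybar1 <- 16.76; ybar2 <- 17.04
sigma1 <- 0.30; sigma2 <- 0.30
n1 <- 10; n2 <- 10

se      <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)  # SD of the difference in means
z0      <- (ybar1 - ybar2) / se
p.value <- 2 * pnorm(-abs(z0))                  # two-sided P-value

print(c(se = se, z0 = z0, p.value = p.value))
print(abs(z0) > qnorm(0.975))                   # fixed-level test at alpha = 0.05
```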
Two-sample t –Test if σ1 and σ2 are unknown
The two-sample Z-test just described would work perfectly if we
knew the two population variances.
Since we usually don’t know the true population variances, what
would happen if we just plugged in the sample variances?
The answer is that if the sample sizes were large enough (say both
n > 30 or 40) the Z-test would work just fine. It is a good large-sample test for the difference in means.
But many times that isn’t possible (as Gosset wrote in 1908, “…but
what if the sample size is small…?”).
It turns out that if the sample size is small we can no longer use the
N(0,1) distribution as the reference distribution for the test.
How the Two-Sample t-Test Works:
Case 2: Assume σ1 and σ2 are unknown, and σ1 = σ2.
Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$.
The previous ratio becomes
$$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$.
Pool the individual sample variances:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}, \quad df = n_1 + n_2 - 2$$
$S_p^2$ is an estimate of the common variance.
The test statistic is
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
How the Two-Sample (pooled) t-Test Works:
Case 2: Assume σ1 and σ2 are unknown, and σ1 = σ2.
The test statistic is
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad df = n_1 + n_2 - 2$$
The denominator is called the standard error of the difference
in means, SE($\bar{y}_1 - \bar{y}_2$).
Values of t0 that are very different from zero are consistent with
the alternative hypothesis.
t0 is a “distance” measure: how far apart the averages are,
expressed in standard deviation units.
Notice the interpretation of t0 as a signal-to-noise ratio.
The Two-Sample (Pooled) t-Test
Case 2: Assume σ1 and σ2 are unknown, and σ1 = σ2.
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081, \quad S_p = 0.284$$
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284\sqrt{\frac{1}{10} + \frac{1}{10}}} = -2.20$$
The two sample means are a little over two standard deviations apart.
Is this a "large" difference?
In R: 2*pt(-2.20, 18) = 0.04110859.
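The pooled calculation can be reproduced from the summary statistics. A minimal sketch with our own variable names; intermediate values are kept unrounded, so the last digits may differ slightly from the hand calculation.

```r
# Portland cement summary statistics, Case 2 (equal but unknown variances)
ybar1 <- 16.76; ybar2 <- 17.04
s1.sq <- 0.100; s2.sq <- 0.061
n1 <- 10; n2 <- 10

df    <- n1 + n2 - 2
sp.sq <- ((n1 - 1) * s1.sq + (n2 - 1) * s2.sq) / df  # pooled variance
sp    <- sqrt(sp.sq)

se      <- sp * sqrt(1 / n1 + 1 / n2)   # SE of the difference in means
t0      <- (ybar1 - ybar2) / se
p.value <- 2 * pt(-abs(t0), df)         # two-sided P-value

print(c(sp.sq = sp.sq, sp = sp, t0 = t0, df = df, p.value = p.value))
```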
Two-Sample (Pooled) t-Test: a more general case
Case 2: Assume σ1 and σ2 are unknown, and σ1 = σ2.
If we would like to test a more general case, for example:
H0: µ1-µ2=10 v.s. Ha: µ1-µ2≠10
Then the test statistic will be (see page 43):
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2 - 10}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
William Sealy Gosset (1876–1937)
Gosset's interest in barley cultivation led him to speculate that design of
experiments should aim not only at improving the average yield, but also at
breeding varieties whose yield was insensitive (robust) to variation in soil and
climate.
Developed the t-test (1908).
Gosset was a friend of both Karl Pearson and R.A. Fisher, an achievement, for each
had a monumental ego and a loathing for the other.
Gosset was a modest man who cut short an admirer with the comment that “Fisher
would have discovered it all anyway.”
The Two-Sample (Pooled) t-Test



We need an objective basis for deciding how large the test statistic t0
really is.
In 1908, W. S. Gosset derived the reference distribution for t0, called
the t distribution.
Tables of the t distribution: see textbook appendix, page 614.
t0 = -2.20
Critical value: t*=qt(0.975, 18)=2.100922.
The Two-Sample (Pooled) t-Test




A value of t0 between –2.101 and 2.101 is consistent with equality of
means.
It is possible for the means to be equal and t0 to either
exceed 2.101 or fall below –2.101, but it would be a
“rare event”, which leads to the conclusion that the means
are different.
We could also use the P-value approach.
t0 = -2.20
Critical value: t*=qt(0.975, 18)=2.100922.
The Two-Sample (Pooled) t-Test
t0 = -2.20
Critical value: t*=qt(0.975, 18)=2.100922.
The P-value is the area (probability) in the tails of the t-distribution beyond -2.20 plus the
probability beyond +2.20 (it’s a two-sided test).
The P-value is a measure of how unusual the value of the test statistic is given that the null
hypothesis is true.
The P-value is the risk of wrongly rejecting the null hypothesis of equal means (it measures
the rareness of the event).
The exact P-value in our problem is P = 0.042 (found from a computer).
Approximating the P-value




Our t-table only gives probabilities greater than positive values
of t. So take the absolute value of t0 = -2.20, or |t0|= 2.20.
Now with 18 degrees of freedom, find the values of t in the
table that bracket this value.
These are 2.101 < |t0|= 2.20 < 2.552. The right-tail
probability for t = 2.101 is 0.025 and for t = 2.552 is 0.01.
Double these probabilities because this is a two-sided test.
Therefore the P-value must lie between these two probabilities, or
0.02 < P-value < 0.05
These are lower and upper bounds on the P-value.
We know that the actual P-value is 0.042.
Computer Two-Sample t-Test Results
Checking Assumptions –
Normal Probability Plot (called QQ-plot)
Importance of the t-Test
Provides an objective framework for simple comparative
experiments.
Could be used to test all relevant hypotheses in a two-level
factorial design, because all of these hypotheses involve the
mean response at one “side” of the cube versus the mean
response at the opposite “side” of the cube. (See pages 6-7.)
Confidence Intervals (See pg. 43)
Hypothesis testing gives an objective statement concerning
the difference in means, but it doesn’t specify “how
different” they are.
General form of a confidence interval:
$$L \le \theta \le U \quad \text{where} \quad P(L \le \theta \le U) = 1 - \alpha$$
The 100(1 − α)% confidence interval on the difference in
two means:
$$\bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\, n_1 + n_2 - 2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\, n_1 + n_2 - 2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Example, pages 43-44:
Critical value: t*=qt(0.975, 18)=2.100922.
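For the Portland cement data, the 95% confidence interval on µ1 − µ2 under the pooled (equal variance) assumption can be sketched from the summary statistics given earlier. Variable names are ours.

```r
# Portland cement summary statistics (pooled, equal-variance case)
ybar1 <- 16.76; ybar2 <- 17.04
s1.sq <- 0.100; s2.sq <- 0.061
n1 <- 10; n2 <- 10

df     <- n1 + n2 - 2
sp     <- sqrt(((n1 - 1) * s1.sq + (n2 - 1) * s2.sq) / df)  # pooled SD
se     <- sp * sqrt(1 / n1 + 1 / n2)
t.star <- qt(0.975, df)                                      # t_{0.025, 18}

ci <- (ybar1 - ybar2) + c(-1, 1) * t.star * se
print(ci)
```

The interval excludes zero, which agrees with rejecting H0: µ1 = µ2 at the 5% level in the pooled t-test.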
The two sample t-test (with unequal variance)
Case 3: Assume σ1 and σ2 are unknown, and σ1 ≠ σ2.
The degrees of freedom v associated with this variance estimate is
approximated using the Welch–Satterthwaite equation.
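The slide's formula image did not survive, so as an illustration here is the standard Welch–Satterthwaite approximation, $v = \frac{(S_1^2/n_1 + S_2^2/n_2)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$, applied to the Portland cement summary statistics. The function welch_from_summary is a hypothetical helper of ours, not part of base R.

```r
# Welch two-sample t-test from summary statistics (unequal variances)
welch_from_summary <- function(ybar1, ybar2, s1.sq, s2.sq, n1, n2) {
  v1 <- s1.sq / n1                       # estimated variance of ybar1
  v2 <- s2.sq / n2                       # estimated variance of ybar2
  t0 <- (ybar1 - ybar2) / sqrt(v1 + v2)
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))  # Welch-Satterthwaite
  c(t0 = t0, df = df, p.value = 2 * pt(-abs(t0), df))
}

res <- welch_from_summary(16.76, 17.04, 0.100, 0.061, 10, 10)
print(res)
```

With equal sample sizes the statistic equals the pooled one; only the degrees of freedom shrink (here from 18 to about 17).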
What if the Two Variances are Different?
Example 2.1, page 48:
Case 3: Assume σ1 and σ2 are unknown, and σ1 ≠ σ2.
P-value = pt(-2.7354, 16.1955) =0.007274408.
Two sample t-test in R: Table 2.1
Two-sample (pooled) t-test, and two-sample t-test.
## For H0: mu1=mu2, v.s. Ha: mu1≠mu2
## OR: H0: mu_diff = 0, v.s. Ha: mu_diff ≠ 0
## Two sample t-test in R with t.test
data.tab2.1 <- read.table("http://people.uncw.edu/chenc/STT411/dataset%20backup/Tension-Bond.TXT",
                          header = TRUE)
x <- data.tab2.1[, 1]
y <- data.tab2.1[, 2]
## Pooled t-test (Case 2)
t.test(x, y, alternative = "two.sided", mu = 0, var.equal = TRUE)
## Welch t-test (Case 3)
t.test(x, y, alternative = "two.sided", mu = 0, var.equal = FALSE)
## var.equal: if TRUE, the pooled variance is used to estimate the variance;
## otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
Check the normality assumption
########## check the normality assumption ######
qqnorm(x); qqline(x);
#### what does the qq plot do? ######
par(mfrow=c(2,2))
aa<-rnorm(100)
qqnorm(aa); qqline(aa);
bb<-rnorm(100,10,2)
qqnorm(bb); qqline(bb);
cc<-rexp(100)
qqnorm(cc); qqline(cc); dev.off(); ## to close the plotting window
Other Chapter Topics
Hypothesis testing when the variances are known.
One-sample inference (t and Z tests), by comparing to a specific
value.
Hypothesis tests on variances (F tests).
Paired experiments: this is an example of blocking (Chapter 4).
How do we know whether the variance of two
samples are different?
The test statistic is the ratio of the two sample variances, F0 = S1²/S2². Under H0 (equal variances), F0 follows an F(n1-1, n2-1) distribution.
We call n1-1 the numerator df, and n2-1 the denominator df.
For a two-sided F-test: p-value=2*(one-tail area). But how do we decide which side to double?
First, find the center (median) by qf(0.5, n1-1, n2-1).
If the test statistic F0 is on the RIGHT side of the center, then double the RIGHT side:
P-value= 2*(1-pf(F0, n1-1, n2-1)).
Otherwise, double the LEFT side: P-value= 2*pf(F0, n1-1, n2-1).
Example 2.3, page 58:
In R: One-sided test with f0=14.5/10.8; 1-pf(f0,11,9) = 0.3344771.
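The "which side to double" rule for the two-sided F-test can be sketched as a small function. The helper f_test_two_sided is hypothetical (our own name), and the sample sizes n1 = 12 and n2 = 10 are inferred from the degrees of freedom 11 and 9 in the call above.

```r
# Two-sided F-test for equal variances from summary statistics
f_test_two_sided <- function(s1.sq, s2.sq, n1, n2) {
  f0     <- s1.sq / s2.sq
  df1    <- n1 - 1                     # numerator df
  df2    <- n2 - 1                     # denominator df
  center <- qf(0.5, df1, df2)          # median of F(df1, df2)
  if (f0 > center) {
    p.value <- 2 * (1 - pf(f0, df1, df2))  # double the right tail
  } else {
    p.value <- 2 * pf(f0, df1, df2)        # double the left tail
  }
  c(f0 = f0, p.value = p.value)
}

# One-sided version, as on the slide
f0        <- 14.5 / 10.8
one.sided <- 1 - pf(f0, 11, 9)

# Two-sided version using the doubling rule
res <- f_test_two_sided(14.5, 10.8, 12, 10)
print(one.sided)
print(res)
```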
Portland Cement Formulation: Check the equal
variance assumption
data.tab2.1 <- read.table("http://people.uncw.edu/chenc/STT411/dataset%20backup/Tension-Bond.TXT",
                          header = TRUE)
x <- data.tab2.1[, 1]
y <- data.tab2.1[, 2]
## Check the equal variance assumption
var.x <- var(x)
var.y <- var(y)
n <- length(x)   # both samples have the same size here
## Right tail doubled; valid when var.x/var.y is above the median of F(n-1, n-1)
2 * (1 - pf(var.x / var.y, n - 1, n - 1))
## or
var.test(x, y)
Summary of Ch2
Sampling distribution:
$\frac{\bar{y} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
$\frac{\bar{y} - \mu}{s/\sqrt{n}} \sim t(df = n - 1)$
Steps for hypothesis testing
One-sample Z-test
CI for population mean when population SD is given
One-sample t-test
CI for population mean when sample SD is given
Two-sample Z-test
Two-sample t-test with equal variance assumption
CI for the difference of two population means with equal
variance assumption
Two-sample t-test without equal variance assumption
CI for the difference of two population means without equal
variance assumption
Check normality assumption
Check equal variance assumption with F test
Summary about R
pnorm;
qnorm;
pt;
qt;
qqnorm;
qqline;
t.test;
var.test;
pf;