ENSC450/650
Environmental and Geophysical data analysis --- Chapter 2
Chapter 2 Statistical Significance Test and Analysis of Variance
2.1 Statistical test for means
In statistical analysis, we use samples to infer the features of a population. A population
is defined as “every possible object (or entity) from which the sample is selected”.
Another way of looking at it is as “the complete set of all possible measurements that
might hypothetically be recorded” for the study.
An inference drawn from a sample may misrepresent the features of the population. A
statistical test is therefore required to examine whether the inference correctly captures
the features of the population. For example, suppose we wish to examine whether there is a
significant difference between male and female students in learning math. To answer this
question, we take 2000 samples of a math test, from 1000 male students and 1000 female
students. A straightforward step is to compare the mean test scores of the male and female
students. Suppose the mean score is 85% for males and 82% for females. The question
is what can be inferred from the difference between 85% and 82%. Here we need to
focus on the population, i.e., all female and male students, rather than on the
samples. Thus, there are two issues: 1) is the sample average equal to the population
mean? 2) does the small difference of 3% between the samples reflect a real difference
between the populations?
Denote the sample mean by $\bar{x}$ and the population mean by $\mu$; we need to test
whether $\bar{x} = \mu$. The inference depends on the sample size and the sample
variability. The larger the sample size, the more reliable the inference. Variation within
the sample also affects reliability: greater sample variation means lower reliability.
Thus, we can express the unreliability in terms of the sample size $n$ and the sample
variation, quantified by the sample standard deviation $s$, namely,
$$\text{unreliability} \propto \frac{s}{\sqrt{n}}$$
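As a rough illustration of how the unreliability $s/\sqrt{n}$ shrinks as the sample grows, the sketch below computes it for a small sample and for the same values repeated four times. The function name `standard_error` and the scores are hypothetical and serve only as an example.

```python
import math
import statistics


def standard_error(sample):
    """Return s / sqrt(n), the unreliability of the sample mean."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return s / math.sqrt(n)


# Hypothetical test scores, for illustration only
scores = [82.0, 85.0, 88.0, 79.0, 91.0]

se_small = standard_error(scores)      # n = 5
se_large = standard_error(scores * 4)  # same values repeated, n = 20

print(f"n = 5:  unreliability = {se_small:.3f}")
print(f"n = 20: unreliability = {se_large:.3f}")
```

With similar spread in the data, the larger sample gives the smaller unreliability, as the text argues.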
Typically, a statistical test, such as the above test of $\bar{x} = \mu$, has 6 steps.
Such a test is also called a hypothesis test: a formal process of asking
whether a logical statement, called the null hypothesis, should be rejected in favor of an
opposite statement, the alternative hypothesis.
1) Define the null hypothesis, $H_0$
2) Define the alternative hypothesis, $H_1$ (the logical opposite of $H_0$)
3) Specify an alpha value, $\alpha$, which is the maximum probability we are willing to
accept of committing a Type 1 error;
4) Calculate the test statistic
5) Compare the test statistic with a critical value (from a table)
6) Reject the null hypothesis if the test statistic is of greater magnitude than the
critical value
In step 4), a statistical distribution must be specified for the chosen test
statistic. For example, for the above test of the mean, the one-sample t test is used,
i.e.,
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \qquad (1)$$
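The six steps can be sketched in a few lines of Python for the one-sample t test. This is a minimal sketch under stated assumptions: the data, the hypothesized mean `mu0`, and the helper names `one_sample_t` and `reject_null` are hypothetical, and the critical value is typed in by hand from a t table (two-sided, $\alpha = 0.05$, 4 degrees of freedom) rather than computed.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Step 4: t = (x_bar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation
    return (x_bar - mu0) / (s / math.sqrt(n))


def reject_null(t_stat, t_critical):
    """Steps 5 and 6: reject H0 if |t| exceeds the critical value."""
    return abs(t_stat) > t_critical


# Steps 1 and 2: H0: mu = mu0, H1: mu != mu0 (hypothetical values)
sample = [82.0, 85.0, 88.0, 79.0, 91.0]
mu0 = 80.0

# Step 3: alpha = 0.05; critical value read from a t table for df = n - 1 = 4
t_critical = 2.776

t_stat = one_sample_t(sample, mu0)
decision = reject_null(t_stat, t_critical)
print(f"t = {t_stat:.3f}, reject H0: {decision}")
```

Keeping the critical value as an explicit input mirrors the table lookup in step 5 and avoids any dependency beyond the standard library.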
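Several commonly used test statistics are listed below.
(1) H0: the two mean values from group A and group B are equal, $\bar{x}_1 = \bar{x}_2$
(if the variances are not equal)
test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \qquad (2)$$
The statistic follows a t-distribution whose degrees of freedom are given by the
Welch-Satterthwaite approximation
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
where $n_1$, $n_2$ are the sample sizes of group A and group B, respectively, and $s_1^2$
and $s_2^2$ are the sample variances of group A and group B, respectively.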
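A minimal sketch of the unequal-variance test, working from summary statistics. The degrees of freedom are computed with the Welch-Satterthwaite approximation, the usual choice for this test. The function name `welch_t` and the standard deviations of 10 and 12 points are assumptions made for illustration; only the means (85 and 82) and the group sizes (1000 each) echo the math-test example.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for the two-sample t test with unequal variances."""
    v1 = var1 / n1  # squared standard error of group 1
    v2 = var2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical spread: standard deviations of 10 and 12 points
t_stat, df = welch_t(85.0, 10.0 ** 2, 1000, 82.0, 12.0 ** 2, 1000)
print(f"t = {t_stat:.3f}, df = {df:.1f}")

# With equal variances and equal sizes the formula reduces to n1 + n2 - 2
_, df_equal = welch_t(0.0, 1.0, 6, 0.0, 1.0, 6)
```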
(2) H0: the two mean values from group A and group B are equal, $\bar{x}_1 = \bar{x}_2$
(if the variances are equal)
test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad s_p^2 = \frac{\sum_i (x_{1,i} - \bar{x}_1)^2 + \sum_i (x_{2,i} - \bar{x}_2)^2}{(n_1 - 1) + (n_2 - 1)} \qquad (3)$$
The statistic follows a t-distribution with $n_1 + n_2 - 2$ degrees of freedom, where
$n_1$, $n_2$ are the sample sizes of group A and group B, respectively, and $s_p$ is the
pooled standard deviation, which combines the sample variances $s_1^2$ and $s_2^2$ of
group A and group B.
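The equal-variance (pooled) test can be sketched directly from raw data as follows. The function name `pooled_t` and the two small samples are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Return (t, df) for the two-sample t test assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)  # sum of squares within group 1
    ss2 = sum((x - m2) ** 2 for x in sample2)  # sum of squares within group 2
    sp = math.sqrt((ss1 + ss2) / ((n1 - 1) + (n2 - 1)))  # pooled standard deviation
    t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical measurements, for illustration only
group_a = [5.1, 4.9, 5.3, 5.0, 5.2]
group_b = [4.8, 4.7, 5.0, 4.6, 4.9]

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting statistic would then be compared with the tabulated t value for `df` degrees of freedom, as in steps 5 and 6.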
(3) H0: the two variances from group A and group B are equal, $\sigma_1 = \sigma_2$
Test statistic
$$F = \frac{s_2^2}{s_1^2} \qquad (4)$$
The statistic follows an F-distribution with $(n_2 - 1, n_1 - 1)$ degrees of freedom; the
notation is the same as above.
Example 1: Mean value Test
Let A1 denote a set obtained by drawing a random sample of six measurements:
and let A2 denote a second set obtained similarly:
We will carry out tests of the null hypothesis that the means of the populations from which
the two samples were taken are equal.
The difference between the two sample means, each denoted by $\bar{x}_i$, which appears in the
numerator for all the two-sample testing approaches discussed above, is
The sample standard deviations for the two samples are approximately 0.05 and 0.11,
respectively. For such small samples, a test of equality between the two population
variances would not be very powerful. Since the sample sizes are equal, the two
forms of the two sample t-test will perform similarly in this example.
unequal variances
If the approach for unequal variances (discussed above) is followed, the results are
The test statistic is approximately 1.959. From the Student's t table, the critical
value at $\alpha = 0.05$ is 2.365 (df = 7, probability = 0.025 in each tail).
The test statistic is of smaller magnitude than the critical value, so we cannot reject
$H_0$; i.e., there is no significant difference between the means of group A and group B.
Example 2: Statistical test for Variance
In a packing plant, a machine packs cartons with jars. It is supposed that a new machine will
pack faster on average than the machine currently used. To test that hypothesis, the times
it takes each machine to pack ten cartons are recorded. The results, in seconds, are shown in
the following table.
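$$F = \frac{s_2^2}{s_1^2} \sim F(n_2 - 1, n_1 - 1)$$
The test statistic is F = 0.5623/0.4617 = 1.22
The critical value is F = 3.18 ($\alpha = 0.05$). Since 1.22 < 3.18, the null hypothesis of
equal variances is not rejected.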
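The arithmetic of this variance test can be checked with a short sketch. The two variances, the sample size of ten per machine, and the critical value 3.18 are taken from the example; the function name `variance_ratio` is hypothetical.

```python
def variance_ratio(var1, n1, var2, n2):
    """Return F = var2 / var1 and its degrees of freedom (n2 - 1, n1 - 1)."""
    return var2 / var1, (n2 - 1, n1 - 1)


# Values from Example 2: ten cartons timed on each machine
f_stat, dof = variance_ratio(0.4617, 10, 0.5623, 10)

f_critical = 3.18  # read from an F table, alpha = 0.05
reject = f_stat > f_critical

print(f"F = {f_stat:.2f}, df = {dof}, reject H0: {reject}")
```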
2.2 One-way ANOVA (Analysis of Variance)
ANOVA is a flexible data analytic technique that allows us to test hypotheses
about means when we have two or more independent variables in the design.
Thus, ANOVA tests the hypothesis below.
Case 1: Completely randomized experiment
H0: the mean values of multiple groups are equal, i.e.,
$\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$ (if the variances are equal)
The first situation is a completely randomized design. For example, we sample five female
and five male high school seniors and obtain mean SAT scores of 550 and 590,
respectively. Are the sample means of 550 (female) and 590 (male) really different? Or can
we conclude that males score 40 points higher, on average, than females? To answer the
question, we must consider the amount of sampling variability among the students. If the
scores are distributed as in the upper row of the dot plot, the difference between the
means is small relative to the sampling variability of the scores within each group. We
would be inclined not to reject the null hypothesis of equal population means in this case.
In contrast, if the data are as depicted in the bottom row of the dot plot, the sampling
variability is small relative to the difference between the two means. In this case, we
would be inclined to reject the null hypothesis.
So, the key point is to compare the difference between the two means against the
variability, as discussed for the t-test. There is another way to make this comparison,
which leads to another test statistic that follows the F distribution,
$$F = \frac{MST}{MSE} \sim F(k-1, n-k) \qquad (5)$$
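where MST is the mean square for treatments and MSE is the mean square for errors:
$$MST = \frac{SST}{k-1} = \frac{\sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2}{k-1} \qquad (6)$$
$$MSE = \frac{SSE}{n-k} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{i,j} - \bar{x}_i)^2}{n-k} \qquad (7)$$
Here $k$ is the number of groups, $n_i$ is the size of group $i$, $n$ is the total sample
size, $\bar{x}_i$ is the mean of group $i$, and $\bar{x}$ is the overall mean.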
Equivalently, $SSE = \sum_{i=1}^{k} (n_i - 1) s_i^2$,
where $s_i$ is the standard deviation of group $i$.
Returning to the above case, if the variances (squared standard deviations) of the male
($s_1^2$) and female ($s_2^2$) scores are both 2250, we can calculate MSE and MST:
$SST = \sum_{i=1}^{2} n_i (\bar{x}_i - \bar{x})^2 = 5(550-570)^2 + 5(590-570)^2 = 4{,}000$;
$SSE = 4 \times 2250 + 4 \times 2250 = 18{,}000$
$MST = SST/(2-1) = 4{,}000$; $MSE = 18{,}000/(10-2) = 2250$
Thus, $F = MST/MSE = 4000/2250 = 1.78$
From the F-table, given $\alpha = 0.05$, numerator degrees of freedom = (2-1) = 1 and
denominator degrees of freedom = (10-2) = 8, $F_{table} = 5.32$.
The F value is smaller than $F_{table}$, so we cannot reject H0, i.e., there is no
significant difference between the male score of 590 and the female score of 550.
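The calculation above can be reproduced from the group summary statistics alone. The sketch below is one way to do it; the function name `one_way_anova_from_summary` is hypothetical, and the critical value 5.32 is the table value quoted in the text.

```python
def one_way_anova_from_summary(means, variances, sizes):
    """Return (F, (df1, df2)) for one-way ANOVA given group means, variances, sizes."""
    k = len(means)
    n = sum(sizes)
    grand_mean = sum(ni * m for ni, m in zip(sizes, means)) / n
    sst = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(sizes, means))
    sse = sum((ni - 1) * v for ni, v in zip(sizes, variances))
    mst = sst / (k - 1)  # mean square for treatments
    mse = sse / (n - k)  # mean square for errors
    return mst / mse, (k - 1, n - k)


# SAT example: female mean 550, male mean 590, both variances 2250, five students each
f_stat, dof = one_way_anova_from_summary([550, 590], [2250, 2250], [5, 5])

f_table = 5.32  # F table, alpha = 0.05, df = (1, 8)
print(f"F = {f_stat:.2f}, df = {dof}, reject H0: {f_stat > f_table}")
```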
Case 2: Randomized block experiment
To alleviate the impact of within-group variability, we design the experiment in blocks,
within each of which the experimental units are as similar as possible. For example, if we
wish to compare the SAT scores of female and male high school seniors, we could select
independent random samples of five females and five males, and analyze the results of the
completely randomized design as discussed above. Or, we could select matched pairs of
females and males according to their scholastic records, for example, GPA, as shown
below. Five such pairs (blocks) are depicted here. For such a randomized block
experiment, the test statistic has the same form as above, i.e.,
$$F = \frac{MST}{MSE} \sim F(k-1, n-b-k+1)$$
where $b$ is the number of blocks.
SST is calculated as before, i.e.,
$$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2 = 5(606-600)^2 + 5(594-600)^2 = 360;$$
SSE cannot be calculated directly in this case. Instead, we calculate the sum of squares
for blocks (SSB):
$$SSB = \sum_{i=1}^{b} n_{b_i} (\bar{x}_{b_i} - \bar{x})^2 = 2(535-600)^2 + 2(560-600)^2 + 2(585-600)^2 + 2(630-600)^2 + 2(690-600)^2 = 30{,}100$$
where $n_{b_i}$ is the sample size in block $i$ and $\bar{x}_{b_i}$ is the mean of block $i$.
The total variation is
$$SS(total) = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{i,j} - \bar{x})^2 = (540-600)^2 + (570-600)^2 + (590-600)^2 + \dots + (530-600)^2 + (550-600)^2 + \dots + (690-600)^2 = 30{,}600$$
Thus, SSE = SS(total) - SSB - SST = 30600 - 30100 - 360 = 140
F = MST/MSE = [360/(2-1)] / [140/(10-5-2+1)] = 10.29
From the F-table, given $\alpha = 0.05$, numerator degrees of freedom = (2-1) = 1 and
denominator degrees of freedom = (n-b-k+1) = (10-5-2+1) = 4, $F_{table} = 7.71$. The F value
is greater than $F_{table}$, so we reject H0.
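The randomized block calculation can be retraced with the sketch below. It assumes, as in this example, that every treatment appears exactly once in every block, so each treatment mean is over $b$ observations and each block mean is over $k$ observations. The function name `block_anova` is hypothetical; the means and SS(total) = 30,600 are the values quoted above.

```python
def block_anova(treatment_means, block_means, grand_mean, ss_total):
    """Return (SST, SSB, SSE, F) for a randomized block design.

    Assumes one observation per treatment in each block.
    """
    k = len(treatment_means)  # number of treatments
    b = len(block_means)      # number of blocks
    n = k * b                 # total number of observations
    sst = sum(b * (m - grand_mean) ** 2 for m in treatment_means)
    ssb = sum(k * (m - grand_mean) ** 2 for m in block_means)
    sse = ss_total - sst - ssb  # what remains after treatments and blocks
    mst = sst / (k - 1)
    mse = sse / (n - b - k + 1)
    return sst, ssb, sse, mst / mse


# Values from the SAT example in the text
sst, ssb, sse, f_stat = block_anova(
    treatment_means=[606, 594],
    block_means=[535, 560, 585, 630, 690],
    grand_mean=600,
    ss_total=30600,
)

f_table = 7.71  # F table, alpha = 0.05, df = (1, 4)
print(f"SST = {sst}, SSB = {ssb}, SSE = {sse}, F = {f_stat:.2f}")
print(f"reject H0: {f_stat > f_table}")
```

Removing the block sum of squares shrinks the error term, which is why the same 12-point difference becomes significant here.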
The difference between the completely randomized experiment and the randomized block
experiment can be summarized in the graph below
2.3 Two-way ANOVA (Analysis of Variance)
In section 2.2, we investigated the differences among k levels (or treatments) of a single
factor. Here we study the response to changes in two factors: factor A, observed at factor
levels 1, 2, …, a; and factor B, observed at factor levels 1, 2, …, b. For example, an
engineer in a textile mill may be interested in the effect of temperature and cycle time on
the brightness of a fabric in a process involving dye.
Suppose that we have conducted exactly n experiments at every possible factor-level
combination. Suppose also that (i,j) represents the combination of the ith level of factor A
with the jth level of factor B, where i=1,2,…,a and j=1,2,…,b. Moreover, let us denote the
kth observation in the (i,j) factor-level combination, or cell, by $Y_{ijk}$, k=1,2,…,n.
Thus, we have a total of a*b*n observations that can be arranged in the table below.
From the above means, we can also construct the table below
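Define the overall mean of the ab factor-level combinations in the above table as
$$\mu_{..} = \frac{1}{ab} \sum_{i=1}^{a} \sum_{j=1}^{b} \mu_{i,j}$$
the row (factor A) means as
$$\mu_{i.} = \frac{1}{b} \sum_{j=1}^{b} \mu_{i,j}, \quad i = 1, 2, \dots, a$$
and the column (factor B) means as
$$\mu_{.j} = \frac{1}{a} \sum_{i=1}^{a} \mu_{i,j}, \quad j = 1, 2, \dots, b$$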
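To make the three definitions concrete, the sketch below computes the overall, row, and column means from a small table of cell means. The function name `marginal_means` and the 2 x 3 table are hypothetical; rows stand for the levels of factor A and columns for the levels of factor B.

```python
def marginal_means(cell_means):
    """Return (overall mean, row means, column means) of an a x b table of cell means."""
    a = len(cell_means)     # number of levels of factor A (rows)
    b = len(cell_means[0])  # number of levels of factor B (columns)
    row_means = [sum(row) / b for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    overall = sum(sum(row) for row in cell_means) / (a * b)
    return overall, row_means, col_means


# Hypothetical cell means: a = 2 levels of factor A, b = 3 levels of factor B
cells = [
    [10.0, 12.0, 14.0],
    [16.0, 18.0, 20.0],
]

overall, row_means, col_means = marginal_means(cells)
print("overall mean:", overall)
print("row (factor A) means:", row_means)
print("column (factor B) means:", col_means)
```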
To calculate the test statistics, we need the following quantities:
$$SST = \sum_{i=1}^{a} b(\mu_{i.} - \mu_{..})^2 + \sum_{j=1}^{b} a(\mu_{.j} - \mu_{..})^2$$
$$SST = SS_A + SS_B + SS_{AB} \qquad (8)$$
Test for treatment means
H0: No difference among the ab treatment means
H1: At least two treatment means differ
Test statistic: $F = \dfrac{SST/(ab-1)}{SS_{error}/(n-ab)} \sim F(ab-1, n-ab)$ ($n = abk$)
Test for factor interaction
H0: Factors A and B do not interact to affect the response mean
H1: Factors A and B do interact to affect the response mean
The test statistic is $F_{AB}$ (see the table below)
Test for main effect of factor A
H0: No difference among the a mean levels of factor A
H1: At least two factor A mean levels differ
The test statistic is $F_A$ (see the table below)
Test for main effect of factor B
H0: No difference among the b mean levels of factor B
H1: At least two factor B mean levels differ
The test statistic is $F_B$ (see the table below)
Example
With $F_{0.01;2,12} = 6.93$, there is very strong evidence that these differences cannot be
explained as chance results.