Chapter 3

Comparison of groups
The purpose of analysis is to compare
two or more population means by
analyzing sample means and variances.
One-way analysis of variance is used with data
categorized with one treatment (or
factor), which is a characteristic that
allows us to distinguish the different
populations from one another.
Example:
A headline in USA Today proclaimed that “Men,
women are equal talkers.” That headline
referred to a study of the numbers of words
that samples of men and women spoke in a
day. Given below are the results from the
study. Does there appear to be a difference?
Example:
Weights of college students in September and
April of their freshman year were measured.
The following table lists a small portion of
those sample values. (Here we use only a
small portion of the available data so that we
can better illustrate the problem.) Can you
claim some change in weight from September
to April?
Independence assumption
Two or more samples are independent if the
sample values selected from one group are
not related to or somehow paired or
matched with the sample values from the
other groups.
Two groups can be dependent if the sample
values are paired. (That is, each pair of
sample values consists of two
measurements from the same subject (such
as before/after data), or each pair of sample
values consists of matched pairs (such as
husband/wife data), where the matching is
based on some inherent relationship.)
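The paired case can be sketched in a few lines. The September and April values below are hypothetical placeholders, not the study data, and the helper name paired_differences is an assumption made for illustration. The point is that paired samples are analyzed through the per-subject differences, which only makes sense when each pair belongs to the same subject.

```python
def paired_differences(before, after):
    """Return after - before for each subject; both lists must be in the same subject order."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same number of values")
    return [a - b for b, a in zip(before, after)]


# Hypothetical weights (kg) of four students, listed in the same order both times.
september = [60, 72, 55, 68]
april = [62, 71, 58, 70]

differences = paired_differences(september, april)
mean_difference = sum(differences) / len(differences)
print(differences, mean_difference)
```

With independent samples there is no such subject-by-subject pairing, so the differences above would be meaningless.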
Identifying Means That Are Different
Informal methods for comparing means
1. Use the same scale for constructing
boxplots of the data sets to see if one or
more of the data sets are very different from
the others.
2. Calculate the mean for each group, then
compare those means to see if one or more
of them are significantly different from the
others.
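As a rough sketch of these informal checks, the numbers a boxplot is drawn from (the five-number summary) and the mean can be computed for each group and compared side by side. The three samples are hypothetical, and summarize is an illustrative helper name, not something from the source.

```python
import statistics


def summarize(values):
    """Return the five-number summary and the mean of one sample."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
    }


# Hypothetical samples from three groups.
samples = {"A": [4, 5, 6], "B": [6, 7, 8], "C": [8, 9, 10]}

summaries = {name: summarize(values) for name, values in samples.items()}
for name, summary in summaries.items():
    print(name, summary)
```

These summaries only suggest where differences might lie; they are not a formal test.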
Analysis of Variance
Fundamental Concepts
Estimate the common value of $\sigma^2$:
1. The variance between groups (also called
variation due to treatment) is an estimate of the
common population variance $\sigma^2$ that is based
on the variability among the sample means.
2. The variance within groups (also called
variation due to error) is an estimate of the
common population variance $\sigma^2$ based on the
sample variances.
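A minimal sketch of the two estimates, assuming every sample has the same size n (an assumption added here, not stated in the slides). In that case the between-groups estimate can be taken as n times the variance of the sample means, and the within-groups estimate as the average of the sample variances. The data are hypothetical.

```python
import statistics


def variance_estimates(samples):
    """Return (between, within) estimates of the common variance for equal-size samples."""
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes all samples have the same size")
    sample_means = [statistics.mean(s) for s in samples]
    sample_variances = [statistics.variance(s) for s in samples]
    between = n * statistics.variance(sample_means)  # based on the sample means
    within = statistics.mean(sample_variances)       # based on the sample variances
    return between, within


# Hypothetical samples from three groups.
samples = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
between, within = variance_estimates(samples)
print(between, within)
```

Here the between-groups estimate (12) is much larger than the within-groups estimate (1), which is the pattern expected when the population means are not all equal.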
Analysis of Variance
Requirements
1. The populations have approximately normal
distributions.
2. The populations have the same variance $\sigma^2$
(or standard deviation $\sigma$).
3. The samples are independent of each other.
4. The different samples are from populations
that are categorized in only one way.
Key Components of
Analysis of Variance
SS(total), or total sum of squares, is a
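measure of the total variation (around the grand
mean $\bar{\bar{x}}$, the mean of all sample values
combined) in all the sample data combined.
$$\mathrm{SS(total)} = \sum \left(x - \bar{\bar{x}}\right)^2$$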
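The total sum of squares can be sketched as follows, pooling every value and measuring its squared distance from the grand mean. The samples are hypothetical and ss_total is an illustrative function name.

```python
def ss_total(samples):
    """Sum of squared deviations of every value from the grand mean."""
    all_values = [x for sample in samples for x in sample]
    grand_mean = sum(all_values) / len(all_values)
    return sum((x - grand_mean) ** 2 for x in all_values)


# Hypothetical samples from three groups (grand mean is 7).
samples = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
total = ss_total(samples)
print(total)  # 30.0
```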
SS(treatment), also referred to as Sum of
Squares between groups, is a measure of
the variation between the sample means.
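$$\mathrm{SS(treatment)} = n_1\left(\bar{x}_1 - \bar{\bar{x}}\right)^2 + n_2\left(\bar{x}_2 - \bar{\bar{x}}\right)^2 + \cdots + n_k\left(\bar{x}_k - \bar{\bar{x}}\right)^2 = \sum n_i\left(\bar{x}_i - \bar{\bar{x}}\right)^2$$
Here $n_i$ and $\bar{x}_i$ are the size and mean of sample $i$, and $\bar{\bar{x}}$ is the mean of all sample values combined.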
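A short sketch of the treatment sum of squares: each sample mean is compared with the grand mean, and the squared difference is weighted by that sample's size. The data are hypothetical and ss_treatment is an illustrative name.

```python
def ss_treatment(samples):
    """Weighted sum of squared deviations of each sample mean from the grand mean."""
    n_total = sum(len(sample) for sample in samples)
    grand_mean = sum(sum(sample) for sample in samples) / n_total
    return sum(
        len(sample) * (sum(sample) / len(sample) - grand_mean) ** 2
        for sample in samples
    )


# Hypothetical samples with means 5, 7, 9 and grand mean 7.
samples = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
treatment = ss_treatment(samples)
print(treatment)  # 24.0
```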
SS(error), also referred to as Sum of
Squares within groups, is a sum of squares
representing the variability that is assumed
to be common to all the populations being
considered.
$$\mathrm{SS(error)} = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 = \sum (n_i - 1)s_i^2$$
Here $n_i$ and $s_i^2$ are the size and sample variance of sample $i$.
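The error sum of squares can be sketched by weighting each sample variance by its degrees of freedom. The samples are hypothetical, ss_error is an illustrative name, and statistics.variance is the standard-library sample variance (divisor n - 1).

```python
import statistics


def ss_error(samples):
    """Sum of (n_i - 1) * s_i^2 over all samples."""
    return sum((len(sample) - 1) * statistics.variance(sample) for sample in samples)


# Hypothetical samples; each has sample variance 1 and size 3.
samples = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
error = ss_error(samples)
print(error)
```

For these values each term is 2 * 1, so the result is 6.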
Given the previous expressions for
SS(total), SS(treatment), and SS(error),
the following relationship will always
hold.
SS(total) = SS(treatment) + SS(error)
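The relationship can be checked numerically. The sketch below computes all three sums of squares for hypothetical samples of unequal size and confirms that the total equals the sum of the other two, up to floating-point rounding. The function name sums_of_squares is an assumption for illustration.

```python
import math
import statistics


def sums_of_squares(samples):
    """Return (SS(total), SS(treatment), SS(error)) for a list of samples."""
    all_values = [x for sample in samples for x in sample]
    grand_mean = statistics.mean(all_values)
    total = sum((x - grand_mean) ** 2 for x in all_values)
    treatment = sum(
        len(sample) * (statistics.mean(sample) - grand_mean) ** 2
        for sample in samples
    )
    error = sum((len(sample) - 1) * statistics.variance(sample) for sample in samples)
    return total, treatment, error


# Hypothetical samples of sizes 3, 4, and 3.
samples = [[4, 5, 6], [6, 7, 8, 9], [8, 9, 10]]
total, treatment, error = sums_of_squares(samples)
print(total, treatment, error)
print(math.isclose(total, treatment + error))  # True
```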
Mean Squares (MS)
MS(treatment) is a mean square for
treatment, obtained as follows:
$$\mathrm{MS(treatment)} = \frac{\mathrm{SS(treatment)}}{k - 1}$$
MS(error) is a mean square for error,
obtained as follows:
$$\mathrm{MS(error)} = \frac{\mathrm{SS(error)}}{N - k}$$
k = number of samples (groups)
N = total number of values in all samples combined
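As a small worked sketch, the two mean squares follow directly from the sums of squares and the counts k and N. The input values (SS(treatment) = 24, SS(error) = 6, k = 3, N = 9) are hypothetical, and mean_squares is an illustrative helper name.

```python
def mean_squares(ss_treatment, ss_error, k, n_total):
    """Return (MS(treatment), MS(error)) given the sums of squares, k groups, and N values."""
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more values than groups")
    ms_treatment = ss_treatment / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_treatment, ms_error


# Hypothetical sums of squares from k = 3 samples with N = 9 values in total.
ms_treatment, ms_error = mean_squares(24.0, 6.0, k=3, n_total=9)
print(ms_treatment, ms_error)  # 12.0 1.0
```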
Identifying Means That Are Different
$$F = \frac{\mathrm{MS(treatment)}}{\mathrm{MS(error)}}$$
After checking the significance of the
F ratio of mean squares (MS), we
might conclude that the population
means are not all equal. But the F test
cannot show which particular
mean is different from the others.
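Putting the pieces together, the F ratio can be sketched from raw samples as shown here. The data are hypothetical and f_statistic is an illustrative name. The sketch stops at the test statistic: judging its significance requires the F distribution with k - 1 and N - k degrees of freedom, which is not computed here.

```python
import statistics


def f_statistic(samples):
    """Return F = MS(treatment) / MS(error) for a list of samples."""
    k = len(samples)
    n_total = sum(len(sample) for sample in samples)
    all_values = [x for sample in samples for x in sample]
    grand_mean = statistics.mean(all_values)
    ss_treatment = sum(
        len(sample) * (statistics.mean(sample) - grand_mean) ** 2
        for sample in samples
    )
    ss_error = sum(
        (len(sample) - 1) * statistics.variance(sample) for sample in samples
    )
    ms_treatment = ss_treatment / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_treatment / ms_error


# Hypothetical samples: MS(treatment) = 12 and MS(error) = 1.
samples = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value = f_statistic(samples)
print(f_value)  # 12.0
```

A large F only indicates that the means are not all equal; the value says nothing about which group is responsible.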