Psych 524, 10/24/05
p. 1/4
Two Sample tests for Means (based on Kirk, Ch. 12)
Introduction (12.1)
In most circumstances, we do not know the population mean.
So, instead of comparing one sample mean to one known population mean, we
will take two samples to estimate two population means.
So, for example, we might gather data from:
a control group and a comparison group
males and females
babies in Seattle and babies in Sacramento
For both samples, we do not know the population mean, so we must estimate
these parameters from our sample data.
Our question is whether the samples were drawn from the same population.
If they were drawn from the same population, what would we expect the
difference between the means to be?
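One way to build intuition for this question is a small simulation. The sketch below repeatedly draws two samples from the same population and records the difference between their means. The population mean, the SD, the sample size, and the number of repetitions are all made-up values chosen only for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population values, invented for this illustration
POP_MEAN, POP_SD, N = 7.25, 0.8, 50

def one_difference():
    """Draw two samples from the SAME population; return the difference in means."""
    sample1 = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample2 = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    return mean(sample1) - mean(sample2)

# Repeat the two-sample experiment many times
diffs = [one_difference() for _ in range(2000)]

# Individual differences bounce around, but their average sits close to zero
print(round(mean(diffs), 3))
```

Any single pair of samples will rarely give a difference of exactly zero, but across many repetitions the differences center on zero.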
Two Sample z for Means Using Independent Samples (SD known!) (12.2)
When we study independent samples, we are studying samples that are not related
to each other. The individuals in these samples are in no way related. For
example:
males vs. females
freshmen vs. seniors
babies in Seattle vs. babies in Sacramento
If samples were related in some way (e.g., brothers vs. their sisters, husbands
versus wives, pretest vs. posttest), we would be using dependent samples. We
will return to this topic later.
In the very rare circumstance that we know both population standard deviations
but do not know the population means, we must collect two samples. Because
the variances/SD’s are known, we will be using the z distribution.
What is our null hypothesis?
Now, let’s figure out what our test statistic, z, will look like.
In general, test statistics can be conceptualized as follows:
test statistic = (sample estimate – population estimate) / (std. error of estimate)
What does this look like for the independent two-sample z?
Numerator
sample estimate:
population estimate:
Example: Let’s say that we sample Sacramento and Seattle babies and compute
means to be 7.1 and 7.4 pounds, respectively. What will the numerator of our
z-test look like?
Is this statistically significant?
Denominator
The standard error of the estimate here is called the “standard error of the
difference between two means.”
Because the standard deviations might be different for the two populations, we
need to weight them to get an estimate of the “standard error of the difference
between two means.”
The standard error of the difference between two means (when variances are
known) is denoted as $\sigma_{\bar{X}_1 - \bar{X}_2}$ and is computed as:
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}$$
Example: If Sacramento’s σ is known to be .8 and Seattle’s to be .82, and we
took samples of size 50 for each city, what would our standard error be?
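As a check on the hand calculation, here is a minimal Python sketch of the standard error formula for known population SDs. The function name is made up for this illustration; the numbers are the ones given in the example.

```python
import math

def se_diff_known_sd(sigma1, n1, sigma2, n2):
    """Standard error of the difference between two means (population SDs known)."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Birthweight example: Sacramento sigma = .8, Seattle sigma = .82, n = 50 per city
se = se_diff_known_sd(0.8, 50, 0.82, 50)
print(round(se, 3))  # 0.162
```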
Calculating z
Putting it all together, we have:
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$$
We look up the obtained value and compare it to the critical value just as we have
in the past.
Example: Compute this for the birthweight example.
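The calculation can be sketched in Python as follows. The means, SDs, and sample sizes come from the birthweight example. The two-tailed test at alpha = .05 is an assumption made here, since the handout does not state an alpha level, and the function name is made up for the illustration.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """Independent two-sample z with known population SDs (null difference = 0)."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - 0) / se

# Sacramento (mean 7.1, sigma .8) vs. Seattle (mean 7.4, sigma .82), n = 50 each
z = two_sample_z(7.1, 7.4, 0.8, 0.82, 50, 50)

# Assumed for illustration: two-tailed test, alpha = .05
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 2), round(z_crit, 2))  # -1.85 1.96
print(abs(z) > z_crit)                # False
```

Under these assumptions the obtained |z| of about 1.85 falls short of the critical value of 1.96, so the difference would not be declared significant at the .05 level (two-tailed).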
Two Sample t for Means Using Independent Samples (SD unknown!) (12.4)
In most circumstances, we do not know the population means or standard
deviations, so we must estimate them from samples.
What is our non-directional null hypothesis?
Again, our test statistic will fit the following generalization:
test statistic = (sample estimate – population estimate)/ std. error of estimate
What does this look like for the independent two-sample t?
Numerator
sample estimate:
population estimate:
Example: Let’s say that we sample Sacramento and Seattle babies and compute
means to be 7.1 and 7.4 pounds, respectively. What will the numerator of our
t-test look like?
…same as for z!
Denominator
Again, we want the “standard error of the difference between two means”
Computing this is a bit more complicated because we must first come up with an
estimate for the population standard deviation (though we use variance—the
square of SD—in the formula).
We call our estimate of the population standard deviation $\hat{\sigma}_{pooled}$, and it is based
on a weighted average of the estimated variances generated from both samples,
with each sample weighted by its degrees of freedom. We assume homogeneity of
variance (equal variances for the two populations) because, under the null, the
samples were drawn from the same population.
We can compute it using the following equivalent formulas (the formula we use
will be determined by the information we are given):
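$$\hat{\sigma}_{pooled} = \sqrt{\frac{(n_1 - 1)\hat{\sigma}_1^2 + (n_2 - 1)\hat{\sigma}_2^2}{(n_1 - 1) + (n_2 - 1)}} = \sqrt{\frac{df_1 \hat{\sigma}_1^2 + df_2 \hat{\sigma}_2^2}{df_1 + df_2}} = \sqrt{\frac{SS_1 + SS_2}{df_1 + df_2}}$$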
Now that we have an estimated standard deviation, we must obtain an estimated
standard error, $\sigma_{\bar{X}_1 - \bar{X}_2}$. As with other standard errors, we need to take into
account the sample sizes.