FACTSHEET 18
Sampling/Standard Error of the Mean
In other circumstances stratified sampling is more appropriate. This involves dividing the population
into like (or “homogeneous”) groups (e.g. in terms of age, sex or other attributes), so that the proportion
of each homogeneous group in the whole population can be reflected in the sample. This method
is sometimes used in the selection of samples for opinion polls.
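As an illustration of stratified sampling, here is a minimal Python sketch. The function name stratified_sample, the grouping attribute and the 600/400 population are assumptions made up for the example; they are not taken from the factsheet. Each group contributes to the sample in proportion to its share of the population.

```python
import random

def stratified_sample(population, key, sample_size, seed=0):
    """Draw a proportional stratified sample (illustrative helper, not a standard API).

    population: list of records; key: function returning each record's group.
    Group quotas are rounded, so the total can differ slightly from sample_size.
    """
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for members in strata.values():
        # Each group's quota mirrors its proportion of the whole population.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: 600 women ("F") and 400 men ("M").
population = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(population, key=lambda p: p[0], sample_size=50)
counts = {g: sum(1 for p in sample if p[0] == g) for g in ("F", "M")}
print(counts)  # {'F': 30, 'M': 20}
```

The 60/40 split of the population is reproduced in the sample of 50.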
However, if many samples are taken from the same population, it is still unlikely that they will all
have means and SDs which are identical, either to each other or to the whole population. There will
be sampling error. This is not necessarily the result of mistakes made in sampling procedures;
rather, variations may occur due to the chance selection of different individuals.
As a result of sampling error, we will find that if we take a large number of samples from the same population
and measure the mean for each, the sample means will not be identical. Some means will be relatively high,
some relatively low, and most will be clustered around an average mean. In fact the means themselves
would be approximately normally distributed.
We can plot this distribution of the means as a frequency polygon.
We can also calculate the standard deviation of the distribution of sample means. If the means are
“spread out” (i.e. lots of variability) this standard deviation will be high, indicating high sampling
error. If the means are tightly distributed around the average of the means (i.e. low variability) the
standard deviation will be low, indicating low sampling error.
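The idea of a distribution of sample means can be tried out with a small simulation. The sketch below is only illustrative: the population of 10,000 scores, its mean of 75 and SD of 18, and the choice of 1,000 samples of 36 are all assumptions chosen for the example.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 test scores (mean about 75, SD about 18).
population = [rng.gauss(75, 18) for _ in range(10_000)]

# Draw 1,000 samples of 36 people each and record every sample's mean.
sample_means = [statistics.mean(rng.sample(population, 36)) for _ in range(1_000)]

mean_of_means = statistics.mean(sample_means)  # centre of the sample means
sd_of_means = statistics.stdev(sample_means)   # spread of the sample means

print(round(statistics.mean(population), 1), round(mean_of_means, 1))
print(round(sd_of_means, 1))
```

The sample means cluster around the population mean, and their standard deviation is much smaller than the SD of the individual scores.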
© 2009 Knight Chapman Psychological Ltd. All rights reserved.
BPS Occupational Testing Level A
BPS Occupational Testing Level B Intermediate
Earlier, we discussed how the mean and SD will vary if subsamples are drawn from a population in
such a way as to bias the data. For example, if only first-form pupils are included, the heights will be
biased towards shorter measurements. If we wished to estimate the mean and SD for the whole
population without measuring everybody, we would need to sample in such a way as to achieve a
representative sample. Sometimes this is best achieved by random sampling, where each
member of the total population has an equal chance of being selected. Drawing names out of a hat
would be one method of random sampling.
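The contrast between a biased subsample and a random sample can be sketched in Python. The school below (five forms of 100 pupils, with average height rising by form) is entirely hypothetical and is only there to make the point visible.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical school: 5 forms of 100 pupils; older forms are taller on average.
pupils = [(form, rng.gauss(140 + 5 * form, 6))
          for form in range(1, 6) for _ in range(100)]

population_mean = statistics.mean(h for _, h in pupils)

# Biased subsample: first-form pupils only.
first_form_mean = statistics.mean(h for form, h in pupils if form == 1)

# Random sample: every pupil has an equal chance ("names out of a hat").
random_sample = rng.sample(pupils, 50)
random_mean = statistics.mean(h for _, h in random_sample)

print(round(population_mean, 1), round(first_form_mean, 1), round(random_mean, 1))
```

The first-form mean falls well below the population mean, while the random sample's mean lands much closer to it.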
This measure of sampling error is the Standard Error of the Mean. This may be defined as the
standard deviation of the theoretical distribution of sample means drawn from the same population.
If, therefore, our samples are large, each large sample should approximate closely to the whole
population and to each other. Variability between these large samples will be low and so the
Standard Error of the mean will also be low. In short, large samples give less sampling error and
therefore lower Standard Error of the mean.
If our samples are all small, sampling error will be great. The samples will differ more from each
other and from the whole population. Variability between these small samples will be great, and so
the Standard Error of the mean will be great.
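The effect of sample size can be checked with a short simulation. This is a sketch under assumed values: the population (mean 75, SD 18), the sample sizes 9 and 144, and the helper name sd_of_sample_means are all chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 scores (mean about 75, SD about 18).
population = [rng.gauss(75, 18) for _ in range(10_000)]

def sd_of_sample_means(n, repeats=500):
    """Standard deviation of the means of `repeats` random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

small = sd_of_sample_means(9)    # small samples: means vary a lot
large = sd_of_sample_means(144)  # large samples: means vary little

print(round(small, 1), round(large, 1))
```

The means of the small samples are spread far more widely than the means of the large samples.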
The formula for computing the Standard Error of the mean is:
SEmean = SD sample / √N
A sample mean is 75 and SD is 18. The sample size is 36.
SEmean = SD sample / √N = 18 / √36 = 18 / 6 = 3
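The same arithmetic can be written as a short Python function. The name se_mean is just an illustrative choice.

```python
import math

def se_mean(sd_sample, n):
    """Standard Error of the mean: the sample SD divided by the square root of N."""
    return sd_sample / math.sqrt(n)

print(se_mean(18, 36))  # 3.0
```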
What does this figure mean?
To interpret this SEmean, we need to consider “Levels of Confidence”.
Finally, we can calculate the average of the sample means. This gives an estimate of the population mean,
because it should be the same as (or close to) the mean we would find if we included the whole
population in our sample.
Earlier, we saw that the normal distribution curve has the property that 68.26% of cases will fall
between 1 SD above and 1 SD below the mean; 95.44% will fall within 2 SDs of the mean, and so
on.
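These proportions can be checked against the standard normal curve using Python's statistics.NormalDist (available from Python 3.8). The helper name proportion_within is an assumption for the example. Computed to two decimal places the values come out as 68.27% and 95.45%; the 68.26% and 95.44% quoted here are the figures commonly obtained from rounded printed tables.

```python
from statistics import NormalDist

def proportion_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

print(round(proportion_within(1) * 100, 2))  # 68.27
print(round(proportion_within(2) * 100, 2))  # 95.45
```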
In our example above we calculated (using the formula) that for a sample mean of 75, SD of 18,
and sample size of 36, the SEmean was 3.
There is, therefore, a 68.26% chance that the population mean falls within 1 SEmean of the sample
mean. In this example then there is a 68.26% chance that the population mean falls within 3 scores
of the sample mean (75), or, in other words, a 68.26% chance that the population mean falls
between 72 and 78.
Similarly, there is a 95.44% chance that the population mean falls within 2 SEmeans of the sample
mean. In our example there is therefore a 95.44% chance that the population mean falls in the
range 69 - 81.
Another way of phrasing this would be to say that we can be (approx) 95% certain that the mean
for the whole population is in the range 69 - 81, but only (approx) 68% certain that it falls in the
range 72 - 78.
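The two ranges above can be reproduced with a few lines of Python. The function name confidence_range and its parameter names are illustrative assumptions.

```python
import math

def confidence_range(sample_mean, sd_sample, n, n_se):
    """Range of n_se Standard Errors either side of the sample mean."""
    se = sd_sample / math.sqrt(n)
    return (sample_mean - n_se * se, sample_mean + n_se * se)

print(confidence_range(75, 18, 36, 1))  # (72.0, 78.0)  approx 68% level
print(confidence_range(75, 18, 36, 2))  # (69.0, 81.0)  approx 95% level
```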
You would calculate the Standard Error of the Mean to provide you with an indication
of whether a particular norm group is representative of the total theoretical population:
the smaller the SEmean, the more representative the norm group is.
The rule that 68.26% of cases fall within 1 SD of the mean and 95.44% within 2 SDs applies to all
normal distributions, and so can be used in looking at our distribution of
sample means drawn from the same population.