CLASS NOTES: Distribution of Sample Means
Population: The set of all the individuals of interest in a particular study (typically large).
Parameter: Any measure obtained by having measured an entire population.
Sample: A set of individuals selected from a population, usually intended to represent the population in a research study.
Statistic: Any measure obtained by having measured a sample.
So, if you hear of a measure being described as a “parameter” then it is referring to a measure associated w/ a population. If you hear of a measure being described as a “statistic” then it is referring to a measure associated w/ a sample.
Techniques in Sampling
When using inferential statistics,
the sample used should be
representative of the population &
contain the same elements as the
population.
Random Sampling: Demands that
each member of the entire
population has an equal chance of
being included. No member of the
population may be systematically
excluded.
If everyone does not have an equal
chance of being selected, then the
sample is not considered random.
Such is the case if subjects are
allowed to select themselves.
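As a minimal sketch of random sampling, the snippet below draws 10 members from a made-up population of 100 ID numbers. The population, the sample size of 10 and the seed are assumptions chosen only for illustration; they are not from the notes.

```python
import random

# Hypothetical population: ID numbers 1..100 stand in for 100 individuals.
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed only so the example is repeatable

# random.sample draws without replacement; every member has the same
# chance of being chosen, and nobody is systematically excluded.
sample = rng.sample(population, k=10)

print(sample)
```

If subjects were allowed to pick themselves instead, the selection would no longer go through a step like `rng.sample`, and the equal-chance requirement would not hold.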
Stratified, or Quota, Sampling: Samples are selected in such a way that each major subgroup makes up the same percentage of the sample as it does of the population. So, the researcher must know beforehand what some of the major population characteristics are and then deliberately select a sample that shares these characteristics in the same proportions.
Example:
If a hospital's patients are
5% ages 12 & below
20% ages 13 to 30
35% ages 31 to 59 &
40% ages 60 & above
Then…
Your sample of this group must contain the same percentages for it to be a stratified or quota sample.
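The hospital example can be turned into head counts with a short sketch like the one below. The total sample size of 200 and the helper name `quota_counts` are assumptions made for illustration; only the percentages come from the example.

```python
# Age-group percentages taken from the hospital example above.
population_percent = {
    "12 & below": 5,
    "13 to 30": 20,
    "31 to 59": 35,
    "60 & above": 40,
}

sample_size = 200  # assumed total sample size, chosen for illustration


def quota_counts(percentages, n):
    """Return how many people to draw from each group so that the
    sample matches the population percentages."""
    return {group: round(pct * n / 100) for group, pct in percentages.items()}


quotas = quota_counts(population_percent, sample_size)
for group, count in quotas.items():
    print(f"{group}: {count}")
```

With other sample sizes the rounded counts may not add up exactly to the total, so a real study would need a rule for handling the leftover.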
Situation Sampling: The
researcher collects observations in
a variety of settings &
circumstances.
Example:
A sample of cancer patients at 20
different cancer treatment centers
throughout the US.
Sampling error: The discrepancy,
or amount of error, b/t a sample
statistic & its corresponding
population parameter.
Remember that sampling error is a
natural occurrence. It is not a
mistake.
Outliers: One or two scores in a large random sample that fall very far from the mean. When this happens, either the distribution is not normal or some measurement error has crept in.
Bias: When most of the sampling error loads up on one side so that the sample means consistently either overestimate or underestimate the population mean.
Bias is a constant sampling error in
one direction.
Distribution of Sample Means:
The collection of sample means for
all the possible random samples of
a particular size (n) that can be
obtained from a population.
Sampling Distribution: A
distribution of statistics obtained by
selecting all the possible samples of
a specific size from a population
(sampling distribution of M).
[Figure: Sample 1, Sample 2 & Sample 3 are each drawn from the population & yield the sample means M1, M2 & M3.]
Example:
If the entire set contains 100 samples, then the probability of obtaining any specific sample is 1 out of 100:
P = 1/100
The distribution of sample means contains all possible samples, so it is necessary to have all the possible values in order to compute probabilities.
* Each sample will have its own individuals, its own scores & its own sample mean.
* The distribution of sample means provides a method for organizing all of the different sample means into a single picture that shows how the sample means are related to each other & how they are related to the overall population mean.
* Sample means should pile up around the population mean.
* The pile of sample means should tend to form a normal-shaped distribution.
* The larger the sample size, the closer the sample means should be to the population mean.
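To make "all the possible samples" concrete, here is a small sketch that lists every sample of size n = 2 from a tiny made-up population and tallies the sample means. The population of four scores and the choice to sample with replacement are assumptions for illustration, not part of the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import mean

# Hypothetical population of 4 scores (not from the notes).
population = [2, 4, 6, 8]
n = 2  # sample size

# Every possible sample of size n, drawn with replacement: 4 * 4 = 16 samples.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

# Each sample is equally likely, so P(any one sample) = 1 / (number of samples).
p_each = Fraction(1, len(samples))

# Tally the distribution of sample means.
distribution = Counter(sample_means)
for m in sorted(distribution):
    print(f"M = {m}: {distribution[m]} of {len(samples)} samples")

# The mean of all the sample means equals the population mean.
print(mean(sample_means), mean(population))

# Probability of a sample mean greater than 7: count the qualifying samples.
p_gt_7 = Fraction(sum(1 for m in sample_means if m > 7), len(samples))
print(p_gt_7)
```

The tally piles up around the population mean, and a probability is just the share of the possible samples that meet the condition.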
Central Limit Theorem: Provides a precise description of the distribution that would be obtained if you selected every possible sample, calculated every sample mean, & constructed the distribution of the sample mean. So, for any population w/ mean µ & standard deviation σ, the distribution of sample means for sample size n will have a mean of µ & a standard deviation of σ/√n, & will approach a normal distribution as n approaches infinity.
This serves as a cornerstone for much of inferential statistics.
The value of the central limit theorem comes from 2 different facts:
1. It describes the distribution of sample means for any population, no matter what shape, mean or standard deviation.
2. The distribution of sample means approaches a normal distribution very rapidly. So, by the time n = 30, the distribution is almost perfectly normal.
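The two facts above can be checked roughly by simulation. The sketch below assumes a strongly skewed population (exponential, w/ mean 1 & standard deviation 1) and uses 5000 random samples as a stand-in for "every possible sample"; both choices are illustrative assumptions, not part of the notes.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, which is strongly skewed.
# Its mean and standard deviation are both 1.
mu, sigma = 1.0, 1.0
n = 30              # scores per sample
num_samples = 5000  # many random samples stand in for "every possible sample"

# Draw the samples and keep only each sample's mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print(f"mean of sample means: {mean_of_means:.3f} (mu = {mu})")
print(f"SD of sample means:   {sd_of_means:.3f} "
      f"(sigma / sqrt(n) = {sigma / sqrt(n):.3f})")
```

Even though the population is far from normal, the simulated sample means should centre close to µ w/ a spread close to σ/√n.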
The Central Limit Theorem describes the distribution of sample means in terms of the 3 basic characteristics that describe any distribution: shape, central tendency & variability.
The shape of the distribution of sample means
Will be almost perfectly normal if either of the following two conditions is satisfied:
1. The population from which the samples are selected is a normal distribution.
2. The number of scores in each sample is relatively large, around 30 or more. However, increasing the sample size beyond 30 does not produce much additional improvement in how well the sample represents the population.
The mean of the distribution of sample means
The expected value of M: The
mean of the distribution of sample
means is equal to µ (the population
mean). I.e., the mean of the
distribution of sample means
always will be identical to the
population mean. This mean value
is called the expected value of M.
The expected value of M is
sometimes expressed as µM, but
since it is always equal to µ, it is
not necessary to notate the
expected value of M as µM.
[Figure: Sample 1, Sample 2 & Sample 3 are drawn from the population; their sample means M1, M2 & M3 average to µM.]
Example:
Population = Stress rate for 100 volunteers
Sample 1 (30 volunteers): M1 = 74
Sample 2 (50 volunteers): M2 = 70
Sample 3 (20 volunteers): M3 = 78
MM = ΣM / N = (74 + 70 + 78) / 3 = 222 / 3 = 74
Remember that N in this case represents the number of samples, not the number of all participants.
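The same arithmetic as a short sketch, using the three sample means from the stress-rate example (the variable names are just illustrative):

```python
# Sample means from the stress-rate example above.
sample_means = [74, 70, 78]

# N is the number of samples (3), not the number of participants.
N = len(sample_means)
mean_of_means = sum(sample_means) / N

print(mean_of_means)  # 222 / 3 = 74.0
```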
The standard error of M: The standard deviation of the distribution of sample means. The standard error measures the standard amount of difference b/t M & µ that is reasonable to expect simply by chance. It plays the same role for sample means that the standard deviation plays for regular raw data, but b/c we are working w/ whole sample mean values, the formula changes.
The law of large numbers: States
that the larger the sample size (n),
the more probable it is that the
sample mean will be close to the
population mean.
Standard error of M = σM = standard distance b/t M & µ
Standard error = σM = σ/√n
σM = σ/√n = 2.23/√3 = 2.23/1.73 = 1.29
The standard error of M is represented as σM.
The standard error of M is very important b/c it specifies precisely how well a sample mean estimates its population mean, or how much error you should expect on average b/t M & µ.
The magnitude of the standard
error is determined by two factors:
1) the size of the sample
2) the standard deviation of the
population from which the sample
is selected
There is an inverse relationship b/t
sample size & standard error.
Bigger samples = smaller error;
smaller samples = bigger error. B/c
of this rule, if you have n = 1, then
the standard error & standard
deviation are the same (σM = σ)
So, the standard error formula (σM = σ/√n) satisfies the following 2 requirements:
1) as sample size (n) increases (↑),
standard error decreases (↓).
2) When the sample consists of a
single score (n=1), the standard
error is the same as the standard
deviation (σM = σ).
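Both requirements can be seen in a few lines. This is a minimal sketch: the value σ = 100 and the sample sizes are assumptions picked for illustration, and `standard_error` is just a helper name.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of M: sigma divided by the square root of n."""
    return sigma / sqrt(n)


sigma = 100  # assumed population standard deviation, for illustration

# As n goes up the standard error goes down; at n = 1 it equals sigma.
for n in (1, 4, 25, 100):
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):g}")
```

Note that quadrupling the sample size only halves the standard error, since n sits under a square root.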
Probability & the Distribution of Sample Means
The primary use of the distribution of sample means is to find the probability associated w/ any specific sample.
Example:
A population of SAT scores is normal w/ µ = 500 & σ = 100. A random sample of n = 25 is selected.
What is the probability that the sample mean will be greater than M = 540?
B/c of the rules, we know that:
1) The distribution of sample means is normal b/c the population of SAT scores is normal.
2) The distribution has a mean of 500 b/c the population mean is 500.
3) For n = 25, the distribution has a standard error of σM = 20.
Step 1:
State the probability you are looking for:
P(M > 540) = ?
Step 2:
Find your standard error:
σM = σ/√n = 100/√25 = 100/5 = 20
Step 3:
Locate your z-score, using the z-score formula for sample means:
Z = (M - µ)/σM
ZM = (M - µ)/σM = (540 - 500)/20 = 40/20 = 2.00
Step 4:
Draw out your distribution.
Since your z-score is +2.00, your sample mean of 540 falls 2 standard errors above the mean.
Step 5:
Find the area beyond the z-score of +2.00 on your Unit Normal Table.
Your answer would be 0.0228.
Answer:
There is a 2.28% chance of obtaining a random sample of n = 25 w/ a mean SAT score greater than 540, so it is very unlikely.
The difference b/t the z-score formula for an X-value & the z-score formula for a sample mean is in the denominator. For an X-value, your denominator is the standard deviation (σ). For the sample mean, your denominator is the standard error (σM).
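The steps of this example can be sketched in code as follows. The sketch uses Python's standard-library `statistics.NormalDist` in place of the Unit Normal Table; that substitution is an assumption of the sketch, not the method in the notes.

```python
from math import sqrt
from statistics import NormalDist

mu = 500      # population mean
sigma = 100   # population standard deviation
n = 25        # sample size
M = 540       # sample mean of interest

sigma_M = sigma / sqrt(n)   # standard error: 100 / 5 = 20
z = (M - mu) / sigma_M      # z-score for the sample mean: 40 / 20 = 2.0

# Area beyond z in the upper tail of the standard normal distribution;
# this replaces the lookup in the Unit Normal Table.
p = 1 - NormalDist().cdf(z)

print(f"sigma_M = {sigma_M}, z = {z}, p = {p:.4f}")
```

Changing `n` shows how strongly the answer depends on sample size: a smaller sample gives a larger standard error, a smaller z, and a larger tail probability.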
Here is a chart to help you w/ your new symbols & formulas:
Sample Means
Concept | Symbol | Formula
Sample Mean | M |
Mean of the Sample Means | µM or MM | MM = ΣM / N (N = number of samples)
Standard Error of M | σM or SEM | σM = σ/√n
Z-score formula for sample means | Z or ZM | Z = (M - µ)/σM or ZM = (M - µM)/σM
To complete the standard error formula, you first need to compute the population standard deviation (here N is the number of scores in the population):
σ = √[(ΣX² - (ΣX)²/N) / N]
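As a rough check on the population standard deviation formula, the sketch below computes σ from ΣX and ΣX² and compares it w/ the standard library's `statistics.pstdev`. The four scores and the helper name `population_sd` are made up for illustration.

```python
from math import sqrt
from statistics import pstdev


def population_sd(scores):
    """Population standard deviation via the computational formula:
    sigma = sqrt((sum(X^2) - (sum(X))^2 / N) / N)."""
    N = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    ss = sum_x2 - sum_x ** 2 / N   # sum of squared deviations (SS)
    return sqrt(ss / N)


scores = [2, 4, 6, 8]  # hypothetical population, for illustration
sigma = population_sd(scores)

# The two values should agree up to floating-point rounding.
print(sigma, pstdev(scores))
```

Here ΣX = 20 and ΣX² = 120, so SS = 120 - 400/4 = 20 and σ = √(20/4) = √5.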