Sampling Distributions
In general, you do not calculate the values of population
parameters directly. Because the population size N is too
large, taking a census for this purpose is not practical.
Instead, you estimate them from a sample using sample
statistics.
However, if you think carefully, you will note that values
of a sample statistic calculated from different samples will
differ from sample to sample.
That is, if you repeatedly draw new samples (say, of the
same size) from the same population and calculate the
same sample statistic, you will get a set of different
numerical values. This leads to the concept of a Sampling
Distribution.
Sampling Distribution: It is the distribution of a
sample statistic calculated from samples of n measurements drawn randomly from a population.
The sample is chosen randomly, so the individuals in it
are there by chance. A sample statistic is calculated from
those individuals, so the number you calculate is obtained
by chance as well. Hence the sample statistic is a random
variable, and its distribution is called the Sampling
Distribution.
Sampling Distributions are very complicated to "list out"
because there are so many possible samples that could be
selected (especially when N is large). Instead, we
derive or approximate the sampling distribution in other
ways: probability theory, statistical mathematical theory,
...
See Example 6.1 for a simple example of a sampling
distribution obtained by listing out all the possible values
("outcomes") of the sample mean $\bar{X}$.
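To make the idea of listing out every outcome concrete, here is a minimal sketch. The population values and the sample size are hypothetical and chosen only for illustration; they are not the data of Example 6.1. The sketch enumerates every possible sample drawn without replacement and tabulates the resulting values of the sample mean.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical population of N = 5 values (NOT the data from Example 6.1).
population = [3, 6, 9, 12, 15]
n = 3  # sample size, also an illustrative assumption

# Every possible sample of size n drawn without replacement.
# Under simple random sampling each of these is equally likely.
samples = list(combinations(population, n))

# Count how many samples produce each value of the sample mean.
counts = Counter(Fraction(sum(s), n) for s in samples)

# Turn counts into probabilities: this table is the sampling distribution.
sampling_dist = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

for mean, prob in sampling_dist.items():
    print(f"sample mean = {float(mean):5.2f}   probability = {prob}")
```

Even with N = 5 there are already 10 possible samples, which suggests why full enumeration quickly becomes impractical as N grows.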
Question: Find the sampling distribution of the sample
mean number of red M&M's in a random sample
of 4 packages.
Sampling Distribution of the Sample Mean
Suppose a random sample of n observations is drawn
from any population with mean $\mu$ and standard
deviation $\sigma$, both unknown.
If you decide to use $\bar{X}$ as a point estimator of $\mu$, there
are two properties of the sampling distribution of $\bar{X}$ that
hold regardless of the population from which the sample
was drawn.
Let $\mu_{\bar{X}}$ = Mean of the sampling distribution of $\bar{X}$, and
$\sigma_{\bar{X}}$ = Standard Deviation of the sampling
distribution of $\bar{X}$.
Then, irrespective of the population sampled,
1. $\mu_{\bar{X}} = \mu$
2. $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
$\sigma_{\bar{X}}$ is often referred to as the
standard error of the mean.
These two properties hold whatever sample size is
selected. Further, if the sample was selected from a Normal
distribution, then we know that the sampling distribution
of $\bar{X}$ is exactly $N(\mu_{\bar{X}}, \sigma_{\bar{X}}^2)$.
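The two properties can be checked empirically. The following is a rough simulation sketch, not a proof: the population parameters, the sample size, the number of repetitions, and the seed are all assumed values picked for illustration. It repeatedly draws samples from a Normal population and compares the mean and standard deviation of the resulting sample means with $\mu$ and $\sigma / \sqrt{n}$.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Illustrative assumptions, not values prescribed by the theory.
mu, sigma = 10.0, 12.0  # population mean and standard deviation
n = 36                  # sample size
reps = 20_000           # number of repeated samples

# Draw many samples of size n and record the mean of each one.
sample_means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
standard_error = sigma / math.sqrt(n)  # theoretical sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}  (theory: {mu})")
print(f"sd of sample means:   {sd_of_means:.3f}  (theory: {standard_error:.3f})")
```

The simulated values should land close to the theoretical ones, with small differences due to simulation noise.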
What if the population distribution is not Normal?
Can we still say that
the sampling distribution of $\bar{X}$ is Normal?
Yes, approximately, using the Central Limit Theorem
(CLT).
It says that the sampling distribution of $\bar{X}$ will be
approximately a Normal distribution for sufficiently large
n (say, $n \ge 30$), even when the sample is drawn from
a possibly non-Normal population, as long as it is not
heavily skewed.
Of course, it is important to note carefully that
$\mu_{\bar{X}} = \mu$ and
$\sigma_{\bar{X}} = \sigma / \sqrt{n}$
are true independent of whether the population distribution is Normal or not.
The approximation of the sampling distribution of $\bar{X}$ to
the Normal distribution given by the CLT improves as
the sample size n gets larger.
See Figure 6.10 for a visual illustration of this.
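The improvement with larger n can also be seen numerically. The sketch below is an illustration under assumptions: the exponential population, the sample sizes, the number of repetitions, and the seed are arbitrary choices. It uses skewness as a rough measure of how far the distribution of sample means is from the symmetric Normal shape, where a value near 0 indicates symmetry.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable


def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


reps = 5_000  # repeated samples per sample size (arbitrary choice)
skew_by_n = {}

for n in (2, 10, 30, 100):
    # Assumed population: exponential with rate 1, which is right-skewed.
    means = [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    skew_by_n[n] = skewness(means)
    print(f"n = {n:3d}   skewness of sample means = {skew_by_n[n]:.3f}")
```

The skewness of the sample means shrinks toward 0 as n grows, which is the behaviour the CLT describes.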
Once we can say that the sampling distribution of $\bar{X}$ is
Normal, we can use that fact to make statements
concerning the probability of the sample mean being in a
specified range of values.
Example:
A random sample of n = 36 is drawn from
a Normal population with mean $\mu = 10$ and standard
deviation $\sigma = 12$.
What are the mean and standard deviation of the
sampling distribution of $\bar{X}$?
$\mu_{\bar{X}} = \mu = 10$
$\sigma_{\bar{X}} = \sigma / \sqrt{n} = 12 / \sqrt{36} = 2$
What is the shape of the sampling distribution of $\bar{X}$?
Does your answer depend on sample size?
- The sampling distribution of $\bar{X}$ is Normal.
- No, since the population is Normally distributed.
Find $\Pr(\bar{X} > 11)$.
$\Pr(\bar{X} > 11) = \Pr\left( \frac{\bar{X} - 10}{2} > \frac{11 - 10}{2} \right)$
$= \Pr(Z > 0.50)$
$= 0.5 - 0.1915 = 0.3085$
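The same calculation can be reproduced without a Normal table. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 10, 12, 36  # values from the example

standard_error = sigma / math.sqrt(n)            # 12 / 6 = 2
sampling_dist = NormalDist(mu, standard_error)   # distribution of the sample mean

z = (11 - mu) / standard_error                   # standardized value of 11
p = 1 - sampling_dist.cdf(11)                    # upper-tail probability

print(f"z = {z:.2f}, Pr(X-bar > 11) = {p:.4f}")  # z = 0.50, Pr(X-bar > 11) = 0.3085
```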
Read Examples 6.7 and 6.8.
Example 6.8
A manufacturer of automobile batteries
claims that the distribution of the lifetimes
of its best battery has a mean of 54 months
and a standard deviation of 6 months.
Suppose a consumer group decides to check
this claim by purchasing a sample of 50
batteries and subjecting them to tests that
determine battery life.
Assuming that the manufacturer's claim is true,
describe the sampling distribution of the
mean lifetime of a sample of 50 batteries.
Using the CLT, the sampling distribution of $\bar{X}$ is
approximately Normal. Furthermore,
the mean of this distribution is $\mu_{\bar{X}} = \mu = 54$ months,
and the standard deviation is
$\sigma_{\bar{X}} = \sigma / \sqrt{n} = 6 / \sqrt{50} = 0.85$ months.
What is the probability the consumer group's
sample has a mean life of 52 or fewer
months?
The probability that needs to be calculated can be
stated as $\Pr(\bar{X} \le 52)$.
Since $\bar{X}$ has an approximate Normal distribution with
mean 54 and standard deviation 0.85, we can calculate
this probability as follows:
$\Pr(\bar{X} \le 52) = \Pr\left( \frac{\bar{X} - 54}{0.85} \le \frac{52 - 54}{0.85} \right)$
$= \Pr(Z \le -2.35)$
$= \Pr(Z > 2.35)$
$= 0.5 - 0.4906 = 0.0094$
Thus, the probability that the consumer group will observe
a sample mean life of 52 months or less is 0.0094. If the
50 tested batteries do have a mean of 52 or fewer months,
the consumer group will have strong evidence that the
manufacturer's claim is untrue, because the chance of that
happening when the claim is true is very small.
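As a cross-check of the battery calculation, here is a brief sketch with `statistics.NormalDist` from the Python standard library; the variable names are illustrative. Because it does not round the standard error or the z-value, its result differs slightly from the table-based 0.0094.

```python
import math
from statistics import NormalDist

mu, sigma, n = 54, 6, 50  # claimed mean, claimed sd, and sample size

standard_error = sigma / math.sqrt(n)  # about 0.85 months
z = (52 - mu) / standard_error         # standardized value of 52
p = NormalDist().cdf(z)                # lower-tail probability Pr(Z <= z)

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.3f}, Pr(X-bar <= 52) = {p:.4f}")
```

The unrounded probability comes out at about 0.0092, so the conclusion is unchanged: a sample mean this low would be very unlikely if the claim were true.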