1 Sampling Distributions
1.1 Statistics and Sampling Distributions
When a random sample is selected, the numerical descriptive measures calculated from that
sample are called statistics. These statistics change from one random sample to the next;
that is, they are random variables.
Definition:
Any quantity computed from values in a sample is called a statistic.
The value of a statistic varies from sample to sample; this effect is called sampling variability.
Since the sampling is done randomly, the value of a statistic is random. In conclusion:
statistics are random variables.
Since statistics are random variables, they have a distribution, which tells us which values occur
with which probability.
Definition:
The distribution of a statistic is called a sampling distribution.
A sampling distribution provides this information:
• What values can the statistic assume?
• What is the probability of each value?
Remember:
• The population distribution describes the values of a variable in the population. It gives
the possible values of all individuals and their likelihood to occur.
• The sampling distribution describes the values of a statistic in all possible samples from
the population. It describes how a statistic varies in all possible samples of the same size.
Example:
The population is this section of Stat 141. Let µ be the population mean of the random
variable X = height, therefore µ is the average of the heights of all people in this class.
Select a random sample of size 5 and observe x̄, the mean height in this sample.
Is x̄ = µ?
The sample mean x̄ differs from one random sample to the next; this is the sampling variability.
It is also very unlikely that the sample mean is exactly equal to the population mean µ.
Even if we use a random sample and make no measurement mistakes, x̄ will most
likely differ from µ. This difference is called the sampling error:
sampling error = x̄ − µ
Now we will study the distribution of X̄.
Suppose you look at every possible random sample of 5 students from this class and the
corresponding sample mean. From these numbers you can create the sampling distribution of
X̄.
If this class has 50 students, then there are
$$\binom{50}{5} = 2{,}118{,}760$$
different samples.
You will find that
1. The value of x̄ differs from one random sample to another (sampling variability).
2. Some samples produce x̄ values larger than µ, whereas others produce x̄ values smaller than µ
(the sampling error x̄ − µ can be positive or negative).
3. The values of x̄ can be fairly close to the population mean µ, or quite far from it.
The sampling distribution of X̄ provides important information about the behaviour of the
statistic X̄ and how it relates to the population mean µ.
Considering how many different samples of size five there are in this class, listing all of them is very
cumbersome. Fortunately, there are mathematical theorems that help us to obtain information
about sampling distributions.
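Instead of listing all samples, the sampling distribution can be approximated by drawing many random samples. The following is a minimal sketch, not part of the original notes: the class heights are made-up values (generated in cm around 170), and only the class size 50 and the sample size 5 come from the example.

```python
import math
import random
import statistics

# Hypothetical data: heights (in cm) for a class of 50 students.
# These values are invented for illustration only.
rng = random.Random(141)
heights = [round(rng.gauss(170, 10), 1) for _ in range(50)]
mu = statistics.fmean(heights)  # population mean of this class

# Number of distinct samples of size 5 that can be drawn from 50 students.
n_samples_possible = math.comb(50, 5)

# Approximate the sampling distribution with 10,000 random samples of size 5.
sample_means = [statistics.fmean(rng.sample(heights, 5)) for _ in range(10_000)]
sampling_errors = [xbar - mu for xbar in sample_means]

print(n_samples_possible)  # 2118760
# The average of the sample means should be close to mu.
print(round(statistics.fmean(sample_means) - mu, 2))
# Some samples overshoot mu and others undershoot it.
print(min(sampling_errors) < 0 < max(sampling_errors))
```

A histogram of `sample_means` would give a picture of the sampling distribution of X̄ for this hypothetical class.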
1.2 The Sampling Distribution of a Sample Mean
A sample mean x̄ based on a large n tends to be closer to µ than a sample mean based on a small n. This can be
explained by the following theoretical results.
Lemma: Suppose X1, . . . , Xn are independent random variables with the same distribution, with mean µ
and population standard deviation σ.
Now look at the random variable X̄ = (X1 + . . . + Xn)/n.
1. The population mean of X̄, denoted µX̄, is equal to µ.
2. The population standard deviation of X̄, denoted σX̄, is
$$\sigma_{\bar X} = \frac{\sigma}{\sqrt{n}}$$
This means that the sampling distribution of X̄ is always centered at µ. The second statement gives
the rate at which the spread of the sampling distribution (sampling variability) decreases as
n increases.
This effect is called the law of large numbers, since it indicates that, on average, the sample
mean gets closer to the population mean as the sample size increases.
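Both statements of the lemma follow from the rules for means and variances of sums. A short derivation sketch, assuming in addition that the observations are independent (this is what the variance step needs):

```latex
\begin{align*}
\mu_{\bar X} &= E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
              = \frac{1}{n}\sum_{i=1}^{n} E[X_i]
              = \frac{1}{n}\, n\mu = \mu, \\
\sigma_{\bar X}^{2} &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
              = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
              = \frac{\sigma^{2}}{n},
\qquad\text{so}\qquad
\sigma_{\bar X} = \frac{\sigma}{\sqrt{n}} .
\end{align*}
```

For example, quadrupling the sample size halves the standard deviation of X̄.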
Besides the mean and the standard deviation, do we know anything about the shape of
the density curve of the sampling distribution of the mean?
1.3 Central Limit Theorem
This section explains why the normal distribution is so important in statistics.
The result is surprising. It states that, under rather general conditions, means of random
samples drawn from one population tend to follow an approximately normal distribution. It
does not matter which kind of distribution we find in the population; it can even be discrete
or extremely skewed. If n is large enough, the sampling distribution of the mean is
approximately normal.
Central Limit Theorem
If random samples of n observations are drawn from any population with finite mean µ and
standard deviation σ, then, when n is large, the sampling distribution of the mean X̄ is
approximately normal, with mean µ and standard deviation $\sigma/\sqrt{n}$.
Remark:
• If the population itself is normal, X̄ is normally distributed for all n, so n does not
have to be large.
• When the sampled population has a symmetric distribution, the sampling distribution
of X̄ becomes approximately normal quickly. Compare the example below for n = 3.
• If the distribution is skewed, the sampling distribution is usually already close
to a normal distribution for n = 30.
Example:
Consider tossing n unbiased dice and recording the average of the numbers on the upper faces.
The graphs display the sampling distribution of x̄ for n = 1, 2, 3, 4.
Even with only n = 4 dice, the distribution is already very close to a normal distribution.
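The dice distributions can also be computed exactly. The sketch below builds the distribution of the sum of n fair dice one die at a time and rescales it to the mean; the function name `dice_mean_distribution` is a hypothetical helper introduced only for this illustration.

```python
from collections import defaultdict
from fractions import Fraction


def dice_mean_distribution(n):
    """Exact sampling distribution of the mean of n fair six-sided dice."""
    # Distribution of the sum, built up one die at a time (convolution).
    sums = {0: Fraction(1)}
    for _ in range(n):
        nxt = defaultdict(Fraction)
        for total, prob in sums.items():
            for face in range(1, 7):
                nxt[total + face] += prob * Fraction(1, 6)
        sums = dict(nxt)
    # Convert each possible sum into the corresponding mean.
    return {Fraction(total, n): prob for total, prob in sums.items()}


for n in (1, 2, 3, 4):
    dist = dice_mean_distribution(n)
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    print(n, mean, var)  # the mean is always 7/2, the variance is (35/12)/n
```

The printed variances 35/12, 35/24, 35/36 and 35/48 shrink like 1/n, and plotting `dist` for n = 4 shows the bell shape emerging.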
Summary: Assume that all measurements in a population follow the same distribution
with finite mean µ and standard deviation σ. Then
• The mean of the sampling distribution of the mean X̄ of n observations is
$$\mu_{\bar X} = \mu$$
• The standard deviation of the sampling distribution of the mean X̄ of n observations is
$$\sigma_{\bar X} = \frac{\sigma}{\sqrt{n}}$$
• The sampling distribution of the mean X̄ of n observations is approximately normal
if n is large enough.
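These three statements can be checked by simulation. The sketch below is an illustration under an assumed population: an exponential distribution with rate 1 (strongly right-skewed, with µ = 1 and σ = 1), which is not taken from the notes. The seed and the number of repetitions are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed population for illustration: exponential with rate 1,
# which is right-skewed and has mu = 1 and sigma = 1.
mu, sigma, n, reps = 1.0, 1.0, 30, 20_000

sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

center = statistics.fmean(sample_means)   # should be close to mu
spread = statistics.pstdev(sample_means)  # should be close to sigma / sqrt(n)
se = sigma / math.sqrt(n)

# Share of sample means within one standard error of mu
# (about 0.68 for a normal curve).
within_one_se = sum(abs(m - mu) <= se for m in sample_means) / reps

print(round(center, 3), round(spread, 3), round(se, 3), round(within_one_se, 3))
```

Even though single observations are far from normal here, the share of sample means within one standard error of µ should come out near the normal value of about 68%.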
Example:
In 2004 a water taxi in Baltimore capsized because it was overloaded. Five people drowned. How
could this have happened? Was the accident predictable?
At that time 20 passengers were permitted per load, and the taxi was known to be safe up to a
load of 3500 lb.
Assuming that the weight of American men is approximately normally distributed with a mean
µ = 172 lb and a standard deviation of σ = 29 lb, what was the risk?
1. What is the probability that a randomly chosen man is heavier than 175lb?
2. What is the probability that 20 randomly chosen American men weigh in total more than
3500lb?
ad 1.
$$
\begin{aligned}
P(X > 175) &= P\left(Z > \frac{175 - 172}{29}\right) \\
&= P(Z > 0.103) \\
&= 1 - 0.5398 \\
&= 0.4602
\end{aligned}
$$
About 46% of American men are heavier than 175 lb.
ad 2. If 20 men weigh more than 3500 lb in total, they weigh more than 3500/20 = 175 lb
on average. Calculate the probability that the mean weight of 20 men,
X̄, exceeds 175 lb. The mean of the sample mean X̄ is µX̄ = µ = 172, and the standard
deviation of the sample mean X̄ is $\sigma_{\bar X} = \sigma/\sqrt{n} = 29/\sqrt{20} = 6.48$. Then
$$
\begin{aligned}
P(\bar X > 175) &= P\left(Z > \frac{175 - 172}{6.48}\right) \\
&= P(Z > 0.4629) \\
&= 1 - 0.6772 \\
&= 0.3228
\end{aligned}
$$
The probability that the total weight of 20 randomly chosen men exceeds the weight
limit is 0.3228.
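Both probabilities can be cross-checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are chosen for this illustration. Because z is not rounded to table precision, the results differ slightly from the table-based values (roughly 0.46 and 0.32).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 172, 29   # weight of a single man (lb), from the example
n, limit = 20, 3500   # passengers per load and weight limit (lb)

# 1. One randomly chosen man is heavier than 175 lb.
p_single = 1 - NormalDist(mu, sigma).cdf(175)

# 2. The total weight of 20 men exceeds 3500 lb,
#    i.e. their mean weight exceeds 3500 / 20 = 175 lb.
se = sigma / sqrt(n)  # standard deviation of the sample mean
p_overload = 1 - NormalDist(mu, se).cdf(limit / n)

print(round(se, 2))  # 6.48
print(round(p_single, 3), round(p_overload, 3))
```

Under these assumptions, roughly one full load in three would exceed the weight limit.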
Example:
The duration of Alzheimer’s disease from the onset of symptoms until death ranges from 3 to
20 years. The mean is 8 years and the standard deviation is 4 years.
Looking at the average duration for 30 randomly selected Alzheimer patients:
what is the probability that the average duration of those 30 patients is less than 7 years?
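$$
\begin{aligned}
P(\bar X < 7) &= P\left(\frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} < \frac{7 - \mu_{\bar X}}{\sigma_{\bar X}}\right) && \text{standardize} \\
&= P\left(Z < \frac{7 - 8}{4/\sqrt{30}}\right) \\
&= P(Z < -1.37) \\
&= 0.0853 && \text{Table A}
\end{aligned}
$$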
Example 1 Consider a bottle-filling machine that is supposed to fill 500 mL into each bottle.
This machine will be reset if a sample of 10 bottles shows an average filling amount of more
than 505mL.
Assume that the machine is in correct adjustment, that is µ = 500mL, but the filling amount
varies with a standard deviation of 7mL. What is the probability that the machine will be
reset, even though it is correctly adjusted?
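$$
\begin{aligned}
P(\bar X > 505) &= P\left(\frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} > \frac{505 - \mu_{\bar X}}{\sigma_{\bar X}}\right) && \text{standardize} \\
&= P\left(Z > \frac{505 - 500}{7/\sqrt{10}}\right) \\
&= P(Z > 2.26) \\
&= 1 - 0.9881 = 0.0119 && \text{Table A}
\end{aligned}
$$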
The machine would be reset in about 1.2% of examinations, even if it is correctly adjusted.
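The false-alarm rate can also be estimated by simulating many inspections. The sketch below assumes, for the simulation only, that individual filling amounts are normally distributed; the notes specify just the mean and the standard deviation. The seed and the number of trials are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

rng = random.Random(1)  # fixed seed so the run is repeatable
mu, sigma, n, threshold = 500.0, 7.0, 10, 505.0

# Assumption for the simulation only: individual fill amounts are normal.
trials = 100_000
resets = sum(
    fmean([rng.gauss(mu, sigma) for _ in range(n)]) > threshold
    for _ in range(trials)
)
simulated = resets / trials

# Normal calculation for the sample mean, without rounding z.
exact = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

print(round(simulated, 4), round(exact, 4))
```

Both numbers should land near 0.012, in line with the roughly 1.2% of unnecessary resets found above.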
1.4 The Sampling Distribution of a Sample Proportion
Suppose we are only interested in whether a certain characteristic occurs in the population of interest.
For example:
• a coin is flipped and we observe whether tails was tossed
• a person's IQ is above 120
• a student is a "nonresident"
• a person survived at least five years after a specific cancer treatment
• a clinical test is positive
Then there exists a probability p that a randomly selected individual or object from the
population possesses this characteristic.
Traditionally, any individual or object that possesses the property of interest is labelled a
Success (S), and one that does not possess the property is labelled a Failure (F).
Looking at a random sample of size n, the probability p can be estimated by calculating
the sample proportion (relative frequency) of S's, which is
$$\hat p = \frac{\text{number of S's in the sample}}{n}.$$
Now rewrite the sample proportion in the following way.
First let the single observations be encoded as
$$
x_i = \begin{cases}
0 & \text{if individual } i \text{ is labelled F} \\
1 & \text{if individual } i \text{ is labelled S}
\end{cases}
$$
Then $\hat p = \sum x_i / n = \bar x$, so that the Central Limit Theorem applies to p̂.
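A small illustration of this encoding, using a hypothetical sample of ten labels invented for this sketch:

```python
from statistics import fmean

# Hypothetical sample of n = 10 individuals, each labelled S or F.
labels = ["S", "F", "F", "S", "S", "F", "S", "F", "F", "F"]

x = [1 if label == "S" else 0 for label in labels]  # 0/1 encoding
p_hat = labels.count("S") / len(labels)             # relative frequency of S
x_bar = fmean(x)                                    # sample mean of the x_i

print(p_hat, x_bar)  # 0.4 0.4
```

The relative frequency of S and the mean of the 0/1 values coincide, which is why results about sample means carry over to sample proportions.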
Result:
• µp̂, the mean of the sampling distribution of p̂, is p, the true probability.
• The standard deviation of the sampling distribution of p̂ is
$$\sigma_{\hat p} = \sqrt{\frac{p(1-p)}{n}}$$
(since the standard deviation for a single observation is $\sqrt{p(1-p)}$).
• The sample proportion p̂ is approximately normally distributed for large n (Central
Limit Theorem).
A conservative rule of thumb states that the Central Limit Theorem can be applied if:
np ≥ 5 and n(1 − p) ≥ 5
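The standard deviation of p̂ and the rule of thumb are easy to wrap in two small helpers. The function names `proportion_se` and `normal_approx_ok` are hypothetical, introduced only for this sketch, and the input values are arbitrary.

```python
from math import sqrt


def proportion_se(p, n):
    """Standard deviation of the sample proportion, sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)


def normal_approx_ok(p, n, threshold=5):
    """Rule of thumb from the text: np >= 5 and n(1-p) >= 5."""
    return n * p >= threshold and n * (1 - p) >= threshold


print(normal_approx_ok(0.5, 100), round(proportion_se(0.5, 100), 5))  # True 0.05
print(normal_approx_ok(0.02, 100))                                    # False, since np = 2
```

Keeping the threshold as a parameter makes it simple to apply a stricter version of the rule if desired.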
Example:
A study showed that the proportion of people in the 20 to 34 age group with an IQ (on the
Wechsler Intelligence Scale) of over 120 is about 0.35.
Calculate the probability that in a sample of 50 there are more than 30 people
with an IQ of over 120.
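First obtain µp̂ and σp̂ for the distribution of the sample proportion p̂.
• µp̂ = p = 0.35
• $\sigma_{\hat p} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.35(1-0.35)}{50}} = 0.06745$
$$
\begin{aligned}
P\left(\hat p > \tfrac{30}{50}\right)
&= P\left(\frac{\hat p - \mu_{\hat p}}{\sigma_{\hat p}} > \frac{0.6 - \mu_{\hat p}}{\sigma_{\hat p}}\right) && \text{standardize} \\
&= P\left(\frac{\hat p - 0.35}{0.06745} > \frac{0.6 - 0.35}{0.06745}\right) && \text{use above results} \\
&= 1 - P\left(\frac{\hat p - 0.35}{0.06745} \le 3.706\right) && \text{rule for complements} \\
&= 1 - 0.9999 && \text{CLT, } np = 17.5 > 5, \text{ Table IV} \\
&= 0.0001
\end{aligned}
$$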
We calculated that the probability that more than 30 out of 50 people (between 20 and 34)
have an IQ greater than 120 is 0.0001. It is highly unlikely!
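As a cross-check, the normal approximation can be compared with the exact binomial probability of more than 30 successes in 50 trials. This is a sketch using only the standard library; the exact binomial sum is an addition for comparison and is not part of the original solution.

```python
from math import comb, sqrt
from statistics import NormalDist

p, n, k = 0.35, 50, 30

# Normal approximation used in the example: P(p_hat > 30/50).
se = sqrt(p * (1 - p) / n)
z = (k / n - p) / se
approx = 1 - NormalDist().cdf(z)

# Exact binomial probability of more than 30 successes in 50 trials.
exact = sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1, n + 1))

print(round(z, 3), f"{approx:.6f}", f"{exact:.6f}")
```

Both values are of the order of one in ten thousand, so the conclusion that the event is highly unlikely does not depend on the approximation.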
Example 2 Assume 80% of Canadians are in favour of protecting the environment through
more laws.
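Given that 1000 randomly selected Canadians are asked whether they are in favour of or against
additional laws for protecting the environment, what is the probability that the poll results in a
percentage of less than 75%?
Here µp̂ = p = 0.8 and $\sigma_{\hat p} = \sqrt{\frac{0.8(1-0.8)}{1000}} = 0.01265$.
$$
\begin{aligned}
P(\hat p < 0.75)
&= P\left(\frac{\hat p - \mu_{\hat p}}{\sigma_{\hat p}} < \frac{0.75 - \mu_{\hat p}}{\sigma_{\hat p}}\right) && \text{standardize} \\
&= P\left(\frac{\hat p - 0.8}{0.01265} < \frac{0.75 - 0.8}{0.01265}\right) && \text{use above results} \\
&= P(Z < -3.95) && \text{CLT, } np = 800 > 5 \\
&\approx 0.00004
\end{aligned}
$$
The probability that the poll shows less than 75% in favour is below 0.0001.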