CHAPTER SIX
SAMPLING DISTRIBUTION FOR MEANS AND PROPORTIONS
In general, population characteristics are represented by letters
from the Greek alphabet, while sample characteristics are represented
by Latin letters.
In statistical inference, the mean and the variance calculated from sample
data are used to estimate the population mean and variance, hence
$\bar{x}$ is called a point estimate for $\mu$
$s^2$ is called a point estimate for $\sigma^2$.
The mean of all sample means is written as $\mu_{\bar{x}}$.
The standard deviation of all sample means is written as $\sigma_{\bar{x}}$:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
where N is the population size and n is the sample size. The factor
$\sqrt{\frac{N-n}{N-1}}$ is called the finite population correction factor
and must be used when sampling from a finite population.
As a rule of thumb, the correction factor can be omitted when the sample
size is less than 5% of the population size.
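The formula above can be sketched in Python. This is a minimal illustration
added to the notes, not part of the original text: the function name
standard_error_mean and the numbers used (sigma = 10, n = 40, N = 200 or
10 000) are assumptions chosen only for demonstration.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size (None treats the population as infinite)
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        # Finite population correction factor sqrt((N - n) / (N - 1))
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical numbers, for illustration only
se_infinite = standard_error_mean(10, 40)             # no correction
se_small_pop = standard_error_mean(10, 40, N=200)     # n is 20% of N
se_large_pop = standard_error_mean(10, 40, N=10_000)  # n is 0,4% of N
```

With n at 20% of N the correction shrinks the standard error by roughly 10%,
while with n at 0,4% of N the change is well under 1%, which is why the
factor is often dropped for small sampling fractions.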
It can be proved (mathematically) that the probability distribution of the
sample means for sample sizes greater than 30, selected from any
population (whose mean and variance are known), approaches a Normal
distribution with mean $\mu$ and standard deviation
$\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
This is called the Central Limit Theorem.
The distribution of sample means is $\bar{x} \sim N(\mu; \sigma/\sqrt{n})$
for sample size $n \geq 30$.
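The Central Limit Theorem can be checked empirically with a short
simulation. The sketch below is an illustration added to these notes, not
part of the original argument: the exponential population (mean 1, standard
deviation 1), the sample size of 36, the 5000 repetitions and the fixed seed
are all assumptions chosen for demonstration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: exponential with mean 1 and standard deviation 1,
# which is clearly not Normal.
mu, sigma, n = 1.0, 1.0, 36

# Draw 5000 samples of size n and record each sample mean.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(5000)
]

mean_of_means = statistics.fmean(sample_means)  # expected to be near mu
sd_of_means = statistics.stdev(sample_means)    # expected to be near sigma / sqrt(n)
theoretical_se = sigma / n ** 0.5

# Share of sample means within 1,96 standard errors of mu
share_within = sum(
    abs(m - mu) <= 1.96 * theoretical_se for m in sample_means
) / len(sample_means)
```

If the theorem holds for this population, mean_of_means should be close to
1, sd_of_means close to 1/6, and share_within close to 0,95, even though the
individual observations are strongly skewed.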
In addition, it can also be proved (mathematically) that the result of the
Central Limit Theorem also holds for small samples selected from Normal
populations when the population variance $\sigma^2$ is known:
$\bar{x} \sim N(\mu; \sigma/\sqrt{n})$ for samples of any size from a
Normal distribution with known variance $\sigma^2$.
A direct application of the Central Limit Theorem is the
(i) calculation of probabilities regarding sample means;
(ii) calculation of the limits that contain various percentages of the
sample means.
Example
An importer of herbs and spices claims that the average weight of
packets of saffron is 20 g. However, packets are actually filled to an
average weight $\mu$ = 19,5 g, with standard deviation $\sigma$ = 1,8 g.
A random sample of 36 packets is selected. Calculate:
a) the probability that the average weight is 20 g or more;
b) the two limits within which the weights of 95% of all packets fall;
c) the two limits within which 95% of all average weights fall.
a) The question asks for the probability $P(\bar{x} \geq 20)$. It is
necessary to use the probability distribution of the sample mean, which is
Normal with mean $\mu_{\bar{x}}$ = 19,5 and standard deviation
$\sigma_{\bar{x}} = 1{,}8/\sqrt{36}$ = 0,3.
For $\bar{x}$ = 20,
$$Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{20 - 19{,}5}{1{,}8/6} = 1{,}67$$
From tables, the area in the tail is 0,0475.
b) The question asks for individual weights, not averages.
The Z values that correspond to a tail area of 0,025 are -1,96 and
+1,96.
The upper limit is $\mu + Z\sigma$ = 19,5 + 1,96(1,8) = 19,5 + 3,528 = 23,028.
The lower limit is $\mu - Z\sigma$ = 19,5 - 1,96(1,8) = 19,5 - 3,528 = 15,972.
c) This part asks for sample averages. The method is the same as in
part b) except that the standard error of the average,
$\sigma_{\bar{x}} = \sigma/\sqrt{n}$ = 0,3, is used in place of $\sigma$.
The upper limit is given by $\mu + Z\sigma_{\bar{x}}$ = 19,5 + 1,96(0,3) = 19,5 + 0,588 =
20,088.
The lower limit is given by $\mu - Z\sigma_{\bar{x}}$ = 19,5 - 1,96(0,3) = 19,5 - 0,588 =
18,912.
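The three parts of this example can be reproduced with Python's standard
library. This is a sketch added for checking the arithmetic, not part of the
original solution; the variable names are illustrative, and part b) assumes,
as the worked solution implicitly does, that individual packet weights are
Normally distributed.

```python
from statistics import NormalDist

mu, sigma, n = 19.5, 1.8, 36
se = sigma / n ** 0.5  # standard error of the mean

# a) P(sample mean >= 20), using the sampling distribution of the mean
p_a = 1 - NormalDist(mu, se).cdf(20)

# Z value that leaves 2,5% in each tail (about 1,96)
z = NormalDist().inv_cdf(0.975)

# b) limits containing 95% of individual packet weights
limits_individual = (mu - z * sigma, mu + z * sigma)

# c) limits containing 95% of sample average weights
limits_means = (mu - z * se, mu + z * se)
```

Part a) comes out at about 0,048 rather than the table value 0,0475 because
the code does not round Z to 1,67. The limits are approximately 15,97 to
23,03 for individual packets and 18,91 to 20,09 for sample averages.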
Sampling distribution of proportion
A proportion is the number of elements with a given characteristic
divided by the total number of elements in the group.
The sample proportion, p, is the point estimate of the population
proportion, $\pi$.
The standard error for proportion is given by the formula
$$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} \sqrt{\frac{N-n}{N-1}}$$
If N is large relative to n, the second factor approaches 1 and
$$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$$
The sampling distribution of proportion
The list of every possible sample proportion, with its probability, is called
the sampling distribution of proportions.
For large samples ($n \geq 30$),
$$p \sim N\left(\pi; \sqrt{\frac{\pi(1-\pi)}{n}}\right)$$
Example
In a certain neighbourhood, it is known that 12% of youths aged from 16
to 24 are unemployed.
If a random sample of 150 youths is selected, what is the probability that
the sample contains
At most 10% unemployed
At most 15 unemployed
If 12% are unemployed, $\pi$ = 0,12 and
$$\sigma_p = \sqrt{\frac{0{,}12(1 - 0{,}12)}{150}} = 0{,}0265$$
The required probability is $P(p \leq 0{,}10)$.
$$Z = \frac{p - \pi}{\sigma_p} = \frac{0{,}10 - 0{,}12}{0{,}0265} = -0{,}75$$
The tail area is 0,2266.
The probability that at most 10% of the sample is unemployed is 0,2266.
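The same calculation can be sketched in Python as a check. This block is an
illustration added to the notes; the variable names are assumptions, and the
Normal approximation is used exactly as in the worked solution.

```python
from statistics import NormalDist

pi, n = 0.12, 150  # population proportion and sample size from the example

se_p = (pi * (1 - pi) / n) ** 0.5      # standard error of the proportion
z = (0.10 - pi) / se_p                 # standardised value for p = 0,10
prob = NormalDist(pi, se_p).cdf(0.10)  # P(p <= 0,10)
```

The unrounded Z is about -0,754, so the computed probability is about
0,2255, slightly below the table value of 0,2266 obtained with Z rounded
to -0,75.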
Mean and standard errors for the distribution of proportion when $\pi$ is
unknown
The calculation of the mean and standard error of sample proportion
depends on knowing the value of the population proportion, $\pi$. Since $\pi$ is
seldom known, the mean and standard error for proportions are
approximated by substituting p for $\pi$. Hence
$$\bar{p} = p, \quad s_p = \sqrt{\frac{p(1-p)}{n}}$$
Some desirable properties of estimators
Two of the most important properties of point estimators are that they
should be:
1. Unbiased
2. Minimum variance
Unbiased estimators
An estimator is said to be unbiased if the average value of all the point
estimates is equal to the population parameter to be estimated.
The mean $\bar{x}$ is an unbiased estimator of $\mu$ and the sample proportion p is
an unbiased estimator for $\pi$.
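Unbiasedness of the sample mean can be verified exactly on a very small
population by listing every possible sample. The population [2, 4, 6, 8] and
the sample size of 2 below are hypothetical values chosen for illustration;
they do not come from the original notes.

```python
from itertools import combinations
from statistics import fmean, pstdev

population = [2, 4, 6, 8]   # hypothetical tiny population, N = 4
N, n = len(population), 2
mu = fmean(population)       # population mean
sigma = pstdev(population)   # population standard deviation

# Every possible sample of size n drawn without replacement
sample_means = [fmean(s) for s in combinations(population, n)]

# Unbiasedness: the mean of all sample means equals mu
mean_of_sample_means = fmean(sample_means)

# Spread of the sample means, compared with the finite population formula
sd_of_sample_means = pstdev(sample_means)
formula_se = sigma / n ** 0.5 * ((N - n) / (N - 1)) ** 0.5
```

The six sample means are 3, 4, 5, 5, 6 and 7, whose average is 5, the same
as the population mean. Their standard deviation also agrees with
$\sigma/\sqrt{n}$ multiplied by the finite population correction factor.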
Minimum variance
The values of sample statistics (sample means and proportions) can vary
greatly about the population parameter being estimated. It is obviously
desirable to keep this variation (as measured by the standard error) as
small as possible (minimum variance). Increasing the sample size reduces the
standard error.
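The effect of sample size on the standard error can be seen in a few lines
of Python. This is an illustrative sketch: it reuses $\sigma$ = 1,8 from the
saffron example, and the sample sizes 25, 100 and 400 are assumptions chosen
for demonstration.

```python
sigma = 1.8  # population standard deviation from the saffron example

# Standard error of the mean, sigma / sqrt(n), for increasing sample sizes
standard_errors = {n: sigma / n ** 0.5 for n in (25, 100, 400)}
```

The standard errors are 0,36, 0,18 and 0,09: each fourfold increase in the
sample size halves the standard error.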
An estimator is described as precise when the values of its estimates are
close to one another. The standard deviation of all the estimates (called the standard error) is
a measure of the precision of an estimator. Ideally an estimator should be
unbiased and have minimum variance (both accurate and precise).
The sample mean and proportion are unbiased and minimum variance
estimators of the population mean and proportion.