Statistical Inference Theory
Lesson 28
The CENTRAL LIMIT THEOREM
A population is a collection of numeric data. Each population can be considered a probability sample space with a given distribution.
Inference theory allows one to take an appropriate sample of a given population and, from this sample, make specific judgements about the entire population.
For example, assume you are a reporter on the newspaper of a local community college whose enrollment is twenty thousand, and your assignment is to find the average age $\mu$ of all the enrolled students. There are two ways you could proceed:
1. By some means, find the age of each of the 20,000 students and average their ages. If this can be done, then you would have computed $\mu$.
2. Take a representative sample of, say, 100 students. Ask each of these students their age and record it. From this sample you can compute the average age $\bar{x}$.
If the second method is used, then you will use the value $\bar{x}$ in place of the average $\mu$ of the whole population. Such a process is making an inference about the mean $\mu$ of the whole population, the entire student body.
In order to use $\bar{x}$ as an estimator, we need the central limit theorem, which allows us to examine the distributions of $\bar{x}$ and other distributions.
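The two ways of proceeding in the reporter example can be sketched in Python. This is a minimal illustration on made-up data: the ages in `population` are synthetic (drawn uniformly between 17 and 45, an assumption that does not come from the lesson), and the names `population`, `mu`, and `x_bar` are chosen here only for illustration.

```python
import random
import statistics

rng = random.Random(28)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20,000 student ages (synthetic data, not from the lesson).
population = [rng.randint(17, 45) for _ in range(20_000)]

# Method 1: average every age to get the population mean mu.
mu = statistics.fmean(population)

# Method 2: average a random sample of 100 students to get the sample mean x_bar.
sample = rng.sample(population, 100)
x_bar = statistics.fmean(sample)

print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}")
```

The sample mean will usually land near the population mean without matching it exactly. How far apart the two can be expected to fall is the question the central limit theorem answers.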
28.1 - What is the Central Limit Theorem for $\bar{x}$?
Let {X_k} be a sequence of mutually independent random variables (see the Supplementary Problems in Lesson 15 for the definition) with a common distribution, generated from a sample of size n drawn from a population. Suppose that $\mu = E(X_k)$ and $\sigma^2 = \mathrm{Var}(X_k)$ are finite.
Define the random variable:
$$\bar{x} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
The central limit theorem states the following about the distribution of the random variable $\bar{x}$:
1. For a large sample ($n \ge 30$), $\bar{x}$ is approximately normally distributed.
2. The mean of $\bar{x}$ is $\mu_{\bar{x}} = \mu$.
3. The standard deviation of $\bar{x}$, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, is called the standard error of the mean.
4. If $\sigma$ is known, $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is approximately normally distributed with mean 0 and standard deviation 1.
5. If $\sigma$ is not known and s is the standard deviation of the sample, we use s in place of $\sigma$ and $s_{\bar{x}} = s/\sqrt{n}$ in place of $\sigma_{\bar{x}}$, and $z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ is approximately normally distributed with mean 0 and standard deviation 1.
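The statements of the theorem can be checked empirically with a small simulation. The sketch below assumes an exponential population with rate 1 (so its mean and standard deviation are both 1). That choice, the number of trials, and all variable names are illustrative assumptions, not part of the lesson.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 36          # sample size, "large" in the lesson's sense (n >= 30)
trials = 5_000  # number of repeated samples (arbitrary choice)

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# Draw many samples of size n and record each sample mean.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to mu
sd_of_means = statistics.pstdev(sample_means)   # should be close to sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

# Standardized sample means should behave roughly like a standard normal,
# so about 95% of them should fall between -1.96 and 1.96.
z_values = [(m - mu) / standard_error for m in sample_means]
share_within = sum(-1.96 <= z <= 1.96 for z in z_values) / trials

print(mean_of_means, sd_of_means, standard_error, share_within)
```

Even though the exponential population is strongly skewed, the sample means cluster around the population mean with a spread close to the standard error.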
28.1 - Example 1: Past records of the student body at a large university show that the mean age is $\mu = 23.5$ years with a standard deviation $\sigma = 3.1$ years. A sample of n = 100 students is taken at random. We define $\bar{x}$ to be the average age of the sample. Find
(a). $\sigma_{\bar{x}}$.
(b). the probability that the average age $\bar{x}$ of this sample is at least 24 years.
(c). the probability that the average age $\bar{x}$ is between 23.1 and 24.1 years.
(d). the probability that the average age $\bar{x}$ is at most 23 years.
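Solutions:
(a).
We are given that $\sigma = 3.1$ and the sample size taken is n = 100.
From the Central Limit Theorem, we have $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 3.1/\sqrt{100} = 0.31$.
(b).
Step 1: We use the formula $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$ to find the area under the normal distribution curve for $\bar{x} = 24$, with $\mu = 23.5$ and $\sigma_{\bar{x}} = 0.31$.
fig. 1
Step 2: $z = \frac{24 - 23.5}{0.31} \approx 1.61$
Step 3: From the normal distribution tables, $P\{\bar{x} \ge 24\} = 0.5 - 0.4463 = 0.0537$.
(c).
We use the formula $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$.
fig. 2
Step 1: For $\bar{x} = 23.1$, $z = \frac{23.1 - 23.5}{0.31} \approx -1.29$.
Step 2: For $\bar{x} = 24.1$, $z = \frac{24.1 - 23.5}{0.31} \approx 1.94$.
fig. 3
From the normal distribution table, the area is
$P\{23.1 \le \bar{x} \le 24.1\} = 0.4015 + 0.4738 = 0.8753$.
(d).
We use the formula $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$.
For $\bar{x} = 23$, $z = \frac{23 - 23.5}{0.31} \approx -1.613$.
fig. 4
From the table, $P\{\bar{x} \le 23\} = 0.5 - 0.4463 = 0.0537$.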
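The table lookups in Example 1 can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed table, so the results may differ from the table values in the third or fourth decimal place because z is not rounded to two decimals. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 23.5, 3.1, 100

se = sigma / math.sqrt(n)      # part (a): standard error of the mean
x_bar = NormalDist(mu, se)     # approximate distribution of the sample mean

p_b = 1 - x_bar.cdf(24)                  # (b) P{x_bar >= 24}
p_c = x_bar.cdf(24.1) - x_bar.cdf(23.1)  # (c) P{23.1 <= x_bar <= 24.1}
p_d = x_bar.cdf(23)                      # (d) P{x_bar <= 23}

print(f"se = {se:.2f}, (b) {p_b:.4f}, (c) {p_c:.4f}, (d) {p_d:.4f}")
```

Parts (b) and (d) agree because 24 and 23 lie the same distance on either side of the mean.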
28.1 - Example 2: A local fish packing company packs 50 gallon containers with 100 pounds of fish. Assume each month a government agency randomly selects 49 containers and computes the average weight. If the average of these containers is less than 100 pounds, then the company is fined. Find the value $\mu$ the company should strive for to assure that they will not be fined more than 2% of the time. Assume a standard deviation $\sigma = 5$ pounds.
Solution:
Step 1: To solve for $\mu$, we use the formula $\mu = \bar{x} - z\,\sigma_{\bar{x}}$.
Step 2: $\bar{x} = 100$
Step 3: $\sigma_{\bar{x}} = 5/\sqrt{49} \approx 0.71$
fig. 5
Step 4: From the figure, we need to look up the area 0.48 from the normal distribution table: z = -2.05 .
Step 5: $\mu = 100 - (-2.05)(0.71) \approx 100 + 1.46 = 101.46$ pounds.
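As a hedged cross-check of Example 2, the target mean can be computed with the inverse normal CDF instead of a table lookup. The names `fine_limit`, `max_fine_rate`, and `mu_target` are illustrative. Because z is not rounded here, the result differs slightly from the hand calculation.

```python
import math
from statistics import NormalDist

sigma, n = 5.0, 49
fine_limit = 100.0      # the company is fined when the sample average falls below this
max_fine_rate = 0.02    # allowed chance of being fined

se = sigma / math.sqrt(n)                  # standard error of the mean
z = NormalDist().inv_cdf(max_fine_rate)    # lower-tail z value, about -2.05
mu_target = fine_limit - z * se            # mu = x_bar - z * se

# Check: with the mean set to mu_target, how often is the average below 100?
fine_rate = NormalDist(mu_target, se).cdf(fine_limit)

print(f"z = {z:.2f}, mu_target = {mu_target:.2f}, fine rate = {fine_rate:.3f}")
```

The final check runs the calculation in the forward direction and recovers the 2% rate, which confirms the sign convention in the formula.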
28.1 - Example 3: The American Bubble Company recently purchased a new machine to fill bottles with 16 ounces of spring water. To check if the machine is filling a proper amount of water, they sample 100 bottles each hour. If the average fill of these bottles is less than c* ounces, then the machine is stopped and adjusted. Assuming $\sigma = 0.5$ ounces, find c* so that the chance the machine is stopped when properly functioning is 0.01 .
Solution:
Step 1: To solve for c*, we use the formula $c^* = \mu + z\,\sigma_{\bar{x}}$.
Step 2: $\mu = 16$
Step 3: $\sigma_{\bar{x}} = 0.5/\sqrt{100} = 0.05$
fig. 6
Step 4: From the figure, we need to look up the area 0.49 from the normal distribution table: z = -2.33 .
Step 5: $c^* = \mu + z\,\sigma_{\bar{x}} = 16 - 2.33(0.05) \approx 15.88$ ounces.
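The cutoff in Example 3 can be sketched the same way. The code below follows the formula of Step 1 with z taken from the inverse normal CDF. The names `alpha`, `c_star`, and `false_stop` are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 16.0, 0.5, 100
alpha = 0.01                      # allowed chance of stopping a properly working machine

se = sigma / math.sqrt(n)         # standard error of the mean, 0.05 ounces
z = NormalDist().inv_cdf(alpha)   # lower-tail z value, about -2.33
c_star = mu + z * se              # c* = mu + z * se

# Check: chance that the hourly average falls below c* when mu = 16.
false_stop = NormalDist(mu, se).cdf(c_star)

print(f"z = {z:.2f}, c* = {c_star:.2f} ounces, false stop rate = {false_stop:.3f}")
```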
Solved Problems
28.1 - Solved Problem 1: The average life of 100 watt light bulbs produced by a company is $\mu$ = 1,890 hours with a standard deviation $\sigma$ = 150 hours. A sample of n = 400 of these bulbs is selected at random. Find
(a). $\sigma_{\bar{x}}$.
(b). the probability that the average life of this sample is at most 1,900 hours.
(c). the probability that the average life is between 1,900 and 2,000 hours.
(d). the probability that the average life is greater than 1,950 or less than 1,875 hours.
Solutions:
(a).
We are given that $\sigma$ = 150 and the sample size taken is n = 400. From the Central Limit Theorem, we have $\sigma_{\bar{x}} = 150/\sqrt{400} = 7.5$ .
(b).
Step 1: We use the formula $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$ to find the area under the normal distribution curve for figure 7, with $\bar{x}$ = 1,900, $\mu$ = 1,890 and $\sigma_{\bar{x}}$ = 7.50.
fig. 7
Step 2: $z = \frac{1900 - 1890}{7.5} \approx 1.33$
Step 3: From the normal distribution tables, $P\{\bar{x} \le 1900\} = P\{z \le 1.33\} = 0.4082 + 0.5 = 0.9082$ .
(c).
From the normal distribution table,
Step 1: For $\bar{x}$ = 1900, $z = \frac{1900 - 1890}{7.5} \approx 1.33$.
fig. 8
Step 2: For $\bar{x}$ = 2000, $z = \frac{2000 - 1890}{7.5} \approx 14.67$.
Step 3: $P\{1900 \le \bar{x} \le 2000\} = P\{1.33 \le z \le 14.67\} = 0.5 - 0.4082 = 0.0918$
(d).
We use the formula $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$.
Step 1: For $\bar{x}$ = 1950, $z = \frac{1950 - 1890}{7.5} = 8$.
Step 2: For $\bar{x}$ = 1875, $z = \frac{1875 - 1890}{7.5} = -2$.
fig. 9
$P\{\bar{x} \ge 1950\} + P\{\bar{x} \le 1875\} = P\{z \ge 8\} + P\{z \le -2\} \approx 0 + 0.5 - 0.4772 = 0.0228$
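The steps of Solved Problem 1 can be mirrored in code by standardizing each sample mean first and then looking up the standard normal CDF, much as one does with a printed table. This is a sketch under the problem's stated values. The helper `z_score` and the other names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1890.0, 150.0, 400
se = sigma / math.sqrt(n)     # part (a): standard error, 7.5 hours
std_normal = NormalDist()     # standard normal: mean 0, standard deviation 1


def z_score(x_bar):
    """Standardize a sample mean using the standard error."""
    return (x_bar - mu) / se


# (b) P{x_bar <= 1900}
p_b = std_normal.cdf(z_score(1900))
# (c) P{1900 <= x_bar <= 2000}
p_c = std_normal.cdf(z_score(2000)) - std_normal.cdf(z_score(1900))
# (d) P{x_bar >= 1950} + P{x_bar <= 1875}
p_d = (1 - std_normal.cdf(z_score(1950))) + std_normal.cdf(z_score(1875))

print(f"(b) {p_b:.4f}  (c) {p_c:.4f}  (d) {p_d:.4f}")
```

The upper-tail term in (d) corresponds to z = 8 and is numerically negligible, which is why the hand solution treats it as 0.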
28.1 - Solved Problem 2: A machine is filling 1,000 cans hourly with 16 ounces of coffee. Each hour, a sample of 200 cans is randomly selected and checked for weight. If the average weight of these 200 cans is more than 16 ounces, the machine is stopped and adjusted. Assume a standard deviation $\sigma = 1.5$ ounces. To what value $\mu$ should the company set the machine to assure that the process will be stopped no more than 5% of the time?
Solution:
Step 1: To solve for $\mu$, we use the formula $\mu = \bar{x} - z\,\sigma_{\bar{x}}$.
Step 2: $\bar{x} = 16$
fig. 10
Step 3: $\sigma_{\bar{x}} = 1.5/\sqrt{200} \approx 0.106$
Step 4: From the figure, we need to look up the area 0.45 from the normal distribution table: z = 1.64 .
Step 5: $\mu = 16 - (1.64)(0.106) \approx 16 - 0.17 = 15.83$ ounces.
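One way to build confidence in the setting found in Solved Problem 2 is to simulate the hourly check. The sketch below computes the setting from the formula and then runs a Monte Carlo check. It assumes that individual can weights are normally distributed, which the problem does not state, and the number of simulated hours and all names are illustrative choices.

```python
import math
import random
import statistics
from statistics import NormalDist

sigma, n = 1.5, 200
limit = 16.0          # the machine is stopped when the sample average exceeds this
max_stop_rate = 0.05

se = sigma / math.sqrt(n)                       # about 0.106
z = NormalDist().inv_cdf(1 - max_stop_rate)     # upper-tail z value, about 1.64
mu_setting = limit - z * se                     # mu = x_bar - z * se

# Monte Carlo check. Assumption: can weights are normal(mu_setting, sigma).
rng = random.Random(1)
hours = 4_000
stops = 0
for _ in range(hours):
    avg = statistics.fmean([rng.gauss(mu_setting, sigma) for _ in range(n)])
    if avg > limit:
        stops += 1
stop_rate = stops / hours

print(f"mu_setting = {mu_setting:.2f} ounces, simulated stop rate = {stop_rate:.3f}")
```

The simulated stop rate should land near 5%, with some run-to-run scatter that shrinks as `hours` grows.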
28.1 - Solved Problem 3: The American Bubble Company recently purchased a new machine to fill bottles with 16 ounces of spring water. To check if the machine is filling a proper amount of water, they sample 100 bottles each hour. If the average fill of these bottles is more than c* ounces, then the machine is stopped and adjusted. Assuming $\sigma = 0.5$ ounces, find c* so that the chance the machine is stopped when properly functioning is 0.03.
Solution:
Step 1: To solve for c*, we use the formula $c^* = \mu + z\,\sigma_{\bar{x}}$.
fig. 11
Step 2: $\mu = 16$
Step 3: $\sigma_{\bar{x}} = 0.5/\sqrt{100} = 0.05$
Step 4: From the figure, we need to look up the area 0.47 from the normal distribution table: z = 1.88 .
Step 5: $c^* = \mu + z\,\sigma_{\bar{x}} = 16 + 1.88(0.05) = 16 + 0.094 = 16.094$ ounces.
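The choice of c* in Solved Problem 3 trades off false stops against sensitivity. The sketch below computes the upper cutoff for the problem's rate of 0.03 and, for comparison, for two other rates picked here purely for illustration. The helper name `upper_cutoff` is hypothetical.

```python
import math
from statistics import NormalDist

mu, sigma, n = 16.0, 0.5, 100
se = sigma / math.sqrt(n)     # standard error, 0.05 ounces
x_bar = NormalDist(mu, se)    # distribution of the hourly average when mu = 16


def upper_cutoff(false_stop_rate):
    """Return c* with P{x_bar > c*} = false_stop_rate for a properly working machine."""
    return x_bar.inv_cdf(1 - false_stop_rate)


for rate in (0.01, 0.03, 0.05):
    print(f"false stop rate {rate:.2f}: c* = {upper_cutoff(rate):.3f} ounces")
```

A smaller tolerated false stop rate pushes c* further above 16 ounces, so the machine is stopped less often but a genuine overfill has to be larger before it is caught.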
Unsolved Problems with Answers
28.1 - Problem 1: A machine bores holes averaging 1 cm in a metal plate, with a standard deviation of 0.01 cm. A sample of 100 plates is taken. Find
(a). $\sigma_{\bar{x}}$.
(b). the probability that the average size hole for this sample is greater than 1.002 cm.
(c). the probability that the average size hole for this sample is between 1.002 and 1.003 cm.
(d). the probability that the average size hole for this sample is between 0.999 and 1.003 cm.
Answers:
(a). $\sigma_{\bar{x}} = 0.001$
(b). 0.0228
(c). 0.0215
(d). 0.84
Refer back to 28.1 - Example 1 & 28.1 - Solved Problem 1.
28.1 - Problem 2: A local fish packing company packs 50 gallon containers with 100 pounds of fish. Assume each month, the company randomly selects 36 containers and computes the average weight. If the average of these containers is more than 100 pounds, then the company has to repack the containers. Find the value $\mu$ that will cause the company to repack 10% of the time. Assume a standard deviation $\sigma = 6$ pounds.
Answer:
$\mu = 98.72$ pounds
Refer back to 28.1 - Example 2 & 28.1 - Solved Problem 2.
28.1 - Problem 3: A fishing company catches all its fish using nets. Government regulations require that the average length of a fish caught is 15 inches. After each catch, the company samples the length of 49 fish from its nets. If the average length is less than c* inches, all the fish are returned to the water. Assume on a given day that the average length of the catch is 15 inches with a standard deviation of 1.4 inches. Find c* so that the chance is only 5% that all the catch will be returned to the water.
Answer:
c* = 14.67 inches
Refer back to 28.1 - Example 3 & 28.1 - Solved Problem 3.
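The listed answers to Problems 1 to 3 can be checked with a few lines of Python. This is a sketch that applies the same normal approximation as the worked examples. The helper `sample_mean_dist` and the variable names are illustrative.

```python
import math
from statistics import NormalDist


def sample_mean_dist(mu, sigma, n):
    """Approximate normal distribution of the sample mean under the CLT."""
    return NormalDist(mu, sigma / math.sqrt(n))


# Problem 1: mu = 1 cm, sigma = 0.01 cm, n = 100
holes = sample_mean_dist(1.0, 0.01, 100)
p1_b = 1 - holes.cdf(1.002)                  # P{x_bar > 1.002}
p1_c = holes.cdf(1.003) - holes.cdf(1.002)   # P{1.002 <= x_bar <= 1.003}
p1_d = holes.cdf(1.003) - holes.cdf(0.999)   # P{0.999 <= x_bar <= 1.003}

# Problem 2: repack when x_bar > 100, which should happen 10% of the time
se2 = 6 / math.sqrt(36)
mu2 = 100 - NormalDist().inv_cdf(0.90) * se2

# Problem 3: return the catch when x_bar < c*, which should happen 5% of the time
c3 = sample_mean_dist(15.0, 1.4, 49).inv_cdf(0.05)

print(f"1(b) {p1_b:.4f}  1(c) {p1_c:.4f}  1(d) {p1_d:.2f}")
print(f"2: mu = {mu2:.2f} pounds   3: c* = {c3:.2f} inches")
```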
Supplementary Problems
1. The records of a local men's health club show that the average lifting weight is 178 pounds. A random sample of 100 club members shows that 40% of the men can lift more than 179 pounds. For all members, find the standard deviation $\sigma$.
2. A computer selects, with replacement, 36 numbers from the set {0, 1, 2, 3, 4, ..., 100}.
a. Using the formula [formula missing], find $\mu$.
b. Using the formula [formula missing], find $\sigma$.
c. For this sample, find the probability $P\{\bar{x} \ge 60\}$.
d. If only one number X is selected at random, find $P\{X \ge 60\}$.
e. Use the Central Limit Theorem to find a sample size N for $\sigma_{\bar{x}} = 4.86$ .
f. Find the smallest sample size where $P\{\bar{x} \ge 60\} = 0.01$ .
3. A college's records show that the grade point average (G.P.A.) of all female students is 2.95 with a standard deviation of 0.2, and that the G.P.A. of all male students is 2.94 with a standard deviation of 0.25. A random sample of 200 female students and 100 male students was taken. Find the probability
a. that the average G.P.A. of the sampled female students and male students is greater than 2.97 .
b. that the average G.P.A. of the sampled female students or male students is greater than 2.97 .
4. The American Bubble Company recently purchased a new machine to fill bottles with 16 ounces of spring water. To check if the machine is filling a proper amount of water, they sample 100 bottles each hour. If the average fill of these bottles is less than 15.85 ounces, then the machine is stopped and adjusted. Assuming $\sigma = 0.7$ ounces, find the probability that over a 5 hour period, the machine will be stopped 1 time.
For any sequence of discrete random variables $X_1, X_2, \ldots, X_n$, we define the joint distribution of any subset $X_i, X_j, \ldots, X_r$ as
$P\{X_i = x_k, X_j = x_w, \ldots, X_r = x_t\} = P[\{X_i = x_k\} \cap \{X_j = x_w\} \cap \cdots \cap \{X_r = x_t\}]$.
5. A fair die is tossed twice. Let $X_1$ equal the outcome on the first toss and $X_2$ the outcome on the second toss.
a. Compute the distribution of $\bar{x}$ by completing the following table (columns: $\bar{x}$, $P\{\bar{x}\}$).
b. Compute $\mu = E(X_1)$, $\mu = E(X_2)$ and $E(\bar{x})$.
c. Compute $\sigma^2$, [formula missing]
d. Show $\sigma_{\bar{x}}^2$ [formula missing]
e. Show $\sigma_{\bar{x}} =$ [formula missing]
f. Compute $P\{X_1 > 3.5\}$ and $P\{\bar{x} > 3.5\}$.
6. Assume a binomial experiment with N independent trials where p is the probability of success on each trial.
a. Show $\mu = Np$.
b. [statement missing]
7. If X and Y are two discrete, independent random variables, show E(XY) = E(X)E(Y).
8. A sequence of mutually independent random variables is called a Bernoulli sequence if $P\{X_k = 1\} = p_k$ and $P\{X_k = 0\} = 1 - p_k = q_k$ (k = 1, 2, ..., N).
a. If $S = X_1 + X_2 + \cdots + X_N$, show $E(S) = p_1 + p_2 + \cdots + p_N$.
b. Show [formula missing].
9. Assume {X_k} (k = 1, ..., n) is a sequence of random variables satisfying the Central Limit Theorem.
a. Show $E(\bar{x}) = \mu$.
b. In lesson 16, problem 13 we showed [formula missing]. Show [formula missing].
10. Assume the following population S = {2, 10}. A sample (with replacement) of size N = 30 is taken from this population where $P\{X_k = 2\} = 1/2$ and $P\{X_k = 10\} = 1/2$ (k = 1, ..., 30).
a. Find $\mu$ and $\sigma^2$.
b. Consider the population of all averages $\bar{x}$ generated by all possible samples. Find the size of this population.
c. List all 31 distinct numbers of this population.
d. Find the distribution of $\bar{x}$ for this population.
e. Find a summation formula for [formula missing].
f. Using the central limit theorem, evaluate the sum in d.
g. Assume a sample of N = 30 is taken. From the distribution of $\bar{x}$, find the probability that $4.9 \le \bar{x} \le 6.8$ .
h. Use the central limit theorem to approximate $P\{4.9 \le \bar{x} \le 6.8\}$.
11. Show the random variable [formula missing] has mean $\mu = 0$ and $\sigma = 1$.
12. Assume s is the standard deviation computed from a sample of size N. Find $\mu$ and $\sigma$ of [formula missing] where [formula missing].
13. In a small European country the law permits a maximum of 4 automobiles per family. Their department of transportation recently did a study and found the following distribution of the number of automobiles owned: 51% of the families own 1 automobile; 23% own 2 automobiles; 17% own 3 automobiles and 9% own 4 automobiles.
Recently 100 families renewed their automobile registrations. Find
a. $\mu$.
b. [symbol missing].
c. For these 100 families, estimate the probability that on average they own at least 2 automobiles.
14. Assume the following game is played: a fair die is tossed once and the resulting value is recorded.
a. Write out the population.
b. Find $\mu$ and $\sigma$.
c. If this game is played 64 times, find the probability that the average score is between 4 and 5.
15. Assume the following game is played: five cards are drawn without replacement from an ordinary deck of cards and the number of diamonds is recorded.
a. Write out the population.
b. Find $\mu$ and $\sigma$.
c. If the game is played 100 times, find the probability that the average number of diamonds drawn is less than 2.