7.1 Sampling Distributions (Page 1 of 14)
7.1 Sampling Distributions
Populations and Samples
A population is a set of measurements (conceptual or existing)
and a sample is a subset of those measurements. For example,
a. the population of ages of all people in Colorado;
b. the population of weights of all students in your school;
c. the population count of all antelope in Wyoming.
Simple Random Sample
A simple random sample of n measurements from a population is
one selected in such a manner that
1. every sample of size n from the population has equal
probability of being selected, and
2. every member of the population has equal probability of being
included in the sample.
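The definition above can be illustrated with a minimal sketch, assuming the population is held in a Python list. The helper name `simple_random_sample` and the made-up ages are hypothetical and not part of the notes; `random.sample` draws without replacement so that every subset of size n is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every size-n subset is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: ages of 1,000 people (made-up values for illustration).
rng = random.Random(0)
ages = [rng.randint(0, 90) for _ in range(1000)]

sample = simple_random_sample(ages, 25, seed=1)
sample_mean = sum(sample) / len(sample)      # a statistic: varies from sample to sample
population_mean = sum(ages) / len(ages)      # a parameter: fixed for this population
print(len(sample), sample_mean, population_mean)
```

Changing the seed gives a different sample and usually a different sample mean, while the population mean stays the same.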
Parameters versus Statistics
A parameter is a numerical measure of a population. A statistic is
a numerical measure of a sample. For a given population, a
parameter is fixed, while the value of a statistic may vary from
sample to sample. Statistics estimate the value of population
parameters and provide a basis for us to make inferences about the
parameters.

Measure              Sample Statistic   Population Parameter
Mean                 x̄                  µ
Variance             s²                 σ²
Standard Deviation   s                  σ
Proportion           p̂                  p
Types of Inferences
1. Estimation
2. Testing
3. Regression
A statistical inference is a conclusion about the value of a
population parameter based on information about the
corresponding sample statistic and probability. We will do
estimation, testing and regression.
1. Estimation (Chapter 8): In this type of inference, we
estimate the value of a population parameter.
2. Testing (Chapters 9-11): In this type of inference, we
formulate a decision about the value of a population
parameter.
3. Regression (Chapter 10): In this type of inference we make
predictions or forecasts about the value of a statistical
variable.
Sampling Distribution
A sampling distribution is a probability distribution of a sample
statistic based on all possible simple random samples of the same
size from the same population.
Example 1 – Sampling Distribution for x̄
The Pinedale Children’s fishing pond has a five-fish limit. Every
child catches the limit. The data in the table below are the lengths
of the fish caught by 100 randomly selected children. The mean x̄ of
the 5-fish sample for each child is also tabulated.
a. What makes up the members of the sample?
There are 100 members in the sample: the 100 means (x̄’s) of
the 5-fish samples.
b. What is the sample statistic corresponding to each sample?
The sample statistic of each sample is the sample mean, x̄.
c. What population parameter is being estimated by the sample
statistic x̄? What population parameter are we trying to
make an inference about?
The population mean, µ.
d. What is the sampling distribution?
Based on a sample size of 5, the sampling distribution is the
probability distribution for the sample statistic x̄. It is given
in the table below, and its graph is given in Figure 7-1.
Class Boundaries (x̄)   Frequency (f)   Relative Frequency (f/n)
8.39-8.76               1               0.01
8.77-9.14               5               0.05
9.15-9.52               10              0.10
9.53-9.90               19              0.19
9.91-10.28              27              0.27
10.29-10.66             18              0.18
10.67-11.04             12              0.12
11.05-11.42             5               0.05
11.43-11.80             3               0.03
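As a rough check on the table, the sketch below recomputes the relative frequencies and estimates the mean of the x̄ values from the grouped data. It assumes each class runs from its listed lower limit to 0.37 above it, so that the midpoint is the lower limit plus 0.185; the midpoint method is an illustration, not something the notes prescribe.

```python
# Lower class limits and frequencies as listed in the table.
lower_limits = [8.39, 8.77, 9.15, 9.53, 9.91, 10.29, 10.67, 11.05, 11.43]
frequencies = [1, 5, 10, 19, 27, 18, 12, 5, 3]
class_width = 0.38  # spacing between consecutive lower limits

n = sum(frequencies)                          # number of sample means tabulated
relative = [f / n for f in frequencies]       # f/n column

# Assumed midpoints: halfway between the lower limit and the upper limit (lower + 0.37).
midpoints = [lo + (class_width - 0.01) / 2 for lo in lower_limits]

# Grouped-data estimate of the mean of the x-bar distribution.
approx_mean = sum(m * rf for m, rf in zip(midpoints, relative))
print(n, round(sum(relative), 2), round(approx_mean, 2))
```

The estimate lands close to the center of the tallest class, which is where the histogram in Figure 7-1 would peak.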
7.2 The Central Limit Theorem (Page 5 of 14)
7.2 Central Limit Theorem (Part 1)
Let x be a random variable with a normal distribution whose mean
is µ and standard deviation is σ. Let x̄ be the sample mean
corresponding to a random sample of size n taken from the
x-distribution. Then the following are true:
(a) The x̄-distribution is a normal distribution.
(b) The mean of the x̄-distribution is the population mean (µ).
(c) The standard error is the standard deviation of the
x̄-distribution, which is σ_x̄ = σ/√n.
Thus, if x has a normal distribution, then the x̄-distribution is a
normal distribution for any sample size n, with
µ_x̄ = µ and σ_x̄ = σ/√n.
Example 2
Suppose a team of biologists has been studying the Pinedale
Children’s fishing pond of Example 1 in Section 7.1. Let x
represent the length of a single trout taken at random from the
pond. Assume x has a normal distribution with mean µ = 10.2 in. and
standard deviation σ = 1.4 in.
(a) What is the probability that a single trout taken at random
from the pond is between 8 and 12 inches long?
(b) What is the probability that the mean length x̄ of 5 trout
taken at random is between 8 and 12 inches?
(c) Explain the difference between (a) and (b).
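One way to work parts (a) and (b) numerically is sketched below, using `statistics.NormalDist` from the Python standard library in place of a calculator's normal-probability function. The variable names are illustrative; the only difference between the two computations is the standard deviation, σ for a single trout versus σ/√n for the mean of n = 5 trout.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10.2, 1.4, 5

single_trout = NormalDist(mu, sigma)              # distribution of x
mean_of_five = NormalDist(mu, sigma / sqrt(n))    # distribution of x-bar (standard error)

# (a) P(8 < x < 12) for one trout
p_single = single_trout.cdf(12) - single_trout.cdf(8)
# (b) P(8 < x-bar < 12) for the mean of 5 trout
p_mean = mean_of_five.cdf(12) - mean_of_five.cdf(8)

print(f"single trout: {p_single:.4f}")
print(f"mean of 5:    {p_mean:.4f}")
```

The second probability is larger because the x̄-distribution has the same center but a smaller spread, which is the point of part (c).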
Central Limit Theorem (Part 2)
If x has any distribution with mean µ and standard deviation σ,
then, as n increases without limit, the sample mean x̄, based on a
random sample of size n, will have a distribution that approaches a
normal distribution with mean µ_x̄ = µ and standard deviation
(standard error) σ_x̄ = σ/√n.
That is, no matter what the original distribution of x, as the sample
size n gets larger and larger, the distribution of sample means
(x̄’s) will approach a normal distribution.
Empirically, it has been found that in the vast majority of cases
a sample size of thirty or more (n ≥ 30) will yield an
x̄-distribution that is approximately normal. The larger n gets, the
more nearly normal the x̄-distribution will become. Thus, for any
sample of size n ≥ 30 from any x-distribution, the x̄-distribution is
approximately normal with µ_x̄ = µ and σ_x̄ = σ/√n.
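The theorem can be checked empirically with a small simulation. The sketch below is an illustration under stated assumptions: it picks an exponential distribution (strongly skewed, with mean and standard deviation both equal to 2) as the x-distribution, which is a choice made here and not an example from the notes, and draws many samples of size n = 30.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)      # fixed seed so the run is repeatable
mu = sigma = 2.0             # an exponential distribution with mean 2 also has sd 2
n, trials = 30, 20000

# One sample mean per trial, each from a fresh random sample of size n.
sample_means = [
    sum(rng.expovariate(1 / mu) for _ in range(n)) / n
    for _ in range(trials)
]

print("mean of sample means:", round(mean(sample_means), 3), "vs mu =", mu)
print("sd of sample means:  ", round(stdev(sample_means), 3),
      "vs sigma/sqrt(n) =", round(sigma / sqrt(n), 3))
```

Even though individual x values are far from normal, the center and spread of the simulated x̄ values should sit close to µ and σ/√n.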
Guided Exercise 2
(a) Suppose x has a normal distribution with a mean of 18 and a
standard deviation of 3. If we draw random samples of size 5
from the x-distribution and x̄ represents the sample mean,
what can be said about the x̄-distribution?
(b) Suppose the x-distribution has a mean of 75 and a standard
deviation of 12. If we draw random samples of size 30 from
the x-distribution and x̄ represents the sample mean, what can
be said about the x̄-distribution?
(c) Suppose you did not know if x had a normal distribution.
Would you be justified in saying that the x̄-distribution is
approximately normal if the sample size is n = 8?
Example 3
A strain of bacteria occurs in all raw milk. Let x be the bacteria
count per milliliter of milk. The health department has found that
if the milk is not contaminated, then the x-distribution is
approximately normal. The mean of the x-distribution is 2500, and
the standard deviation is 300. In a large dairy, an inspector takes 42
random samples of milk each day and averages the bacteria count
of the 42 samples to obtain x̄.
(a) Assuming the milk is not contaminated, what is (describe) the
distribution of x̄?
(b) Assuming the milk is not contaminated, what is the probability
that the average bacteria count for one day is between 2350
and 2650 bacteria per milliliter?
(c) Suppose one day the x̄ for 42 samples was not between 2350
and 2650. Would you pass the sample? Why or why not?
Guided Exercise 3
Tunnels are often used instead of long roads over high passes.
However, too many vehicles in a tunnel at the same time can be
hazardous due to emissions. If x represents the time for a vehicle to
go through a tunnel, it is known that the x-distribution is normal
with a mean of 12.1 minutes and a standard deviation of 3.8 minutes.
Safety engineers have said that vehicles should spend between 11 and
13 minutes in the tunnel. Under ordinary conditions about 50 vehicles
are in the tunnel at a time.
a. What is the probability that the time for one vehicle (x) to pass
through the tunnel is between 11 and 13 minutes?
b. What is the probability that the mean time for 50 vehicles (x̄)
to pass through the tunnel will be between 11 and 13 minutes?
c. Comment on the results.
Example A
Suppose x = a person’s I.Q. score. The x-distribution is normal
with a mean of 100 and standard deviation of 14.
a. What is the probability that a person selected at random has an
I.Q. (x) between 96 and 104?
b. If a random sample of size n = 5 is drawn, find the probability
that the mean (x̄) of the sample will be between 96 and 104.
c. Repeat part b for n = 10, 15, 20, and 30.

n     µ_x̄    σ_x̄      P(96 ≤ x̄ ≤ 104)   P(x̄ ≥ 108)
1     100    14       0.2249            ____
5     100    6.2610   0.4771            ____
10    100    ____     ____              ____
15    100    ____     ____              ____
20    100    ____     ____              ____
30    100    ____     ____              ____
[Figure: How the x̄-distribution changes as the sample size increases. Curves shown for n = 1, n = 5, and n = 30; horizontal axis: x = IQ score, x̄ = mean IQ score for sample size n.]
d. As the sample size n increases, what happens to:
i. µ_x̄
ii. σ_x̄
iii. The probability of an x̄ occurring in an interval
containing µ?
iv. The probability of an x̄ occurring in an interval
not containing µ (i.e., in the tails)?

n     σ_x̄      P(96 ≤ x̄ ≤ 104)   P(x̄ ≥ 108)
1     14       0.2249            0.2839
5     6.2610   0.4771            0.1007
10    4.4272   0.6337            0.0354
15    3.6148   0.7315            0.0134
20    3.1305   0.7987            0.0053
30    2.5560   0.8824            0.0009
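The table for Example A can be regenerated with a short sketch, assuming the two probability columns are P(96 ≤ x̄ ≤ 104) and P(x̄ ≥ 108) and using `statistics.NormalDist` from the Python standard library. The helper name `row` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100, 14   # I.Q. scores: normal with mean 100 and standard deviation 14

def row(n):
    """Return (standard error, P(96 <= x-bar <= 104), P(x-bar >= 108)) for sample size n."""
    se = sigma / sqrt(n)
    dist = NormalDist(mu, se)
    p_middle = dist.cdf(104) - dist.cdf(96)   # interval containing mu
    p_tail = 1 - dist.cdf(108)                # interval in the upper tail
    return se, p_middle, p_tail

table = {n: row(n) for n in (1, 5, 10, 15, 20, 30)}
for n, (se, p_middle, p_tail) in table.items():
    print(f"{n:>2}  {se:7.4f}  {p_middle:.4f}  {p_tail:.4f}")
```

As n grows, the standard error shrinks, the probability of landing in the interval around µ rises, and the tail probability falls toward zero, which answers part d.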
Exercise 7.2 #17
Let x be a random variable that represents the checkout time in
minutes at the express line at a grocery store. Based on extensive
surveys the mean of the x-distribution is 2.7 minutes with a
standard deviation of 0.6 minutes. What is the probability that the
total checkout time of the next 30 customers is less than 90
minutes? Answer the question in the following steps.
a. Let x_i (i = 1, 2, 3, . . . , 30) be the checkout times for the next
30 customers and w = x_1 + x_2 + x_3 + ... + x_30. Explain why we
are being asked to compute P(w < 90).
b. Show that w < 90 is equivalent to x̄ = (w/30) < 3.
c. What three things does the central limit theorem say about the
x̄-distribution?
d. Compute P(x̄ < 3) = P(w < 90).
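The equivalence in parts b and d can be checked numerically. The sketch below assumes the 30 checkout times are independent, so the total w has mean nµ and standard deviation σ√n, and it relies on n = 30 being large enough for the normal approximation; both are assumptions stated here for the illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.7, 0.6, 30

# Sample mean: mean mu, standard error sigma / sqrt(n).
x_bar = NormalDist(mu, sigma / sqrt(n))
# Total w = n * x-bar: mean n * mu, standard deviation sigma * sqrt(n).
w = NormalDist(n * mu, sigma * sqrt(n))

p_mean = x_bar.cdf(3)     # P(x-bar < 3)
p_total = w.cdf(90)       # P(w < 90)
print(f"P(x-bar < 3) = {p_mean:.4f}, P(w < 90) = {p_total:.4f}")
```

Both computations standardize to the same z value, so the two probabilities agree.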
7.3 Sampling Distributions of Binomial Distributions (Page 12 of 14)
7.3 Sampling Distributions for Proportions
6.4 Normal Approximation to a Binomial Distribution
If np > 5 and nq > 5, then the binomial random variable r has a
distribution that is approximately normal. The mean and standard
deviation of the normal distribution are estimated by
µ = np and σ = √(npq).
7.3 Sampling Distributions for Proportions p̂ = r/n
If np > 5 and nq > 5, then the random variable p̂ = r/n can be
approximated by a normal random variable x whose distribution
has mean µ_p̂ = p and standard deviation σ_p̂ = √(pq/n). The
continuity correction to convert from a p̂ interval to an x
interval is to add 0.5/n to the right endpoint and subtract 0.5/n from
the left endpoint.
Example 4
Suppose n = 25 and the r interval is from 10 to 15. Then the p̂
interval is from 10/25 = 0.40 to 15/25 = 0.60. Find the continuity
correction and convert the p̂ interval to an x interval.
Interval on Discrete random variable r: 10 to 15
Interval on Continuous random variable p̂: 0.40 to 0.60
The continuity correction for r is _____________
Interval on Continuous random variable x: _____ to _____
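The conversion described for Example 4 can be sketched as a small helper. The function name `p_hat_to_x_interval` is hypothetical; it applies the rule stated in the notes, subtracting 0.5/n from the left endpoint and adding 0.5/n to the right endpoint of the p̂ interval.

```python
def p_hat_to_x_interval(r_low, r_high, n):
    """Convert an interval on r (counts) to the continuity-corrected x interval for p-hat."""
    correction = 0.5 / n                  # continuity correction on the proportion scale
    p_low, p_high = r_low / n, r_high / n
    return p_low - correction, p_high + correction

low, high = p_hat_to_x_interval(10, 15, 25)
print(round(low, 2), round(high, 2))
```

With n = 25 the correction is 0.5/25 = 0.02, so the p̂ interval from 0.40 to 0.60 widens by 0.02 on each side.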
Example 5
The annual crime rate in the Capitol Hill neighborhood of Denver
is 111 victims per 1000 residents. The Arms is an apartment
building in this neighborhood that has 50 residents. Let r be the
random variable that represents the number of victims of crime that
reside in the Arms.
a. What is the probability p that a resident in this
neighborhood will be a victim of crime next year?
   S =
   p = r/n =
b. Can we approximate the random variable p̂ = r/n
with a normal distribution? Explain.
   np > 5
   nq > 5
c. What are the mean and standard deviation of the p̂
distribution?
   µ_p̂ = p
   σ_p̂ = √(pq/n)
d. What is the probability that between 40% and 60% of the
Arms residents will be victims of crime next year?
   P(0.40 ≤ p̂ ≤ 0.60) = P(____ ≤ x ≤ ____) = normalcdf(________)
e. What is the probability that between 10% and 20% of the
Arms residents will be victims of crime next year?
   P(0.10 ≤ p̂ ≤ 0.20) = P(____ ≤ x ≤ ____) = normalcdf(________)
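The steps of Example 5 can be sketched in code, with `statistics.NormalDist` standing in for the calculator's normalcdf. The helper `prob_between` is a hypothetical name; it applies the continuity correction of 0.5/n to each endpoint before using the normal approximation with mean p and standard deviation √(pq/n).

```python
from math import sqrt
from statistics import NormalDist

n, p = 50, 111 / 1000        # 50 residents; crime rate of 111 victims per 1000 residents
q = 1 - p

# Part b: the normal approximation is considered appropriate when np > 5 and nq > 5.
normal_ok = n * p > 5 and n * q > 5

# Part c: approximate distribution of p-hat.
p_hat = NormalDist(p, sqrt(p * q / n))

def prob_between(lo, hi):
    """P(lo <= p-hat <= hi) using the continuity-corrected x interval."""
    c = 0.5 / n
    return p_hat.cdf(hi + c) - p_hat.cdf(lo - c)

p_d = prob_between(0.40, 0.60)   # part d
p_e = prob_between(0.10, 0.20)   # part e
print(normal_ok, p_d, p_e)
```

Part d comes out essentially zero because 40% is more than six standard deviations above p, while part e covers the region just around and above p.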
Guided Exercise 4
The ethnic profile of Denver is 42% minorities and 58%
Caucasian. The city recently hired 56 maintenance workers, but
only 27% were minorities. Civil rights groups complained. What is
the probability that at most 27% of new hires are minorities if the
selection is impartial and the applicant pool reflects the ethnic
profile of Denver? Let S = “new hire is a minority.”
a. Assuming impartiality, what are n, p and q?
   n =
   p =
   q =
b. For the recent sample of new hires, what is the sample
proportion of successes p̂?
   p̂ =
c. Is a normal approximation to the p̂ distribution
appropriate? Explain.
   np > 5
   nq > 5
d. What are the mean and standard deviation of the p̂
distribution?
   µ_p̂ = p
   σ_p̂ = √(pq/n)
e. What is the probability P(p̂ < 0.27)?
   P(p̂ < 0.27) = P(x < ____) = normalcdf(________)
f. What can be concluded?