Download p p - Columbia Statistics

Review - Week 8
Read: Chapters 17-19
Review: Sampling distribution model for proportions
Population - Entire group of items/individuals we want information about.
Sample - The part of the population we actually examine in order to gather information.
A parameter is a number that describes the population.
A statistic is a number that describes a sample.
The population distribution is the probability model derived from information on all members of
the population.
Any individual chosen at random from this population will follow the same probability model.
Different random samples give different values for a statistic. The sampling distribution model is
an idealized mathematical description of the results of all possible samples of size n.
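The idea that different random samples give different values of a statistic can be illustrated with a small simulation. The sketch below is only an illustration: the values p = 0.3, n = 200, the number of samples, and the function name are assumptions chosen for the example, not part of the course material.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from a population in which
    a proportion p has the trait, and return each sample's proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Count how many of the n sampled individuals have the trait.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Hypothetical settings, chosen only for illustration.
p_hats = simulate_sample_proportions(p=0.3, n=200, num_samples=2000)

# The sample proportions vary from sample to sample but center near p.
print("first five sample proportions:", p_hats[:5])
print("mean of the sample proportions:", statistics.mean(p_hats))
print("SD of the sample proportions:", statistics.stdev(p_hats))
```

A histogram of `p_hats` would give a rough picture of the sampling distribution for this choice of p and n.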
When the population is at least 10 times as large as the sample and both np ≥ 10 and
n(1 − p) ≥ 10, then the sampling distribution model for the sample proportion p̂ is approximately
$N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$.
Since E(p̂) = p, p̂ is an unbiased estimator of the population proportion p.
As the sample size increases, the standard deviation of p̂ decreases in proportion to $1/\sqrt{n}$.
The standard deviation of the sample proportion is $SD(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$
and the standard error is $SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
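As a rough sketch of how the Normal model for p̂ can be used in practice, the Python snippet below builds the model with the standard library's `statistics.NormalDist`. The function names and the numbers (p = 0.25, n = 300, p̂ = 0.27) are hypothetical and chosen only for illustration. The check that the population is at least 10 times the sample size is not coded, since it needs the population size.

```python
from math import sqrt
from statistics import NormalDist

def proportion_sampling_model(p, n):
    """Approximate Normal model for p-hat: mean p, SD sqrt(p(1-p)/n).
    Checks only the conditions np >= 10 and n(1-p) >= 10."""
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("need np >= 10 and n(1-p) >= 10 for the Normal model")
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

def standard_error(p_hat, n):
    """Standard error: the SD formula with p-hat in place of the unknown p."""
    return sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical values, not taken from the exercises.
model = proportion_sampling_model(p=0.25, n=300)
print("mean:", model.mean)
print("SD:", model.stdev)

# P(p-hat > 0.30) is the area to the right of 0.30 under the model.
prob_above = 1 - model.cdf(0.30)
print("P(p-hat > 0.30):", prob_above)

# When p is unknown, the SE is computed from the observed p-hat instead.
print("SE for p-hat = 0.27:", standard_error(p_hat=0.27, n=300))
```

Probabilities for an interval follow the same pattern: `model.cdf(upper) - model.cdf(lower)`.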
Exercise 1: Suppose a simple random sample of size n = 500 is obtained from a population
where a proportion p = 0.4 has a certain opinion.
(a) Describe the sampling distribution model for the sample proportion p̂.
(b) Calculate P(p̂ > 0.42).
(c) Calculate P(0.39 < p̂ < 0.41).
(d) Redo parts (a) – (c) assuming now that n = 1000.
Exercise 2: Before a local referendum, a newspaper polls 100 voters in a town. Suppose that the
referendum actually has support of 52% of the population of the town.
(a) What are the mean and standard deviation of p̂?
(b) What is the probability that the poll will show less than 50% support for the referendum?
(c) Suppose instead the newspaper chooses to poll 200 voters. What are the new mean and
standard deviation of p̂?
(d) What is the probability that the new poll will show less than 50% support for the
referendum?
(e) Suppose we want to reduce the standard deviation in (c) by one half. How large a sample
is required to achieve this goal?
Sample Proportion Drill:
For each problem below, describe the sampling distribution model for the sample proportion p̂
and find the interval within which
a. 68% of the values of p̂ are expected to lie.
b. 95% of the values of p̂ are expected to lie.
c. 99.7% of the values of p̂ are expected to lie.
(a) A simple random sample of size n = 500 is obtained from a population where a
proportion p = 0.5 has a certain opinion.
(b) Suppose the sample size for the problem above is n = 1000 .
(c) Suppose the sample size for the problem above is n = 2000 .
(d) A simple random sample of size n = 500 is obtained from a population where a
proportion p = 0.3 has a certain opinion.
(e) Suppose the sample size for the problem above is n = 1000 .
(f) Suppose the sample size for the problem above is n = 2000 .
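The intervals asked for in the drill come from the 68-95-99.7 rule applied to the Normal model for p̂: the mean plus or minus 1, 2, and 3 standard deviations. A minimal sketch, using hypothetical values (p = 0.2, n = 400) that do not appear in the drill and an illustrative function name:

```python
from math import sqrt

def empirical_rule_intervals(p, n):
    """Return the intervals p +/- k*SD for k = 1, 2, 3, where
    SD = sqrt(p(1-p)/n) is the standard deviation of p-hat."""
    sd = sqrt(p * (1 - p) / n)
    levels = [(1, "68%"), (2, "95%"), (3, "99.7%")]
    return {label: (p - k * sd, p + k * sd) for k, label in levels}

# Hypothetical values: SD = sqrt(0.2 * 0.8 / 400) = 0.02.
intervals = empirical_rule_intervals(p=0.2, n=400)
for label, (low, high) in intervals.items():
    print(f"{label} of p-hat values lie between {low:.2f} and {high:.2f}")
```

Rerunning the function with a larger n shows the intervals narrowing while staying centered at p.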
Review: Sampling distribution model for means
The central limit theorem says that the sampling distribution model of the sample mean, ȳ,
computed from a simple random sample is approximately normal for large values of n.
Therefore, the sample mean of a simple random sample of size n drawn from a large population
with mean µ and standard deviation σ has a sampling distribution model that is approximately
$N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$.
Since E(ȳ) = µ, ȳ is an unbiased estimator of the population mean µ.
As the sample size increases, the standard deviation of ȳ decreases in proportion to $1/\sqrt{n}$.
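The central limit theorem can be checked informally by simulation. The sketch below draws repeated samples from a strongly skewed population (an exponential model with mean 1 and standard deviation 1, an assumption made only for this illustration) and looks at how the sample means behave. The sample size n = 50 and the number of samples are likewise hypothetical.

```python
import random
import statistics
from math import sqrt

def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, SD 1) and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means

sample_means = simulate_sample_means(n=50, num_samples=2000)

# The model predicts mean mu = 1 and SD sigma / sqrt(n) = 1 / sqrt(50).
print("mean of sample means:", statistics.mean(sample_means))
print("SD of sample means:", statistics.stdev(sample_means))
print("sigma / sqrt(n):", 1 / sqrt(50))
```

Even though individual values from this population are far from normal, a histogram of `sample_means` should look roughly bell-shaped.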
When the standard deviation of a statistic is estimated from the data, the result is called the
standard error of the statistic.
The standard deviation of the sample mean is $SD(\bar{y}) = \frac{\sigma}{\sqrt{n}}$
and the standard error is $SE(\bar{y}) = \frac{s}{\sqrt{n}}$.
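The difference between the standard deviation and the standard error of the sample mean can be sketched in a few lines of Python. All numbers below (σ = 12, n = 36, µ = 100, and the small data set) are made up for illustration, and the function names are not from the course material.

```python
from math import sqrt
from statistics import NormalDist, stdev

def sd_of_sample_mean(sigma, n):
    """SD of y-bar when the population standard deviation sigma is known."""
    return sigma / sqrt(n)

def se_of_sample_mean(sample):
    """SE of y-bar: the sample standard deviation s replaces sigma."""
    return stdev(sample) / sqrt(len(sample))

# Known sigma (hypothetical): SD(y-bar) = 12 / sqrt(36) = 2.
sd_ybar = sd_of_sample_mean(sigma=12, n=36)
print("SD of y-bar:", sd_ybar)

# With the Normal model N(100, 2), P(y-bar < 97) is the area left of 97.
model = NormalDist(mu=100, sigma=sd_ybar)
prob_below = model.cdf(97)
print("P(y-bar < 97):", prob_below)

# Unknown sigma: estimate it from the data (made-up observations).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se_ybar = se_of_sample_mean(sample)
print("SE of y-bar:", se_ybar)
```

Note that `statistics.stdev` divides by n − 1, which is the usual definition of the sample standard deviation s.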
Exercise 1: Suppose a simple random sample of size n = 50 is obtained from a population with
µ = 40 and standard deviation σ = 6 .
(a) Describe the sampling distribution model of the sample mean ȳ.
(b) Calculate P(ȳ > 41).
(c) Calculate P(39.1 < ȳ < 40.8).
(d) Redo parts (a) – (c) assuming now that n = 100.
Exercise 2: The weight of a bag of potato chips is stated to be 10 ounces. In fact, the amount that
the machine places in each bag is thought to follow a normal model with mean 10.2 ounces and
standard deviation 0.12 ounces.
(a) What is the probability that a bag weighs less than 10 ounces?
(b) Suppose the chips are sold in packs of 3 bags. What is the probability that the mean
weight of the 3 bags is below 10 ounces?
(c) Suppose the chips are sold in packs of 20 bags. What is the probability that the mean
weight of the 20 bags is below 10 ounces?