Download ENTER STATISTICAL INFERENCE In practice parameters are

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
ESTIMATION AND CONFIDENCE INTERVALS
Up to now we assumed that we knew the parameters of the
population.
Example. Binomial experiment
Sampling from a normal population
σ.
knew probability of success p.
knew mean µ and st. dev.
In practice parameters are unknown!
ENTER STATISTICAL INFERENCE
Use
sample
data
Need to estimate parameters
and/or
Test hypotheses/make decisions about parameter values
Estimation of the population mean: σ is known
We will assume here that:
  We have normal population OR a large sample size ( n > 30).
Case 1. Sample from a normal population.
Point estimators for the normal parameters
Point estimator of µ:
Point estimator of variance σ2:
Both estimators are very good. They are unbiased and
have the smallest spread among all unbiased estimators.
CONFIDENCE INTERVALS
Often we seek a range of values – an interval – for the unknown
parameter.
We want to have some confidence that the interval covers the
parameter – CONFIDENCE INTERVAL (CI)
We construct CIs based on sample data and theoretical
knowledge about distributions, like the Central Limit Theorem
or Normal approximation to Binomial.
CONFIDENCE INTERVAL FOR THE MEAN, NORMAL POPULATION:
σ known.
Let
sample from N(µ, σ), µ unknown, σ known.
Definition: The interval (
- zα/2 σ/√n,
+ zα/2 σ/√n) is called
a 95% Confidence Interval for the mean.
Example: A 95% CI for the mean would be:
(
-1.96σ/√n,
+1.96σ/√n)
because for the 95% confidence level, zα/2 is 1.96.
Example
Suppose X, scores on a test, have a normal distribution with
unknown mean and σ = 60. The sample values X1, X2, …, X900
give sample mean
Find a 95% as well as 98%
confidence intervals for the true mean test score.
Solution. Point estimate of µ is
n=900, σ = 60.
For 95% CI, zα/2=1.96, so 95% CI is:
=(272 - 1.96(60/√900, 272 + 1.96(60/√900) = ( 268.08, 275.92).
For 98% CI, zα/2=2.33, so 98% CI is:
=(272 – 2.33(60/√900, 272 + 2.33(60/√900) = ( 267.35, 276.65).
Note that the 98% CI is longer than the 95% CI. We gained
confidence, but we lost accuracy.
What if the population is not normal?
Case 2. If the population is not normal, but the sample size is large,
then by the Central Limit Theorem
has approximately standard normal distribution no matter what the
population is.
So, provided that the sample size is large, we can use the same
confidence intervals for data sets from any population.
This is one of many practical applications of the CLT!
CONFIDENCE INTERVAL FOR MEAN WITH KNOWN σ – choice of
the sample size for a given margin of error.
Let
sample from N(µ, σ), µ unknown, σ known.
The margin of error m= half-length of the CI for µ, m =
on:
 
the confidence level via
(as conf. level
 
the variability in the population σ (as σ
 
the sample size (as n
,m
,m
,m
depends
);
);
).
To decrease the error, but keep the confidence level unchanged, we need to
increase the sample size.
For a (1-α) CI for µ (σ known) to have margin of error m, we need sample size
Example
Suppose
are heights from a normal distribution with
σ=10’’. In a sample of size 16 we obtained
.
a) Find a 95% confidence interval for µ. What is its margin of error?
b) Find the sample size needed for the margin of error to be 3 inches.
Solution. σ =10’’, n=16, m=3.
a) C = 0.95, so α = 0.05, so zα/2=1.96. 95% CI for the mean (sampling
from normal distribution) is:
=(60 +/-1.96(10)/√16) = (55.1, 64.9).
Margin of error=(64.9 - 55.1)/2 = 4.9.
b)
Finally, n = 43.
We need a larger sample size to get smaller error, with the same
confidence level.
Example: Assume that we want to estimate the mean IQ score for
the population of statistics professors. How many statistics
professors must be randomly selected for IQ tests if we want 95%
confidence that the sample mean is within 2 IQ points of the
population mean? Assume that σ = 15, as is found in the general
population.
Solution: z α/ 2 = 1.96, m
= 2, σ
= 15
With a simple random sample of only 217 statistics professors, we
will be 95% confident that the sample mean will be within 2 IQ
points of the true population mean µ.
Example
Randomly selected students from a large university participated in
an experiment to test their ability to determine when 1 min (60
seconds) has passed. Forty students yielded a sample mean of
58.3 sec. Assuming that σ=9.5 sec, construct a 95% confidence
interval for the population mean time for all students in that
university.
Solution: z α/ 2 = 1.96,
= 58.3sec. , σ
=9.5sec.
95% CI for the true mean perception time of 1 min:
The margin of error for this estimate is: 2.9
Example re time perception contd.
How many students should be sampled, so that the margin of error is
less than 1 sec., with the same confidence level?
What if we lower the confidence level to 80%? z α/ 2 = z 0.1 = 1.28
NOTE: Lowering the confidence level allows for a smaller sample size.
CONFIDENCE INTERVAL FOR PROPORTION
Binomial experiment, X = number of successes, n=number of trials,
p=probability of success is unknown.
Goal: Estimate the true/population proportion of successes p in a
Binomial experiment using an interval estimate.
Point estimate:
NOTE: For large sample size n,
Thus,
is normally distributed with
has a standard normal distribution.
LARGE SAMPLE CONFIDENCE INTERVAL FOR THE
PROPORTION
Binomial experiment, X = number of successes, n=number of trials,
p=probability of success.
A C=(1-α) confidence interval for the population proportion p is given by
if the sample size is large.
C= confidence level is the probability/proportion of times 1- α that the
confidence interval actually does contain the population parameter,
assuming that the estimation process is repeated a large number of
times. The confidence level is also called degree of confidence, or the
confidence coefficient.
NOTES: Most common choices are 90%, 95%, or 99%.
C is often expressed as percentage.
EXAMPLE
400 people were asked if they are in favor of death penalty. 250 favored
death penalty. Find a point estimate and a 95% confidence interval for
the true proportion of people who are in favor of death penalty.
Solution. X=number of people in the sample who favor death penalty,
X=250, n=400,
p= true proportion of people in the population who favor death penalty.
Point estimate of p:
95% CI: C=0.95, so zα/2=1.96, and 95% CI for p is:
The interval (0.578, 0.672) covers the true proportion of people in favor of
death penalty with probability 0.95.
SAMPLE SIZE AND MARGIN OF ERROR
Question: What is the sample size required to keep the margin of error
on the level m for a specified confidence level C=(1-α)?
Margin of error = half length of the confidence interval, m =
Solving for n, we get:
Since p is unknown, we have to substitute our best guess for p, say p0.
We can use info from previous surveys for p0. If no previous info
available, use conservative approach and plug 0.5 for p, to get:
EXAMPLE. Death penalty contd.
1. Find the sample size required for the margin of error to be 0.02 for a
90% confidence interval for the proportion of people in favor of
death penalty.
Solution. We have prior info from the survey
n=1586.
p0=0.625. Use it!
Round
UP!
2. Find the sample size required for the margin of error to be 0.02 for a
90% CI for p assuming no prior information about p.
Solution. We have no prior info about p.
Round
UP!
n=1692.
NOTE: Without prior information, we need larger sample size to have
the same margin of error.
CONFIDENCE INTERVAL FOR MEAN WITH UNKNOWN σ
If the population variance σ is unknown, estimate it using the
sample variance S:
Then, replace σ with S in the Z-statistic used for the confidence
interval:
Get t-statistic:
The t-statistic does not have standard normal distribution when n
is small. It has a t distribution with (n-1) degrees of freedom.
The number of degrees of freedom is a parameter of the t
distribution.
NOTE. T-distribution is also called “Student’s t distribution”.
T-distribution
T-distribution has similarities and differences with standard Normal
distribution: symmetric around zero, but it has fatter (heavier)
tails.
As degrees of freedom
increase, t distribution
becomes
indistinguishable from
standard normal. See
last row of t- table
CONFIDENCE INTERVAL FOR MEAN WITH UNKNOWN σ
Following a procedure similar to the one for constructing CI for µ
when σ was known, when data was from a normal population, we
can find a CI for µ when σ is not known. We replace σ with S and
standard normal percentiles with percentiles from t distribution.
Let
be observations from a normal distribution with
mean µ and standard deviation σ, both µ and σ unknown.
A C=(1-α) confidence interval for µ is given by
when the data is from a normal population with σ unknown.
Here tα/2 satisfies P(t(n-1)>tα/2 )=α/2, i.e. tα/2 is a percentile from t(n-1)
distribution.
We use t-table or a statistical computer package (e.g. MINTAB) to
find percentiles of t-distribution: tα/2.
EXAMPLE
A biologist studying brain weights of tigers took a random sample of
16 animals and measured their brain weights in ounces. This data
gave a sample mean of 10 and standard deviation of 3.2 ounces.
Assuming that these weights follow a Normal distribution, find a
95% confidence interval for the true mean tiger brain weight µ.
Solution.
=10, s=3.2, n=16, need 95%CI for µ.
C=0.95, so α=0.05, so α/2=0.025. Since n=16, then n-1=15.
We need t(15)0.025. From the table t(15)0.025=2.131, so the 95% CI for µ is:
10+/- 2.131 (3.2)/√16 = (8.295, 11.705) oz.
We are 95% confident that the true mean weight of a tiger brain is
between 8.295 and 11.705 oz.
Example: A study found the body temperatures of 106 healthy
adults. The sample mean was 98.2 degrees and the sample standard
deviation was 0.62 degrees. Find the margin of error E and the 95%
confidence interval for µ.
Solution: n = 106,
= 98.20o, s = 0.62o, α = 0.05,
α /2 = 0.025, degrees of freedom=105, t α/ 2 = 1.984 , (Table A-3
with df=100)
Margin of error= 1.984 (0.62)/√106 =0.1195
95% CI for the mean temperature:
98.2+/- 1.984 (0.62)/√106 = (98.08, 98.32)
Related documents