Statistical Fundamentals:
Using Microsoft Excel for Univariate and Bivariate Analysis
Alfred P. Rovai
Parameter Estimation
PowerPoint Prepared by
Alfred P. Rovai
Microsoft® Excel® Screen Prints Courtesy of Microsoft Corporation.
Presentation © 2015 by Alfred P. Rovai
Parameter Estimation
Parameter estimation is a way to estimate a population parameter based on measuring a sample. It can be expressed in two ways:
• A point estimate of a population parameter is a single value of a statistic, e.g., the sample mean x̄ is a point estimate of the population mean μ.
• An interval estimate, e.g., a confidence interval, is defined by two numbers between which a population parameter is expected to lie at a specified confidence level.
Point Estimates vs. Interval Estimates
Polling is a common method of estimating population parameters.
• The sample mean x̄ is the best point estimate of the
population mean μ.
• The sample proportion p̂ = x/n, where x is the number of successes
in a random sample of n observations, is the best point estimate
of the population proportion p.
However, point estimates provide no measure of reliability.
Confidence intervals, on the other hand, provide a level of
confidence.
Estimating Confidence Intervals
• A confidence interval is an estimated range of
values that is likely to include an unknown
population parameter.
• Confidence intervals are constructed at a
confidence level, e.g., 95%, selected by the
statistician.
– If a population is sampled repeatedly and an interval
estimate is made on each occasion, the resulting
intervals will contain the true population parameter in
approximately 95% of the cases.
– This example corresponds to hypothesis testing at a 0.05
significance level, that is, α = 0.05.
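The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: it assumes a normal population with a known standard deviation, and the values μ = 50, σ = 10, n = 100, the number of trials, and the random seed are arbitrary choices, not part of the presentation.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(mu=50.0, sigma=10.0, n=100, level=0.95, trials=2000, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    # Two-sided critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - margin < mu < sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95, although it varies slightly with the seed and the number of trials.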
Steps for Calculating the Confidence Interval
for an Unknown Population Parameter
1. Obtain the point estimate of the parameter. This is usually the sample mean or sample proportion.
2. Select a confidence level, e.g., 95% (α = 0.05).
3. Calculate the confidence interval for the unknown population parameter.
Calculating the Confidence Interval (CI) for μ
When σ Is Known
Assumptions
• Population σ and sample x̄ are known.
General formulas
CI = Point Estimate ± Margin of Error (i.e., Sampling Error)
CI = x̄ ± (Critical Value)*(Standard Error)
Calculating formula
CI = x̄ ± C(σ/√n)
or
x̄ − C(σ/√n) < μ < x̄ + C(σ/√n)
where C = critical value for the required CI in standard deviation units (z-scores).
Critical Values
Use the normal distribution to calculate critical values
• 90% CI =NORM.S.INV(1-0.10/2) = 1.645 (90% of the
area of a normal distribution is within 1.645 standard
deviations of the mean).
• 95% CI =NORM.S.INV(1-0.05/2) = 1.96 (95% of the
area of a normal distribution is within 1.96 standard
deviations of the mean).
• 99% CI =NORM.S.INV(1-0.01/2) = 2.58 (99% of the
area of a normal distribution is within 2.58 standard
deviations of the mean).
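For readers working outside Excel, the same critical values can be obtained in Python. This is a minimal sketch that uses the standard library's NormalDist().inv_cdf as a stand-in for NORM.S.INV; the helper name z_critical is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided critical value, analogous to =NORM.S.INV(1 - alpha/2)."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: z = {z_critical(level):.3f}")
```

The 99% value comes out as 2.576, which rounds to the 2.58 shown above.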
Example: 95% CI, n = 100, x̄ = 50, σ = 10
95% CI = 50 ± 1.96*(10/√100) = 50 ± 1.96 = (48.04,
51.96)
Margin of error
=CONFIDENCE.NORM(alpha,standard_dev,size)
=CONFIDENCE.NORM(0.05,10,100) = 1.96
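The example can also be reproduced in Python. The function below is a sketch that mirrors the arguments of CONFIDENCE.NORM; it is a hypothetical helper written for illustration, not an Excel or library API.

```python
from math import sqrt
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Margin of error z * sigma / sqrt(n) for a two-sided interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / sqrt(size)

x_bar = 50  # sample mean from the example
margin = confidence_norm(0.05, 10, 100)
lower, upper = x_bar - margin, x_bar + margin
print(f"margin = {margin:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```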
Example Continued
The margin of error for the previous example is 1.96 units. What is
the required sample size to be 95% confident that the estimate is
within 1 unit of the true mean?
Solution
n = z²σ²/D² = (1.96² × 10²)/1² = 384.16
where D = desired margin of error.
The required sample size is 385.
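The sample-size calculation can be sketched in Python as follows. The function name and its parameters are assumptions made for illustration. The result is rounded up because a fractional observation cannot be collected, and the unrounded critical value is used in place of 1.96, which does not change the answer here.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, level=0.95):
    """Smallest n for which z * sigma / sqrt(n) <= max_error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z * sigma / max_error) ** 2)

print(sample_size_for_mean(sigma=10, max_error=1))
```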
Example: 95% CI, n = 100, σ = 10
We are 95% confident that the true
population mean is between 48.04 and 51.96.
Although we cannot be certain (i.e., 100%
confident) that the true mean is in this
interval, 95% of intervals formed by taking
random samples from the target population in
this manner will contain the true mean.
Calculating the Confidence Interval (CI) for μ
When σ Is Unknown
• If the population standard deviation σ is unknown, use the sample standard
deviation s in calculating the CI.
– This procedure increases uncertainty, since s varies from sample to
sample.
• Use Student's t distribution instead of the normal z distribution to
calculate the margin of error. The t distribution is similar to the normal
distribution except that its heavier tails adjust for smaller sample sizes.
As n becomes large, the t distribution approaches the shape of a normal
distribution.
Margin of Error
=CONFIDENCE.T(alpha,standard_dev,size)
where size = sample size
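The effect of switching from z to t can be illustrated with a short sketch. Python's standard library has no Student's t quantile function, so the critical value is passed in by hand. The value 2.064 is the two-sided 95% t value for 24 degrees of freedom from a standard t table, and the sample values s = 10 and n = 25 are hypothetical.

```python
from math import sqrt

def confidence_t(t_critical, standard_dev, size):
    """Margin of error t * s / sqrt(n).

    t_critical must be looked up for df = size - 1, e.g., in a t table.
    """
    return t_critical * standard_dev / sqrt(size)

# Hypothetical sample: n = 25, s = 10; t = 2.064 for df = 24 at 95%.
t_margin = confidence_t(2.064, 10, 25)
# Same data treated as if sigma were known, using z = 1.96.
z_margin = 1.96 * 10 / sqrt(25)
print(f"t margin = {t_margin:.3f}, z margin = {z_margin:.3f}")
```

The t-based margin is the wider of the two, reflecting the added uncertainty from estimating σ with s.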
Calculating the Confidence Interval (CI) for an
Unknown Population Proportion p
• Sample proportion p̂ = x/n is the best point estimate of the
population proportion p, where x = number of successes in a sample of
size n.
• 95% CI for p
p̂ ± 1.96 √(p̂(1 − p̂)/n)
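The proportion formula can be written as a small Python function. This is a sketch under the usual large-sample normal approximation; the function name and the counts used below (520 successes in 1,000 observations) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, level=0.95):
    """Return (p_hat, margin, (lower, upper)) for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Hypothetical data: 520 successes out of 1,000 observations.
p_hat, margin, (lower, upper) = proportion_ci(520, 1000)
print(f"p_hat = {p_hat:.3f}, margin = {margin:.4f}, CI = ({lower:.3f}, {upper:.3f})")
```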
Example
Question: Overall, how much do you feel you can trust the
government in Washington to do what’s right?
Reported Poll Results
95% CI Calculation
Can trust = 39, n = 39 + 60 + 1 = 100, p̂ = 39/100 = .39
.39 ± 1.96 √(.39(1 − .39)/100) = .39 ± 1.96(.0488) = .39 ± .0956
Therefore, we are 95% confident that the interval (.294, .486) contains p.
Example Continued
The margin of error for the previous example is 9.56%. What is the
required sample size to be 95% confident that the estimate is within
3% of the correct percentage?
Solution
n = z²/(4D²) = 1.96²/(4(.03)²) = 1067.11
The required sample size is 1068.
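The same calculation in Python, as a sketch. The formula n = z²/(4D²) is the conservative case: it takes p̂(1 − p̂) at its maximum of 1/4, which occurs at p̂ = .5, so the resulting n is large enough for any true proportion. The function name is an assumption made for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(max_error, level=0.95):
    """Conservative n = z^2 / (4 D^2), assuming the worst case p = 0.5."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil(z ** 2 / (4 * max_error ** 2))

print(sample_size_for_proportion(0.03))
```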
Summary
• Commonly used confidence level multipliers (critical values)
– 99% confidence level multiplier = 2.58.
– 95% confidence level multiplier = 1.96.
– 90% confidence level multiplier = 1.645.
• The higher the confidence level, the wider the CI if all else
remains constant.
• Increasing the sample size n will make a CI with the same
confidence level narrower (i.e., more precise) if all else remains
constant.
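Both summary points can be checked numerically. The sketch below computes the full width of a z-based interval, 2 × z × σ/√n, for several confidence levels and sample sizes; σ = 10 and the sample sizes are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(level, sigma, n):
    """Full width of a two-sided z-interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * sigma / sqrt(n)

# Width grows with the confidence level (n fixed at 100).
widths_by_level = [ci_width(level, 10, 100) for level in (0.90, 0.95, 0.99)]
# Width shrinks as the sample size grows (level fixed at 95%).
widths_by_n = [ci_width(0.95, 10, n) for n in (25, 100, 400)]

print([round(w, 2) for w in widths_by_level])
print([round(w, 2) for w in widths_by_n])
```

Because the width depends on 1/√n, quadrupling the sample size halves the width of the interval.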
Parameter
Estimation
End of
Presentation