Confidence Intervals & Sample Size
Chapter 7
Introduction
• One aspect of inferential statistics is estimation, which is the
process of estimating the value of a parameter from
information obtained from a sample
• One important question exists:
• How large should the sample be in order to make an accurate
estimate?
7.1 – Confidence Intervals for the Mean When σ is Known
• Point estimate:
• Specific numerical value estimate of a parameter
• The best point estimate of a population mean μ is the sample
mean $\bar{X}$
Three Properties of a Good Estimator
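1. The estimator should be an unbiased estimator: its expected value equals the parameter being estimated
2. The estimator should be a consistent estimator: as the sample size increases, its value approaches the parameter being estimated
3. The estimator should be a relatively efficient estimator: of all the statistics that can estimate the parameter, it has the smallest variance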
Interval Estimates
• Interval estimate
• Interval or a range of values used to estimate the parameter
• This estimate may or may not contain the value of the parameter
being estimated
• Confidence level
• Probability that the interval estimate will contain the parameter,
assuming that a large number of samples are selected and that
the estimation process on the same parameter is repeated
• Confidence interval
• Determined by using data obtained from a sample and by using
the specific confidence level of the estimate
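The repeated-sampling reading of the confidence level can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 50, standard deviation 10), the sample size, and the helper name `coverage` are assumptions, and it uses the known-σ interval $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, trials=2000, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # critical value z_{alpha/2}
    margin = z * sigma / n ** 0.5                   # same margin for every sample
    hits = 0
    for _ in range(trials):
        sample_mean = mean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - margin < mu < sample_mean + margin:
            hits += 1
    return hits / trials

print(coverage())  # a proportion close to 0.95, not exactly 0.95
```

Any single interval either contains μ or it does not; the confidence level describes the long-run proportion of intervals that do.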
Confidence Interval Formula
• Three common confidence levels exist: 90%, 95%, and 99%
Formula for the Confidence Interval of the Mean
$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
• Maximum error of the mean:
• Maximum likely difference between the point estimate of a
parameter and the actual value of the parameter
• $E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ is the maximum error of the mean (also called the margin
of error)
• Rounding rule:
• Round off to one more decimal place than the original data
Examples
• 7–1
• A researcher wishes to estimate the number of days it takes an
automobile dealer to sell a Chevrolet Aveo. A sample of 50 cars
had a mean time on the dealer’s lot of 54 days. Assume the
population standard deviation to be 6.0 days. Find the best point
estimate of the population mean and the 95% confidence interval
of the population mean.
Example
• 7–2
• A survey of 30 adults found that the mean age of a person’s
primary vehicle is 5.6 years. Assuming the standard deviation of
the population is 0.8 year, find the best point estimate of the
population mean and the 99% confidence interval of the
population mean.
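Examples 7–1 and 7–2 can be worked with a few lines of Python. This is a minimal sketch, assuming the standard-library `statistics.NormalDist` is used for the critical value instead of a printed z table; the function name `z_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, level):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    margin = z * sigma / sqrt(n)                   # maximum error of the mean, E
    return sample_mean - margin, sample_mean + margin

# Example 7-1: n = 50, sample mean = 54 days, sigma = 6.0 days, 95% confidence
low, high = z_interval(54, 6.0, 50, 0.95)
print(f"{low:.1f} < mu < {high:.1f}")    # 52.3 < mu < 55.7

# Example 7-2: n = 30, sample mean = 5.6 years, sigma = 0.8 year, 99% confidence
low2, high2 = z_interval(5.6, 0.8, 30, 0.99)
print(f"{low2:.1f} < mu < {high2:.1f}")  # 5.2 < mu < 6.0
```

In both examples the best point estimate of the population mean is the sample mean itself (54 days and 5.6 years).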
Examples
• 7–3
• The following data represent a sample of the assets (in millions of
dollars) of 30 credit unions in southwestern Pennsylvania. Find
the 90% confidence interval of the mean. (use calculators)
12.23  13.19   7.92   1.24   3.17
12.77   1.06  12.24  16.56  73.25
40.22   9.16   4.78   2.17  18.13
 2.76   4.39  11.59   5.01   1.91
 2.42   1.42  16.85   2.89   8.74
 2.27   6.69   1.47  14.64  21.58
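Example 7–3 starts from raw data, so the mean must be computed first. The sketch below makes one assumption that the slide does not state: because no population standard deviation is given, the sample standard deviation is used in place of σ (n = 30).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Assets (in millions of dollars) of the 30 credit unions in Example 7-3
assets = [
    12.23, 13.19, 7.92, 1.24, 3.17, 12.77, 1.06, 12.24, 16.56, 73.25,
    40.22, 9.16, 4.78, 2.17, 18.13, 2.76, 4.39, 11.59, 5.01, 1.91,
    2.42, 1.42, 16.85, 2.89, 8.74, 2.27, 6.69, 1.47, 14.64, 21.58,
]

n = len(assets)
x_bar = mean(assets)             # point estimate of the population mean
s = stdev(assets)                # ASSUMPTION: s stands in for the unknown sigma
z = NormalDist().inv_cdf(0.95)   # z_{alpha/2} for 90% confidence
margin = z * s / sqrt(n)

print(f"mean = {x_bar:.3f}, s = {s:.3f}")
print(f"{x_bar - margin:.3f} < mu < {x_bar + margin:.3f}")
```

A z value rounded from a table (for example 1.65) gives a slightly wider interval than the unrounded value used here.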
Sample Size
• One important question exists…
• How large a sample is necessary to make an accurate estimate?
Formula for the Minimum Sample Size Needed for an Interval
Estimate of the Population Mean
$n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$
• Where E is the maximum error of the estimate. If necessary,
round the answer up to obtain a whole number.
Examples
• 7–4
• A scientist wishes to estimate the average depth of a river. He
wants to be 99% confident that the estimate is accurate within 2
feet. From a previous study, the standard deviation of the depths
measured was 4.38 feet. Find the minimum sample size required.
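The sample-size formula can be applied to the figures in Example 7–4, taking the task to be finding the minimum sample size. This is a minimal sketch; the function name `min_sample_size` is illustrative, and the critical value comes from `statistics.NormalDist` instead of a table.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(sigma, max_error, level):
    """Smallest n for which the margin of error is at most max_error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    return ceil((z * sigma / max_error) ** 2)      # always round up

# Example 7-4: sigma = 4.38, E = 2 feet, 99% confidence
n_required = min_sample_size(4.38, 2, 0.99)
print(n_required)  # 32
```

Rounding up instead of to the nearest integer keeps the margin of error at or below E.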
7.2 – Confidence Intervals when σ is Unknown
• Most of the time the population standard deviation (σ) is not
known
• It is estimated using the sample standard deviation (s)
• Values are taken from the t distribution
• The t distribution was formulated in 1908 by W. S. Gosset, an
employee of the Guinness brewery in Dublin, Ireland, who published
it under the pseudonym "Student"
Characteristics of the t Distribution
• The t distribution shares some characteristics of the normal
distribution and differs from it in others.
The t distribution is similar to the normal distribution in these ways:
1. Bell-shaped
2. Symmetric about the mean
3. Mean, median, and mode are equal to 0 and are located at the
center of the distribution
4. Curve never touches the x axis
The t distribution differs from the normal distribution in these ways:
1. The variance is greater than 1
2. The t distribution is a family of curves based on the concept of
degrees of freedom, which is related to sample size
3. As the sample size increases, the t distribution approaches the
standard normal distribution
Degrees of Freedom
• Degrees of freedom
• Number of values that are free to vary after a sample statistic has
been computed
• Tell the researcher which specific curve to use
• The symbol d.f. is used for degrees of freedom
• Degrees of freedom for a confidence interval for the mean are
found by subtracting 1 from the sample size: d.f. = n – 1
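• When d.f. is greater than 30, it may fall between two table
values
• Always round down to the nearest table value; the smaller d.f. gives a
slightly larger t value and therefore a more conservative (wider) interval
• Example: 68 is closer to 70, but round down to the table value of 65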
Formula
Formula for a Specific Confidence Interval for the Mean When
σ is Unknown and n < 30
$\bar{X} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
• The degrees of freedom are n – 1
Examples
• 7–5
• Find the $t_{\alpha/2}$ value for a 95% confidence interval when the sample
size is 22
• 7–6
• Ten randomly selected people were asked how long they slept at
night. The mean time was 7.1 hours, and the standard deviation
was 0.78 hour. Find the 95% confidence interval of the mean
time. Assume the variable is normally distributed.
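Examples 7–5 and 7–6 both reduce to a table lookup followed by the interval formula. The Python standard library has no t distribution, so the sketch below hard-codes two entries from a standard t table (95% confidence, two tails); the names `T_95` and `t_interval` are illustrative.

```python
from math import sqrt

# Excerpt of a standard t table for 95% confidence, keyed by d.f.
T_95 = {9: 2.262, 21: 2.080}

# Example 7-5: n = 22, so d.f. = 22 - 1 = 21
print(T_95[22 - 1])  # 2.08

def t_interval(sample_mean, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown."""
    margin = t_crit * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Example 7-6: n = 10, sample mean = 7.1 hours, s = 0.78 hour, d.f. = 9
low, high = t_interval(7.1, 0.78, 10, T_95[10 - 1])
print(f"{low:.2f} < mu < {high:.2f}")  # 6.54 < mu < 7.66
```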
Example 7 – 7
• The data represent a sample of the number of home fires
started by candles for the past several years. Find the 99%
confidence interval for the mean number of home fires
started by candles each year.
5460  5900  6090  6310  7160  8440  9930
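For Example 7–7 the sample mean and standard deviation come from the raw data. A minimal sketch, assuming the critical value 3.707 is read from a standard t table (99% confidence, d.f. = 6):

```python
from math import sqrt
from statistics import mean, stdev

# Home fires started by candles, Example 7-7
fires = [5460, 5900, 6090, 6310, 7160, 8440, 9930]

n = len(fires)
x_bar = mean(fires)
s = stdev(fires)     # sample standard deviation (divides by n - 1)
t_crit = 3.707       # table value for 99% confidence, d.f. = n - 1 = 6
margin = t_crit * s / sqrt(n)

print(f"mean = {x_bar:.1f}, s = {s:.1f}")
print(f"{x_bar - margin:.1f} < mu < {x_bar + margin:.1f}")
```

With only seven observations the interval is wide; the 99% level and the small d.f. both push the critical value up.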
7.3 – Confidence Intervals and Sample Size for Proportions
• Proportion
• Represents a part of a whole expressed as a fraction, decimal, or
percentage
• Proportions can also represent probabilities
Symbols Used in Proportion Notation
$p$ = population proportion
$\hat{p}$ (read "p hat") = sample proportion
For a sample proportion:
$\hat{p} = \frac{X}{n}$ and $\hat{q} = \frac{n - X}{n}$ or $\hat{q} = 1 - \hat{p}$
where X = number of sample units that possess the characteristic of
interest and n = sample size
Example 7 – 8
• In a recent survey of 150 households, 54 had central air
conditioning. Find p hat and q hat where p hat is the
proportion of households that have central air conditioning.
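The arithmetic for Example 7–8 is short enough to check directly. This sketch also checks the conditions n·p̂ ≥ 5 and n·q̂ ≥ 5 that are required for the confidence interval for a proportion:

```python
# Example 7-8: 54 of 150 households have central air conditioning
x, n = 54, 150

p_hat = x / n        # sample proportion with the characteristic
q_hat = 1 - p_hat    # sample proportion without it, same as (n - x) / n

print(f"p_hat = {p_hat:.2f}, q_hat = {q_hat:.2f}")  # p_hat = 0.36, q_hat = 0.64

# Both counts must be at least 5 for the normal approximation
print(n * p_hat >= 5 and n * q_hat >= 5)            # True
```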
Formula
• Maximum error of the estimate is
$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
• Confidence intervals about proportions must meet the criteria
that $n\hat{p} \ge 5$ and $n\hat{q} \ge 5$
Formula for a Specific Confidence Interval for a Proportion
$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
Examples
• 7–9
• A sample of 500 nursing applications included 60 from men. Find
the 90% confidence interval of the true proportion of men who
applied to the nursing program.
• 7 – 10
• A survey of 1721 people found that 15.9% of individuals purchase
religious books at a Christian bookstore. Find the 95% confidence
interval of the true proportion of people who purchase their
religious books at a Christian bookstore.
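Examples 7–9 and 7–10 follow the same pattern. The sketch below is one way to code it, assuming `statistics.NormalDist` for the critical value; `proportion_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, level):
    """Confidence interval for a population proportion."""
    q_hat = 1 - p_hat
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("requires n*p_hat >= 5 and n*q_hat >= 5")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)           # maximum error of the estimate
    return p_hat - margin, p_hat + margin

# Example 7-9: 60 men among 500 applications, 90% confidence
low, high = proportion_interval(60 / 500, 500, 0.90)
print(f"{low:.3f} < p < {high:.3f}")    # 0.096 < p < 0.144

# Example 7-10: p_hat = 15.9% of 1721 people, 95% confidence
low2, high2 = proportion_interval(0.159, 1721, 0.95)
print(f"{low2:.3f} < p < {high2:.3f}")  # 0.142 < p < 0.176
```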
Sample Size for Proportions
• To find the sample size needed to determine a confidence
interval about a proportion use:
Formula for Minimum Sample Size Needed for Interval
Estimate of a Population Proportion
$n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2$
• If necessary, round up to obtain a whole number
Examples
• 7 – 11
• A researcher wishes to estimate, with 95% confidence, the
proportion of people who own a home computer. A previous
study shows that 40% of those interviewed had a computer at
home. The researcher wishes to be accurate within 2% of the true
proportion. Find the minimum sample size necessary.
• 7 – 12
• The same researcher wishes to estimate the proportion of
executives who own a car phone. She wants to be 90% confident
and be accurate within 5% of the true proportion. Find the
minimum sample size necessary.
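Both sample-size examples can be sketched with one function. Example 7–12 gives no prior estimate of the proportion, so the code assumes the usual convention of p̂ = 0.5, which maximizes p̂q̂ and therefore gives the largest (safest) n. The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_proportion(max_error, level, p_hat=0.5):
    """Smallest n for the desired margin; p_hat = 0.5 when nothing is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    return ceil(p_hat * (1 - p_hat) * (z / max_error) ** 2)

# Example 7-11: prior estimate 40%, within 2%, 95% confidence
n_computers = min_sample_size_proportion(0.02, 0.95, p_hat=0.40)
print(n_computers)  # 2305

# Example 7-12: no prior estimate (ASSUMPTION: p_hat = 0.5), within 5%, 90%
n_phones = min_sample_size_proportion(0.05, 0.90)
print(n_phones)     # 271
```

A table value rounded to z = 1.65 gives 273 for Example 7–12 instead of 271; the difference comes only from rounding the critical value.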
7.4 – Confidence Intervals for Variances & Standard Deviations
• In statistics, the variance and standard deviation of a variable
are as important as the mean
• For example:
• Products that fit together (such as pipes) are manufactured so
that variations in diameters are as small as possible
Chi-Square Distribution
• To calculate these confidence intervals, a new statistical
distribution is needed
• Chi-square distribution
• Similar to the t variable in that its distribution is a family of curves
based on the number of degrees of freedom
• Symbol for chi-square is $\chi^2$
• Chi-square variable cannot be negative
• Area under the chi-square distribution is 1.00 or 100%
Example 7 – 13
• Find the values of $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ for a 90% confidence
interval when n = 25
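One way to work Example 7–13 is to split α between the two tails and read both values from a standard chi-square table, where columns are labeled by the area to the right of the value:

```latex
\alpha = 1 - 0.90 = 0.10, \qquad \text{d.f.} = n - 1 = 25 - 1 = 24

\chi^2_{\text{right}}: \ \text{column } \alpha/2 = 0.05, \ \text{d.f.} = 24
  \ \Rightarrow \ \chi^2_{\text{right}} = 36.415

\chi^2_{\text{left}}: \ \text{column } 1 - \alpha/2 = 0.95, \ \text{d.f.} = 24
  \ \Rightarrow \ \chi^2_{\text{left}} = 13.848
```

Because the chi-square distribution is not symmetric, the two values must be looked up separately; one is not the negative or mirror image of the other.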
Formulas
• Formulas for confidence intervals where d.f. = n – 1
Formula for the Confidence Interval for a Variance
$\frac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{\text{left}}}$
Formula for the Confidence Interval for a Standard Deviation
$\sqrt{\frac{(n - 1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_{\text{left}}}}$
Examples
• 7 – 14
• Find the 95% confidence interval for the variance and standard
deviation of the nicotine content of cigarettes manufactured if a
sample of 20 cigarettes has a standard deviation of 1.6
milligrams.
• 7 – 15
• Find the 90% confidence interval for the variance and standard
deviation for the price in dollars of an adult single-day ski lift
ticket. The data represent a selected sample of nationwide ski
resorts. Assume the variable is normally distributed.
59  54  53  52  51
39  49  46  49  48
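Examples 7–14 and 7–15 can be sketched with one helper. The Python standard library has no chi-square distribution, so the critical values below are hard-coded from a standard chi-square table; `variance_interval` is an illustrative name.

```python
from math import sqrt
from statistics import variance

def variance_interval(s2, n, chi2_right, chi2_left):
    """Confidence interval for the population variance, d.f. = n - 1."""
    low = (n - 1) * s2 / chi2_right   # the larger table value gives the lower limit
    high = (n - 1) * s2 / chi2_left
    return low, high

# Example 7-14: n = 20, s = 1.6 mg, 95% confidence, d.f. = 19
# Table values: chi2_right = 32.852, chi2_left = 8.907
v_low, v_high = variance_interval(1.6 ** 2, 20, 32.852, 8.907)
print(f"{v_low:.1f} < variance < {v_high:.1f}")            # 1.5 < variance < 5.5
print(f"{sqrt(v_low):.1f} < sigma < {sqrt(v_high):.1f}")   # 1.2 < sigma < 2.3

# Example 7-15: ski lift ticket prices, 90% confidence, d.f. = 9
# Table values: chi2_right = 16.919, chi2_left = 3.325
prices = [59, 54, 53, 52, 51, 39, 49, 46, 49, 48]
p_low, p_high = variance_interval(variance(prices), len(prices), 16.919, 3.325)
print(f"{p_low:.1f} < variance < {p_high:.1f}")
print(f"{sqrt(p_low):.2f} < sigma < {sqrt(p_high):.2f}")
```

The limits for the standard deviation are the square roots of the limits for the variance.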