Confidence Intervals
and Sample Size
Estimates
• Point Estimate: A specific numerical value estimate of a
parameter. The best point estimate of the population mean
is the sample mean.
• Example: We want to estimate the average age of teachers
at SHS. A sample of teachers is surveyed and the sample
mean is 38.6 years. This is a point estimate.
• Interval Estimate: An interval or a range of values used to
estimate a parameter. This estimate may or may not contain
the value of the parameter being estimated.
• Interval estimates are preferred because the sample mean
usually differs somewhat from the population mean due to
sampling error.
• Example: The avg. age might be 38.1 < 𝜇 < 39.1, which is
38.6 ± 0.5 years.
Properties of Good Estimators
• Estimator must be an unbiased estimator. The
expected value or mean of the estimates obtained
from samples is equal to the parameter being
estimated.
• Estimator must be consistent. As sample size
increases, the value of the estimator approaches
the value of the parameter estimated.
• Estimator must be relatively efficient. Of all the
statistics that can be used to measure a parameter,
the relatively efficient estimator has the smallest
variance.
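The first two properties can be illustrated with a small simulation. This is only a sketch: the population (normal, mean 50, standard deviation 10), the sample sizes, and the seed are all hypothetical choices, not values from the slides.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
POP_MEAN, POP_SD = 50.0, 10.0  # hypothetical population parameters

def sample_mean(n):
    """Mean of one random sample of size n from the hypothetical population."""
    return statistics.fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(n)])

# Unbiased: the average of many sample means sits close to the population mean.
means_n30 = [sample_mean(30) for _ in range(2000)]
avg_of_means = statistics.fmean(means_n30)

# Consistent: sample means from larger samples scatter less around the parameter.
spread_small = statistics.pstdev([sample_mean(10) for _ in range(500)])
spread_large = statistics.pstdev([sample_mean(1000) for _ in range(500)])

print(round(avg_of_means, 2), round(spread_small, 2), round(spread_large, 2))
```

The spread of the sample means shrinks as the sample size grows, which is the behavior the consistency property describes.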
Confidence Intervals
• The probability of being correct can be assigned before
an interval estimate is made.
• For example: May want to be 90%, 95%, or even 99%
sure that the interval contains the true population mean.
The larger your confidence gets the larger your interval
must be.
• Confidence Interval: A specific interval estimate of a
parameter determined by using data obtained from a
sample and the specific confidence level of the
estimate.
• Confidence level: The probability that the interval
estimate will contain the parameter.
Confidence Interval Formula
for a specific α
• Formula: $\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
• α: Represents the total area under both tails of
the standard normal distribution curve.
• α ∕ 2: Represents the area in each one of the tails.
• Relationship between α and confidence level is
that the confidence level is the percentage
equivalent to the decimal value of 1 - α, and vice
versa.
• The second term of our CI formula,
$E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$, is called the
maximum error of estimate.
• Max error of estimate: The maximum likely
difference between the point estimate of a
parameter and the actual value of the parameter.
Steps to find $z_{\alpha/2}$
• Step 1: Subtract the decimal value of the
confidence level from 1. This gives α.
• Step 2: Divide α by 2. This gives the
area of each tail.
• Step 3: Subtract that tail area from 0.5.
• Step 4: Find the area from Step 3 in Table E
and read off the corresponding z-value.
• Step 5: Use that z-value in our CI formula.
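The table lookup in these steps can be sketched in Python with the standard library's inverse normal CDF. The function name `z_alpha_over_2` is made up for this illustration. Where the table uses the area between 0 and z (0.5 minus the tail area), the inverse CDF takes the cumulative area to the left of z (1 minus the tail area).

```python
from statistics import NormalDist

def z_alpha_over_2(confidence):
    """Critical z-value for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence   # Step 1: total area in both tails
    tail_area = alpha / 2    # Step 2: area in each tail
    # Steps 3-4: the inverse CDF stands in for the table lookup.
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_alpha_over_2(level), 3))
```

Rounded to the usual table precision, these come out to about 1.645, 1.96, and 2.576.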
CI Examples
• Example 1: Dr. Purnell wishes to find avg. age of
teachers in the district. The standard deviation is
known to be 4 years. A sample of 55 teachers had
an average age of 32.5 years. Find the 95%
confidence interval of the population mean.
• Example 2: Avg. annual wind speed in Kitty
Hawk NC is 15.6 mph. If a sample of 90 days
was used to determine the average, find the 99%
confidence interval of the mean. Assume standard
deviation to be 2.3 mph.
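As a rough check on both examples, the interval (sample mean ± z times σ over √n) can be computed directly. This is a sketch, not the slides' own solution: `z_interval` is a hypothetical helper, and it uses an exact z-value rather than a rounded table value, so the last digit may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    error = z * sigma / sqrt(n)  # maximum error of estimate
    return sample_mean - error, sample_mean + error

ex1 = z_interval(32.5, 4, 55, 0.95)    # Example 1: teacher ages
ex2 = z_interval(15.6, 2.3, 90, 0.99)  # Example 2: wind speed

print([round(v, 2) for v in ex1])
print([round(v, 2) for v in ex2])
```

Example 1 works out to roughly 31.4 < 𝜇 < 33.6 years and Example 2 to roughly 15.0 < 𝜇 < 16.2 mph.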
Determining Sample Size
• Sample size determination is closely related to
statistical estimation.
• The formula is derived from the Maximum
Error of Estimate formula.
• All answers should be rounded up if there is
any fractional or decimal portion in our
answer.
• Formula: $n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$
Sample Size Examples
• Example 1: Researcher is interested in estimating
avg. salary of garbage men in a large town. He
wants to be 95% sure that estimate is correct. If
standard deviation is $950 how large a sample is
needed to get the desired info. and to be
accurate within $150?
• Example 2: A nurse wants to estimate birth
weights of babies. How large must sample be if
she desires to be 90% confident that the true
mean is within 8 ounces of the sample? Standard
deviation is known to be 7 ounces.
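Both sample-size examples can be checked with a short sketch of the formula n = (z · σ ÷ E)², rounded up. The helper name `sample_size_for_mean` is an assumption for illustration, and the z-value is computed exactly rather than read from a table.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Minimum n so the sample mean is within max_error of the population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / max_error) ** 2)  # always round up

n_salary = sample_size_for_mean(sigma=950, max_error=150, confidence=0.95)
n_birth = sample_size_for_mean(sigma=7, max_error=8, confidence=0.90)

print(n_salary, n_birth)
```

Example 1 needs 155 workers (about 154.1 before rounding up) and Example 2 needs 3 babies (about 2.07 before rounding up).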
The T-Distribution
Similarities to Normal Dist.
• Bell shaped.
• Symmetrical about the
mean.
• The mean, mode, and
median are equal to 0 and
located at the center.
• Never touches x-axis.
Differences to Normal Dist.
• The variance is greater than
1.
• It is actually a family of
curves based on the concept
of degrees of freedom,
which is related to sample
size.
• As the sample size increases,
it approaches the standard
normal dist. curve.
d.f. = Degrees of Freedom
• d.f. : The number of values that are free to vary
after a sample statistic has been computed.
• They tell the researcher what curve to use when a
distribution consists of a family of curves.
• Example: Let's say the mean of 10 values is 60. This
means that 9 out of 10 values are free to vary.
Once they have been selected, the last value must
be a specific number to get a sum of 600, since
600 ÷ 10 = 60. Hence d.f. = n – 1.
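A tiny sketch of the same idea: pick any nine values, and the tenth is forced. The nine numbers below are arbitrary, hypothetical choices.

```python
n, target_mean = 10, 60
free_values = [52, 71, 64, 48, 59, 66, 57, 63, 55]  # hypothetical: any 9 values work

required_sum = n * target_mean                # 600, since 600 / 10 = 60
last_value = required_sum - sum(free_values)  # the one value that is not free
degrees_of_freedom = n - 1

all_values = free_values + [last_value]
print(last_value, sum(all_values) / n, degrees_of_freedom)
```

With these nine values the tenth must be 65, and only n – 1 = 9 values were free to vary.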
Using Table F
• Need to find the correct value of $t_{\alpha/2}$.
• Step 1: Find the correct d.f. along the left hand
side. ( d.f. = n – 1 )
• Step 2: Find correct confidence level on top.
• Step 3: Intersection becomes our answer.
• Step 4: Answer will be used in CI formula on
next slide.
• Do not need to worry about “one tail” or “two
tails”.
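Confidence Interval Formula for when
𝝈 is unknown and n < 30
• Formula: $\bar{X} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$, with d.f. = n – 1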
T-Distribution Examples
• Example 1: For a group of 20 students taking
a final exam the mean heart rate was 96 beats
per minute. Standard deviation was 5. Find the
95% confidence interval of the true mean.
• Example 2: A sample of 12 food servers
showed an avg. weekly income of $340.40
with a standard deviation of $11. Find the
98% confidence interval of the true mean.
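A sketch of both t-interval examples follows. Python's standard library has no t-distribution, so the critical values are typed in as read from a standard t table (an assumption here: 2.093 for d.f. = 19 at 95%, and 2.718 for d.f. = 11 at 98%). `t_interval` is a hypothetical helper name.

```python
from math import sqrt

def t_interval(sample_mean, s, n, t_critical):
    """Confidence interval for the mean when sigma is unknown (uses sample s)."""
    error = t_critical * s / sqrt(n)  # maximum error of estimate
    return sample_mean - error, sample_mean + error

# Critical values are assumed table look-ups with d.f. = n - 1.
ex1 = t_interval(96, 5, 20, t_critical=2.093)       # d.f. = 19, 95%
ex2 = t_interval(340.40, 11, 12, t_critical=2.718)  # d.f. = 11, 98%

print([round(v, 2) for v in ex1])
print([round(v, 2) for v in ex2])
```

Example 1 comes out to roughly 93.7 < 𝜇 < 98.3 beats per minute and Example 2 to roughly $331.77 < 𝜇 < $349.03.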
Confidence Interval for a
proportion
• As with means, the statistician, given the
sample proportion, tries to estimate the
population proportion.
• An interval estimate can be used for a
proportion.
• Formula: $\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
The Symbols for
Proportion Notation
• “p hat” ($\hat{p}$) = X ÷ n, where X = number of sample
units that possess the characteristic of
interest and n = sample size.
• “q hat” ($\hat{q}$) = 1 – “p hat”
Rules for using CI for a
proportion
• #1: n ● p “hat” and n ● q “hat” must each be
greater than or equal to 5. This is the same check
used for the normal approximation to the binomial.
• #2: Round off to 3 decimal
places.
CI Example for Proportions
• Example 1: In a recent study of 100 people, 78
said that they were satisfied with their current
home. Find the 90% confidence interval of the
true proportion of individuals who are satisfied
with their current home.
• Example 2: A nutritionist found that in a survey
of 60 families, 32% said they ate apples at least
once a week. Find the 95% confidence interval of
the true proportion of families who eat apples at
least once per week.
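Both proportion examples can be sketched as below, including the n ● p and n ● q check and the rounding to 3 decimal places. `proportion_interval` is a hypothetical helper, and the z-value is computed exactly rather than taken from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Confidence interval for a population proportion, rounded to 3 places."""
    q_hat = 1 - p_hat
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n*p_hat and n*q_hat must both be at least 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    error = z * sqrt(p_hat * q_hat / n)
    return round(p_hat - error, 3), round(p_hat + error, 3)

ex1 = proportion_interval(78 / 100, 100, 0.90)  # Example 1: home satisfaction
ex2 = proportion_interval(0.32, 60, 0.95)       # Example 2: apples

print(ex1, ex2)
```

Example 1 gives 0.712 < p < 0.848 and Example 2 gives 0.202 < p < 0.438.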
Minimum sample size for Interval
Estimate of a Population Proportion.
• It is necessary to round up to obtain a whole
number answer. No fractional or decimal
answers allowed.
• Formula: $n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2$
• If no estimate of p “hat” is known, use 0.5 for
both p “hat” and q “hat”.
Minimum Examples for Pop.
Proportion
• Example 1: A researcher wishes to estimate, with
98% confidence, the proportion of people who own
an iPhone. A previous study shows that 42% of
those interviewed had an iPhone. The researcher
wishes to be accurate within 3% of the true
proportion. Find the minimum sample size.
• Example 2: The same researcher wishes to
estimate the proportion of people who also own
an iPad. She wants to be 95% confident and
accurate within 7% of the true proportion. Find
the minimum sample size.
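A sketch of both minimum-sample-size examples, using n = p̂ · q̂ · (z ÷ E)² and rounding up. The helper name `sample_size_for_proportion` is made up for this illustration. It uses an exact z-value, so a result can differ by a few units from a hand calculation with a rounded table value.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(max_error, confidence, p_hat=0.5):
    """Minimum n for a proportion estimate; p_hat defaults to 0.5 when unknown."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p_hat * q_hat * (z / max_error) ** 2)  # always round up

n_iphone = sample_size_for_proportion(0.03, 0.98, p_hat=0.42)  # prior estimate known
n_ipad = sample_size_for_proportion(0.07, 0.95)                # no prior estimate

print(n_iphone, n_ipad)
```

With the exact z the first example gives 1465. Using the rounded table value z = 2.33 instead gives 1470, so the answer expected in class may be the larger one. The second example gives 196 either way.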
CI for Variances and Standard
Deviation
• Variances and standard deviations are just as
important as means.
• Example: The variance and standard deviation of
the medication in a certain prescription play an
important role in making sure patients get the
proper dosage.
• Because both are important, confidence intervals
for them are necessary.
• To calculate these intervals a new distribution is
needed called the chi-square distribution.
Chi-Square Distribution
• Similar to the t-distribution in the fact that it
too is a family of curves based on d.f.
• Symbol for chi-square is 𝜒². (𝜒 is the Greek
letter chi, pronounced “kye”.)
• Chi-square variable cannot be negative and
distributions are positively skewed. At
roughly 100 d.f. the distribution becomes
somewhat symmetrical.
Chi-Square Distribution
How to read chi-square table.
• There are two different values that are going to be used
in the formulas for variance and standard deviations.
Need to find those 2 numbers first.
• Step 1: Find α first by subtracting the confidence level
(as a decimal) from 1.
• Step 2: Divide α by 2. Match the α ÷ 2 column to the
d.f. row. Their intersection gives 𝜒²right.
• Step 3: Subtract α ÷ 2 from 1. Use the intersection of
that column and the d.f. row to get 𝜒²left.
• Step 4: Answers from steps 2 and 3 will be used to find
confidence intervals for variances and standard
deviations.
Chi-Square Distribution
Example
• Find the values of 𝜒²right and 𝜒²left for a 95%
confidence interval when n = 18.
• Step 1: Find α first by subtracting the confidence
level (as a decimal) from 1.
• Step 2: Divide α by 2. Match the α ÷ 2 column
to the d.f. row. This gives 𝜒²right.
• Step 3: Subtract α ÷ 2 from 1. Use the intersection
of that column and the d.f. row.
This gives 𝜒²left.
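The steps for this example can be traced in a short sketch. Python's standard library has no chi-square distribution, so the dictionary below is a hand-typed excerpt of a standard chi-square table for d.f. = 17 (assumed values), keyed by the area to the right of the critical value.

```python
# Assumed excerpt of a standard chi-square table, row d.f. = 17.
# Keys are the column headings (area to the right of the critical value).
CHI_SQUARE_DF17 = {0.975: 7.564, 0.95: 8.672, 0.05: 27.587, 0.025: 30.191}

confidence, n = 0.95, 18
df = n - 1                           # row of the table: 17
alpha = 1 - confidence               # Step 1
right_area = round(alpha / 2, 3)     # Step 2: column for chi-square right
left_area = round(1 - alpha / 2, 3)  # Step 3: column for chi-square left

chi_right = CHI_SQUARE_DF17[right_area]
chi_left = CHI_SQUARE_DF17[left_area]

print(df, right_area, left_area, chi_right, chi_left)
```

The columns used are 0.025 and 0.975, which give 𝜒²right = 30.191 and 𝜒²left = 7.564 for d.f. = 17. The `round` calls only guard against floating-point noise when matching the column headings.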
Formulas for CI for Variances and
Standard Deviations
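• CI for Variance: $\frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}}$, with d.f. = n – 1
• CI for Standard Deviation: $\sqrt{\frac{(n-1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{left}}}}$
• Remember s = sample standard deviation and s² = sample
variance. A problem could give either one: if the standard
deviation is given, square it; if the variance is given, plug it in directly.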
CI Variance and Standard
Deviation Examples
• Find the 99% CI for the variance and standard
deviation of the weights of 5 gallon containers of
paint if a sample of 14 containers has a standard
deviation of 1.2 pounds. Assume the variable is
normally distributed.
• Find the 90% CI for the variance and standard
deviation for the lifetime of batteries if a sample
of 25 batteries has a standard deviation of 2.1
months. Assume the variable is normally
distributed.
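Both examples can be sketched as below. The chi-square critical values are typed in as read from a standard chi-square table (an assumption here: 29.819 and 3.565 for d.f. = 13 at 99%; 36.415 and 13.848 for d.f. = 24 at 90%), and the helper names are hypothetical.

```python
from math import sqrt

def variance_interval(s, n, chi_right, chi_left):
    """Confidence interval for the variance, given the sample standard deviation s."""
    numerator = (n - 1) * s ** 2  # square s because a standard deviation was given
    return numerator / chi_right, numerator / chi_left

def std_dev_interval(s, n, chi_right, chi_left):
    """Confidence interval for the standard deviation (square roots of the above)."""
    low, high = variance_interval(s, n, chi_right, chi_left)
    return sqrt(low), sqrt(high)

# Critical values are assumed table look-ups with d.f. = n - 1.
paint_var = variance_interval(1.2, 14, chi_right=29.819, chi_left=3.565)
paint_sd = std_dev_interval(1.2, 14, chi_right=29.819, chi_left=3.565)
battery_var = variance_interval(2.1, 25, chi_right=36.415, chi_left=13.848)
battery_sd = std_dev_interval(2.1, 25, chi_right=36.415, chi_left=13.848)

for interval in (paint_var, paint_sd, battery_var, battery_sd):
    print([round(v, 3) for v in interval])
```

For the paint containers this gives roughly 0.63 < 𝜎² < 5.25 and 0.79 < 𝜎 < 2.29 pounds. For the batteries it gives roughly 2.91 < 𝜎² < 7.64 and 1.70 < 𝜎 < 2.76 months.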