Confidence Intervals
Terminology Reminders
• Population: any collection of entities that have at least one characteristic in
common
• Parameter: the numbers that describe characteristics of scores in the
population (mean, variance, s.d., etc.)
• Sample: a part of the population
• Statistic: the numbers that describe characteristics of scores in the sample
(mean, variance, s.d., correlation coefficient, reliability coefficient, etc.)
• Estimate: a number computed by using the data collected from a
sample
• Estimator: formula used to compute an estimate
Quantity              Statistic (Sample)    Parameter (Population)
Mean                  x̄                     μ
Variance              s²                    σ²
Standard Deviation    s                     σ
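To make the estimator/estimate distinction concrete, here is a minimal sketch using Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration; the library functions play the role of estimators, and the numbers they return are the estimates.

```python
import statistics

# Hypothetical sample of scores (illustrative values, not from the slides).
sample = [4.0, 7.0, 6.0, 5.0, 8.0]

# Estimators are the formulas (here, library functions); estimates are the
# numbers they return for this particular sample.
mean_estimate = statistics.mean(sample)          # sample mean, x-bar
variance_estimate = statistics.variance(sample)  # sample variance, divides by n - 1
sd_estimate = statistics.stdev(sample)           # sample standard deviation

print(mean_estimate, variance_estimate, sd_estimate)
```

A different sample from the same population would give different estimates, while the population parameters stay fixed.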
Statistical Methods
• Descriptive Statistics
• Inferential Statistics: Estimation and Hypothesis Testing
Estimation
One of the aims of statistics is the estimation of
properties of populations
Estimations Lead to Inferences
Population → Sampling Procedure → Sample → Calculation of sample mean
and sample standard deviation → Inference about Population
Point and Interval Estimation
• Point Estimate – A sample statistic used to estimate
the value of a population parameter
• Confidence interval (interval estimate) – A range of
values defined by the confidence level within which
the population parameter is estimated to fall.
• Confidence Level – The likelihood, expressed as a
percentage or a probability, that a specified interval
will contain the population parameter.
Point Estimation
1. Provides Single Value
Based on observations from 1 Sample, there is no sampling distribution
2. Information
Gives no information about how close the value is to the unknown population
parameter
3. Example:
Sample mean x̄ is the point estimate of the unknown population mean μ
Interval Estimation
1. Provides Range of Values
2. Gives Information about Closeness to Unknown Population Parameter
3. Example: Unknown population mean lies between 50 & 70 with 95% confidence
Key Elements of Interval Estimation
A Probability that the population parameter falls somewhere within the Interval.
[Diagram: the confidence interval runs from the lower confidence limit to the
upper confidence limit, with the sample statistic (point estimate) at its center]
Key Elements of Interval Estimation
Inferential Statistics involves
Three Distributions:
A population distribution – variation in the larger
group that we want to know about.
A distribution of sample observations – variation in
the sample that we can observe.
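A sampling distribution – the distribution of a sample statistic (such as
the mean) over repeated samples. For the sample mean it is approximately
normal when n is large, it is centered on the population mean, and its
standard deviation is the standard error. This is what allows one to infer
the parameters from the statistics.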
Confidence Levels:
• Confidence Level – The likelihood, expressed as a percentage or a
probability, that a specified interval will contain the population
parameter.
• 95% confidence level – there is a .95 probability that a specified interval
DOES contain the population mean. In other words, there are 5 chances
out of 100 (or 1 chance out of 20) that the interval DOES NOT contain the
population mean.
• 99% confidence level – there is 1 chance out of 100 that the interval DOES
NOT contain the population mean.
Constructing a Confidence Interval (CI)
• The sample mean is the point estimate of the population
mean.
• The sample standard deviation is the point estimate of
the population standard deviation.
• The standard error of the mean makes it possible to state
the probability that an interval around the point estimate
contains the actual population mean.
Go Back to Meaning of Standard Error
The standard error of the mean can be interpreted as:
If we were to take another sample from the population and
compute its mean, there is about a 68% chance that the mean
of the sample would lie within 1 standard error of the
population mean.
[Figure: distribution of sample means x̄]
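The 68% figure can be checked by simulation. The sketch below assumes a hypothetical normal population (the values of `mu`, `sigma`, and `n` are illustrative, not from the slides), draws many samples, and counts how often the sample mean lands within one standard error of the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical normal population (mu, sigma and n are illustrative assumptions).
mu, sigma, n = 50.0, 10.0, 25
standard_error = sigma / n ** 0.5

def sample_mean() -> float:
    """Draw one random sample of size n and return its mean."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))

trials = 20_000
within_one_se = sum(abs(sample_mean() - mu) <= standard_error for _ in range(trials))
fraction_within = within_one_se / trials
print(f"Fraction of sample means within 1 SE of mu: {fraction_within:.3f}")
```

The printed fraction should be close to 0.68, up to simulation noise.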
We can do better than that
If we were to take another sample from the population and
compute its mean, there is an x% chance that the mean
of the sample would lie within y standard errors of the
population mean.
[Figure: distribution of sample means x̄, with 2.5% of the area in each tail]
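The relation between y standard errors and the x% chance can be sketched with the standard normal distribution, assuming the sampling distribution of the mean is normal. The helper name `coverage` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(y: float) -> float:
    """Probability that a sample mean lies within y standard errors of mu,
    assuming the sampling distribution of the mean is normal."""
    return standard_normal.cdf(y) - standard_normal.cdf(-y)

for y in (1.0, 1.96, 2.0, 3.0):
    print(f"within {y} standard errors: {coverage(y):.1%}")
```

With y = 1.96 the coverage is about 95%, which leaves roughly 2.5% in each tail.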
Level of Confidence
1. Probability that the unknown population parameter falls within the interval
2. Denoted (1 − α)%, where α is the probability that the parameter is not
within the interval
3. Typical values for (1 − α)% are 99%, 95%, 90%
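Consider a 95% confidence interval:
1 − α = 0.95, so α = 0.05 and α/2 = 0.025 in each tail.
[Figure: standard normal curve centered on the point estimate (Z = 0), with an
area of 0.475 on each side of the center and 0.025 in each tail. The lower
confidence limit μL lies at Z = −1.96 and the upper confidence limit μU at Z = 1.96]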
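The tabled Z values can be reproduced from α. This is a small sketch using `statistics.NormalDist` from the standard library; the function name `z_critical` is a hypothetical helper.

```python
from statistics import NormalDist

def z_critical(confidence_level: float) -> float:
    """Z value that leaves alpha/2 in each tail of the standard normal curve."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

For 95% confidence this gives the familiar ±1.96.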
Process for Constructing Confidence Intervals
• Compute the sample statistic (e.g. a mean)
• Compute the standard error of the mean
• Make a decision about level of confidence that is
desired (usually 95% or 99%)
• Find tabled value for 95% or 99% confidence
interval
• Multiply standard error of the mean by the tabled
value
• Form interval by adding and subtracting calculated
value to and from the mean
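The steps above can be sketched as one small function. It assumes the population standard deviation is known and the sampling distribution of the mean is normal; the function name and the input numbers are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist

def z_confidence_interval(sample_mean: float, population_sd: float,
                          n: int, confidence_level: float = 0.95):
    """Return (lower, upper) confidence limits for the population mean."""
    # Standard error of the mean.
    standard_error = population_sd / n ** 0.5
    # "Tabled value": the z that leaves alpha/2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Multiply the standard error by the tabled value.
    margin = z * standard_error
    # Add and subtract the result to and from the mean.
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs: mean 100, sigma 15, n = 36.
lower, upper = z_confidence_interval(sample_mean=100.0, population_sd=15.0, n=36)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Raising the confidence level widens the interval, while a larger n narrows it.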
Interpretation
A 95% confidence interval means that if one were to take many samples
from the population and construct an interval from each in the same way,
about 95% of those intervals would contain the population mean.
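One way to read the 95% figure is as the long-run coverage rate of the interval-building procedure. The simulation below is a sketch under assumed conditions: a hypothetical normal population with known σ (the values of `mu`, `sigma`, and `n` are illustrative). It builds an interval from each of many samples and counts how often the interval contains the population mean.

```python
import random
import statistics
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population and sample size (illustrative assumptions).
mu, sigma, n = 50.0, 10.0, 25
z = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
margin = z * sigma / n ** 0.5

def interval_covers_mu() -> bool:
    """Draw one sample, build its 95% interval, and check whether it contains mu."""
    x_bar = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    return x_bar - margin <= mu <= x_bar + margin

trials = 20_000
coverage_rate = sum(interval_covers_mu() for _ in range(trials)) / trials
print(f"Share of intervals containing mu: {coverage_rate:.3f}")
```

The share should come out near 0.95; any single interval either contains μ or it does not.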
Example: Estimation of the mean
The mean of a random sample of n = 25 is x̄ = 50. Set up a 95%
confidence interval estimate for μ if the population standard deviation is σ = 10.
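A worked solution, assuming the usual z-based interval with the tabled value 1.96 for 95% confidence:

```latex
\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
  = 50 \pm 1.96 \cdot \frac{10}{\sqrt{25}}
  = 50 \pm 3.92
  \quad\Rightarrow\quad 46.08 \le \mu \le 53.92
```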
Exercise:
The mean birth weight for 200 babies is 3.28 kg with a population standard
deviation of 0.85 kg. Compute the 95% confidence limits for the mean birth weight.
$\mu = 3.28 \pm 1.96 \cdot \frac{0.85}{\sqrt{200}} = 3.28 \pm 0.12$
Exercise:
The mean concentration for a sample of 100 insulin vials is 15 grams/vial with a
population standard deviation of 3.4 grams. Compute the 90% confidence limits
for the mean concentration of insulin.
Note 90% means 5% on each side, hence
look up 0.95 in the z table.
$\mu = 15 \pm 1.65 \cdot \frac{3.4}{\sqrt{100}} = 15 \pm 0.56$
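Both exercise answers can be checked numerically. The sketch below uses the exact normal quantile instead of the rounded tabled value (about 1.645 where the slide uses 1.65), so small differences in the third decimal place are expected. The helper name `margin_of_error` is hypothetical.

```python
from statistics import NormalDist

def margin_of_error(population_sd: float, n: int, confidence_level: float) -> float:
    """z times the standard error of the mean, for a known population sd."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * population_sd / n ** 0.5

birth_margin = margin_of_error(0.85, 200, 0.95)    # birth weight exercise
insulin_margin = margin_of_error(3.4, 100, 0.90)   # insulin exercise

print(f"Birth weight: 3.28 ± {birth_margin:.2f} kg")
print(f"Insulin: 15 ± {insulin_margin:.2f} g")
```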
There is One Little Problem
It doesn’t work very well. Reread the question:
“The mean birth weight for 200 babies is 3.28 kg with a population standard
deviation of 0.85 kg. Compute the 95% confidence limits for the mean birth weight.”
Can you spot the problem that makes the question practically unanswerable?
We don’t actually have the population standard deviation; no one has the weights
of every baby in the world.
Instead, what we actually have is more likely the sample standard deviation.
Using the sample standard deviation introduces additional error, and the resulting
intervals are too narrow. To fix this we have to use a different distribution.
Standardizing x̄
Just as we can standardize a normal distribution, we can also standardize
the distribution of means:
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
95% of the z values will fall between −1.96 and +1.96. However, this assumes we know σ.
Because we don’t know σ, we will often replace it with the sample standard deviation, s.
The problem is that
$\frac{\bar{x} - \mu}{s / \sqrt{n}}$
is no longer normally distributed.
This is particularly a problem when n < 30.
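The effect of replacing σ with s can be seen in a small simulation. This sketch assumes a hypothetical standard normal population and a very small sample size (n = 5, chosen for illustration). It standardizes each sample mean with its own s and counts how often the result falls within ±1.96.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical population and a deliberately small sample size.
mu, sigma, n = 0.0, 1.0, 5

def standardized_with_s() -> float:
    """Standardize one sample mean using the sample sd s in place of sigma."""
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s = statistics.stdev(sample)
    return (statistics.fmean(sample) - mu) / (s / n ** 0.5)

trials = 20_000
inside = sum(abs(standardized_with_s()) <= 1.96 for _ in range(trials))
fraction_inside = inside / trials
print(f"Fraction within ±1.96: {fraction_inside:.3f}")
```

If the standardized values were normal, the fraction would be about 0.95. With n = 5 it comes out noticeably lower, so ±1.96 is too narrow a cutoff.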
Standardizing x̄
The distribution of
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
was discovered by William Gosset, who published under the pen name “Student”,
and it has since been called Student’s t-distribution.
The shape of the t-distribution depends on n.
Student’s t-distribution
When the population variance is unknown and the sample is random, the
distribution that correctly describes the sample mean is known as the t-distribution.
• The t-distribution has larger reliability (cutoff) values for a given level of alpha than the
normal distribution, but as the sample size increases, the cutoff values approach those
of the normal distribution.
• For small sample sizes, use of the t-distribution
instead of the z-distribution to determine
reliability factors is critical.
• The t-distribution is a symmetrical distribution
whose probability density function is defined
by a single parameter known as the
degrees of freedom (df).
Student’s t-distribution
• The t distribution is symmetrical around its mean of zero, like the Z distribution.
• Compared with the Z distribution, a larger portion of the probability area is in the tails.
• As n increases, the t distribution approaches the Z distribution.
• t values depend on the degrees of freedom.
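How the t cutoff shrinks toward 1.96 as the degrees of freedom grow can be sketched numerically. Python's standard library has no t-distribution, so this sketch integrates the t density with Simpson's rule and finds the cutoff by bisection. The function names are hypothetical, and the search assumes the cutoff is below 100.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence_level: float, df: int) -> float:
    """Two-sided cutoff: the t value with (1 - level)/2 of the area in each tail."""
    target = 1 - (1 - confidence_level) / 2
    lo, hi = 0.0, 100.0  # assumes the cutoff lies below 100
    for _ in range(50):  # bisection on the monotone cdf
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

critical_values = {df: t_critical(0.95, df) for df in (2, 5, 10, 30, 100, 1000)}
for df, t in critical_values.items():
    print(f"df = {df:4d}: t = {t:.3f}")
```

For df = 5 this reproduces the tabled 2.571, and the values decrease toward the normal cutoff of 1.96 as df increases.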
Degrees of freedom
The parameter that completely characterizes a t-distribution.
• The degrees of freedom for a given t-distribution are equal to the sample
size minus 1.
For a sample size of 45, the degrees of freedom are 44.
Confidence Limits for Small Samples
$\mu = \bar{x} \pm t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$
where t is the critical value of the t distribution with n − 1 d.f. and an area of α/2 in each tail.
Example:
6 random vials of penicillin were selected and the concentration of penicillin was
determined in each vial in mg/ml
8.6, 9.7, 13.4, 11.4, 10.2, 12.3
Find the 95% confidence limits for the true mean concentration of penicillin.
Since we are dealing with a sample size of less than 30, we will use the t-statistic to determine
the confidence limits.
n=6
Mean = 10.93
Sample standard deviation = 1.77
df = 6 – 1 = 5
Confidence level = 95%
$t_{5,\,95\%} = 2.571$
Therefore $\mu = 10.93 \pm 2.571 \cdot \frac{1.77}{\sqrt{6}} = 10.93 \pm 1.86$
Compare this with the normal Z value of 1.96 at 95% confidence.
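The penicillin example can be reproduced with a short sketch. It takes the tabled t value of 2.571 from the example as given and computes the mean, sample standard deviation, and confidence limits from the six measurements.

```python
import statistics

# Concentrations in mg/ml, as listed in the example.
concentrations = [8.6, 9.7, 13.4, 11.4, 10.2, 12.3]

n = len(concentrations)
mean = statistics.fmean(concentrations)
s = statistics.stdev(concentrations)   # sample sd, divides by n - 1

t_tabled = 2.571                       # tabled t for df = 5 at 95%, from the example
margin = t_tabled * s / n ** 0.5
lower, upper = mean - margin, mean + margin

print(f"mean = {mean:.2f}, s = {s:.2f}")
print(f"95% confidence limits: {lower:.2f} to {upper:.2f}")
```

Using 1.96 instead of 2.571 would give a narrower interval that understates the uncertainty for a sample this small.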
End
Estimation
Question:
1. How many parameters does it require to describe a binomial
distribution?
Number of trials n, and probability of success in a single trial, p
2. How many parameters does it require to describe a normal
distribution?
Mean and standard deviation