Confidence Intervals

Terminology Reminders
• Population: any collection of entities that have at least one characteristic in common
• Parameter: a number that describes a characteristic of scores in the population (mean, variance, standard deviation, etc.)
• Sample: a part of the population
• Statistic: a number that describes a characteristic of scores in the sample (mean, variance, standard deviation, correlation coefficient, reliability coefficient, etc.)
• Estimate: a number computed from the data collected from a sample
• Estimator: the formula used to compute an estimate

Quantity              Statistic (Sample)    Parameter (Population)
Mean                  $\bar{x}$             $\mu$
Variance              $s^2$                 $\sigma^2$
Standard Deviation    $s$                   $\sigma$

Statistical Methods
• Descriptive Statistics
• Inferential Statistics
   • Estimation
   • Hypothesis Testing

Estimation
One of the aims of statistics is the estimation of properties of populations.

Estimations Lead to Inferences
Population → sampling procedure → sample → calculation of sample mean and sample standard deviation → inference about the population

Point and Interval Estimation
• Point estimate: a sample statistic used to estimate the value of a population parameter.
• Confidence interval (interval estimate): a range of values, defined by the confidence level, within which the population parameter is estimated to fall.
• Confidence level: the likelihood, expressed as a percentage or a probability, that a specified interval will contain the population parameter.

Point Estimation
1. Provides a single value, based on observations from one sample; there is no sampling distribution.
2. Gives no information about how close the value is to the unknown population parameter.
3. Example: the sample mean $\bar{x}$ is the point estimate of the unknown population mean $\mu$.

Interval Estimation
1. Provides a range of values.
2. Gives information about closeness to the unknown population parameter.
3. Example: the unknown population mean lies between 50 and 70 with 95% confidence.

Key Elements of Interval Estimation
• A confidence interval runs from a lower confidence limit to an upper confidence limit, with the sample statistic (point estimate) between them.
• A probability is attached to the statement that the population parameter falls somewhere within the interval.

Inferential statistics involves three distributions:
• A population distribution: variation in the larger group that we want to know about.
• A distribution of sample observations: variation in the sample that we can observe.
• A sampling distribution: the distribution of a statistic over repeated samples. For the sample mean it is approximately normal and centered on the population mean, which is what allows one to infer the parameters from the statistics.

Confidence Levels
• 95% confidence level: about 95% of intervals constructed in this way from repeated samples contain the population mean. In other words, there are 5 chances out of 100 (or 1 chance out of 20) that such an interval does not contain the population mean.
• 99% confidence level: there is 1 chance out of 100 that the interval does not contain the population mean.

Constructing a Confidence Interval (CI)
• The sample mean is the point estimate of the population mean.
• The sample standard deviation is the point estimate of the population standard deviation.
• The standard error of the mean makes it possible to state the probability that an interval around the point estimate contains the actual population mean.
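The point estimates and the standard error of the mean can be computed in a few lines. The sketch below uses Python's standard `statistics` module on a small made-up sample (the scores are hypothetical, not data from these slides) and assumes the usual definition of the standard error, the sample standard deviation divided by the square root of n.

```python
import math
import statistics

# Hypothetical sample of scores (illustrative values only, not from the slides).
scores = [52.0, 61.0, 58.0, 49.0, 65.0, 55.0, 60.0, 57.0]

n = len(scores)
sample_mean = statistics.mean(scores)      # point estimate of the population mean
sample_sd = statistics.stdev(scores)       # point estimate of the population s.d. (n - 1 divisor)
standard_error = sample_sd / math.sqrt(n)  # standard error of the mean

print(f"n = {n}")
print(f"sample mean = {sample_mean:.3f}")
print(f"sample s.d. = {sample_sd:.3f}")
print(f"standard error of the mean = {standard_error:.3f}")
```

The standard error is always smaller than the sample standard deviation for n > 1, and it shrinks as the sample grows.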
Go Back to Meaning of Standard Error
The standard error of the mean can be interpreted as follows: if we were to take another sample from the population and compute its mean, there is a 68% chance that the mean of that sample would lie within 1 standard error of the population mean.

We can do better than that
If we were to take another sample from the population and compute its mean, there is an x% chance that the mean of the sample would lie within y standard errors of the population mean. For example, with 2.5% of the distribution of means $\bar{x}$ cut off in each tail, there is a 95% chance that the sample mean lies within 1.96 standard errors of the population mean.

Level of Confidence
1. The probability that the unknown population parameter falls within the interval, denoted $100(1 - \alpha)\%$.
2. $\alpha$ is the probability that the parameter is not within the interval.
3. Typical values for $100(1 - \alpha)\%$ are 99%, 95% and 90%.

Consider a 95% confidence interval:
• $1 - \alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$ in each tail.
• The area between the point estimate and each confidence limit is 0.475.
• The lower confidence limit $\mu_L$ corresponds to $Z = -1.96$ and the upper confidence limit $\mu_U$ corresponds to $Z = 1.96$.

Process for Constructing Confidence Intervals
• Compute the sample statistic (e.g. a mean)
• Compute the standard error of the mean
• Decide on the level of confidence that is desired (usually 95% or 99%)
• Find the tabled value for a 95% or 99% confidence interval
• Multiply the standard error of the mean by the tabled value
• Form the interval by adding the calculated value to, and subtracting it from, the mean

Interpretation
A 95% confidence interval means that if one were to take many samples from the population and build an interval from each in the same way, about 95% of those intervals would contain the population mean.

Example: Estimation of the mean
The mean of a random sample of $n = 25$ is $\bar{X} = 50$. Set up a 95% confidence interval estimate for $\mu_X$ if the population $\sigma_X = 10$.
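The example above can be worked through by following the construction process step by step. The sketch below is one way to do it in Python, assuming the population standard deviation is known; the helper name `z_confidence_interval` is made up for illustration, and the tabled value is obtained from `statistics.NormalDist` rather than read from a z table.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Return (lower, upper) confidence limits for the mean, sigma known."""
    alpha = 1 - confidence
    # Tabled value: the z with an area of alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = population_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Values from the example: n = 25, sample mean 50, population s.d. 10.
lower, upper = z_confidence_interval(sample_mean=50, population_sd=10, n=25)
print(f"95% CI: {lower:.2f} to {upper:.2f}")  # prints "95% CI: 46.08 to 53.92"
```

Here the standard error is 10/√25 = 2, so the interval is 50 ± 1.96 × 2 = 50 ± 3.92.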
Exercise: The mean birth weight for 200 babies is 3.28 kg, with a population standard deviation of 0.85 kg. Compute the 95% confidence limits for the mean birth weight.

$\mu = 3.28 \pm 1.96 \cdot \frac{0.85}{\sqrt{200}} = 3.28 \pm 0.12$

Exercise: The mean concentration for a sample of 100 insulin vials is 15 grams/vial, with a population standard deviation of 3.4 grams. Compute the 90% confidence limits for the mean concentration of insulin. Note that 90% confidence leaves 5% in each tail, hence look up 0.95 in the z table.

$\mu = 15 \pm 1.65 \cdot \frac{3.4}{\sqrt{100}} = 15 \pm 0.56$

There is One Little Problem
It doesn't work very well. Reread the question:
"The mean birth weight for 200 babies is 3.28 kg, with a population standard deviation of 0.85 kg. Compute the 95% confidence limits for the mean birth weight."
Can you spot the problem that makes the question practically unanswerable?

We don't actually have the population standard deviation; no one has the weights of every baby in the world. What we actually have is more likely the sample standard deviation.
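A small simulation can illustrate why this matters. The sketch below is not from the slides: it assumes a normal population with made-up parameters (mean 50, standard deviation 10), draws many samples of size 5, and counts how often a 1.96-based interval contains the true mean, first with the population standard deviation known and then with the sample standard deviation used in its place. The function name `coverage` and all parameter values are illustrative.

```python
import math
import random
import statistics

def coverage(n, trials, use_sample_sd, mu=50.0, sigma=10.0, z=1.96, seed=1):
    """Fraction of z-based intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        # Either the (normally unavailable) population s.d. or the sample s.d.
        sd = statistics.stdev(sample) if use_sample_sd else sigma
        margin = z * sd / math.sqrt(n)
        if mean - margin <= mu <= mean + margin:
            hits += 1
    return hits / trials

known = coverage(n=5, trials=20000, use_sample_sd=False)
estimated = coverage(n=5, trials=20000, use_sample_sd=True)
print(f"population s.d. known:    {known:.3f}")
print(f"sample s.d. used instead: {estimated:.3f}")
```

Under these assumptions the first figure should come out close to 0.95, while the second falls noticeably short of it for such a small sample.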
By using the sample standard deviation we introduce some error, and the intervals come out too narrow. To fix this we have to use a different distribution.

Standardizing $\bar{x}$
Just as we can standardize a normal distribution, we can also standardize the distribution of means:

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

95% of the z values will fall between -1.96 and +1.96. However, this assumes we know $\sigma$. Because we don't know $\sigma$, we will often substitute the sample standard deviation, $s$, for it:

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

The problem is that this quantity is no longer normally distributed. This is particularly a problem when n < 30.

The distribution of $t$ was discovered by Gosset, who published under the pen name "Student", and since then it has been called Student's t-distribution. The shape of the t-distribution depends on n.

Student's t-distribution
When the population variance is unknown and the sample is random, the distribution that correctly describes the standardized sample mean is known as the t-distribution.
• The t-distribution has larger reliability (cutoff) values for a given level of alpha than the normal distribution, but as the sample size increases, the cutoff values approach those of the normal distribution. For small sample sizes, use of the t-distribution instead of the z-distribution to determine reliability factors is critical.
• The t-distribution is a symmetrical distribution whose probability density function is defined by a single parameter known as the degrees of freedom (df).
• The t-distribution is symmetrical around its mean of zero, like the Z distribution.
• Compared with the Z distribution, a larger portion of the probability area is in the tails.
• As n increases, the t-distribution approaches the Z distribution.
• t values depend on the degrees of freedom.

Degrees of freedom
The parameter that completely characterizes a t-distribution.
• The degrees of freedom for a given t-distribution are equal to the sample size minus 1. For a sample size of 45, the degrees of freedom are 44.
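To see how the cutoff values depend on the degrees of freedom, the following sketch computes two-sided 95% critical values of the t-distribution using only the Python standard library. The helper names (`t_pdf`, `t_cdf`, `t_critical`) are made up for illustration, and the numerical approach (Simpson's rule plus bisection) is chosen only to keep the example self-contained; in practice a t table or a statistics library would normally be used.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, integrating the density from 0 to x with Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3  # 0.5 is the area to the left of zero

def t_critical(df, confidence=0.95):
    """Two-sided critical value: the t with an area of alpha/2 in each tail."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(50):  # bisection on the cumulative probability
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (5, 10, 30, 44, 120):
    print(f"df = {df:3d}: t = {t_critical(df):.3f}")
print("normal distribution: z = 1.960")
```

For df = 5 this gives a value of about 2.571, and the cutoffs shrink toward 1.96 as the degrees of freedom increase.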
Confidence Limits for Small Samples

$\mu = \bar{x} \pm t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$

where $t_{\alpha/2,\,n-1}$ is the critical value of the t-distribution with n - 1 d.f. and an area of α/2 in each tail.

Example: 6 random vials of penicillin were selected and the concentration of penicillin in each vial was determined, in mg/ml:
8.6, 9.7, 13.4, 11.4, 10.2, 12.3
Find the 95% confidence limits for the true mean concentration of penicillin.

Since the sample size is less than 30, we use the t-statistic to determine the confidence limits.
n = 6
Mean = 10.93
Sample standard deviation = 1.77
df = 6 - 1 = 5
Confidence level = 95%
$t_{5,\,95\%} = 2.571$

Therefore $\mu = 10.93 \pm 2.571 \cdot \frac{1.77}{\sqrt{6}} = 10.93 \pm 1.86$

Compare this with the normal Z statistic at 95%, which is 1.96.

Estimation Question:
1. How many parameters does it require to describe a binomial distribution?
   Number of trials n, and probability of success in a single trial, p.
2. How many parameters does it require to describe a normal distribution?
   Mean and standard deviation.
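The penicillin example from Confidence Limits for Small Samples can be reproduced with a few lines of Python. This is a sketch, not part of the original slides: the critical value 2.571 is taken from the example rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Concentrations in mg/ml, as listed in the penicillin example.
concentrations = [8.6, 9.7, 13.4, 11.4, 10.2, 12.3]
t_crit = 2.571  # tabled t value for df = 5 at 95% confidence (from the example)

n = len(concentrations)
mean = statistics.mean(concentrations)
sd = statistics.stdev(concentrations)   # sample standard deviation (n - 1 divisor)
margin = t_crit * sd / math.sqrt(n)     # t times the standard error of the mean

print(f"n = {n}, df = {n - 1}")
print(f"mean = {mean:.2f}, s = {sd:.2f}")
print(f"95% confidence limits: {mean:.2f} +/- {margin:.2f}")
```

The printed mean, standard deviation and margin should agree with the 10.93, 1.77 and 1.86 in the worked example. Using 1.96 in place of 2.571 would give a narrower interval.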