Confidence Intervals

Topics:
• Essentials
• Inferential Statistics
• Terminology
• Margin of Error
• \(z_{\alpha/2}\)
• Decision Grid
• Examples

Summary of formulas

Confidence interval for a mean: X% C.I. = \((\bar{x} - E < \mu < \bar{x} + E)\)
• Large sample (\(n \ge 30\)): \(\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}\) when \(\sigma\) is known, or \(\bar{x} \pm z_{\alpha/2}\,s/\sqrt{n}\) when only \(s\) is known (\(\sigma\) unknown)
• Small sample (\(n < 30\), \(\sigma\) unknown), Student's t distribution: \(\bar{x} \pm t_{\alpha/2,\,df}\,s/\sqrt{n}\)

Confidence interval for a proportion: X% C.I. = \((\hat{p} - E < p < \hat{p} + E)\)
• \(\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}\)

Essentials: Confidence Intervals (How sure we are.)
• Inferential statistics, precision and the margin of error.
• Obtaining a confidence interval.
• \(z_{\alpha/2}\)
• Guinness, Gosset & the Student's t distribution
• Confidence intervals for large and small samples, and proportions.

Inferential Statistics

INFERENTIAL STATISTICS: Uses sample data to make estimates, decisions, predictions, or other generalizations about the population.

The aim of inferential statistics is to make an inference about a population, based on a sample (as opposed to a census), AND to provide a measure of precision for the method used to make the inference.

An inferential statement uses data from a sample and applies it to a population.

Some Terminology

Estimation: the process of estimating the value of a parameter from information obtained from a sample.

Estimators: sample measures (statistics) that are used to estimate population measures (parameters).

Point Estimate: a specific numerical value estimate of a parameter.

Interval Estimate: an interval or range of values used to estimate the parameter. It may or may not contain the actual value of the parameter being estimated.

Confidence Level: the probability that the interval estimate of a parameter will contain the parameter.

Confidence Interval: a specific interval estimate of a parameter, determined by using data obtained from a sample and by using a specific confidence level.
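The definition of confidence level above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with a known standard deviation, the values 100 and 15 are made up for illustration, and `coverage_rate` is a hypothetical helper name. It draws many samples, builds a 95% interval from each, and counts how often the interval contains the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, confidence_level, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # z_{alpha/2}: the point with area alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)  # sigma is treated as known (assumption)
    hits = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - margin < mu < sample_mean + margin:
            hits += 1
    return hits / trials


# Illustrative (made-up) population: mean 100, standard deviation 15.
rate = coverage_rate(mu=100.0, sigma=15.0, n=30, confidence_level=0.95, trials=2000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

With enough trials the printed share should settle near the chosen confidence level, which is what "95% confident" refers to: the method captures the parameter in about 95% of samples.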
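Margin of Error, E

\[ E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

The term \(z_{\alpha/2}\,\sigma/\sqrt{n}\) is called the maximum error of estimate, or margin of error. It is the maximum likely difference between the point estimate of a parameter and the actual value of the parameter. It is represented by a capital E.

\(z_{\alpha/2}\): Areas in the Tails

Obtaining \(\alpha/2\): convert the confidence level to a decimal, e.g. 95% = .95. Then:

\[ \frac{\alpha}{2} = \frac{1 - \mathrm{C.L.}/100}{2} = \frac{1 - 95/100}{2} = \frac{1 - .95}{2} = \frac{.05}{2} = .025 \]

[Figure: normal curve with the central 95% lying between \(-z_{\alpha/2}\) (here -1.96) and \(z_{\alpha/2}\) (here 1.96), and an area of .025 in each tail.]

Decision Grid

Confidence interval for a mean: X% C.I. = \((\bar{x} - E < \mu < \bar{x} + E)\)

Sample size  |  \(\sigma\) known  |  \(s\) known (\(\sigma\) unknown)
\(n < 30\)  |  \(\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}\)  |  \(\bar{x} \pm t_{\alpha/2,\,df}\,s/\sqrt{n}\)
\(n \ge 30\)  |  \(\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}\)  |  \(\bar{x} \pm z_{\alpha/2}\,s/\sqrt{n}\)

Confidence interval for a proportion: X% C.I. = \((\hat{p} - E < p < \hat{p} + E)\), with \(\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}\)

t or z?

Is \(\sigma\) known?
• Yes: use \(z_{\alpha/2}\) values no matter what the sample size is.*
• No: is n greater than or equal to 30?
  • Yes: use \(z_{\alpha/2}\) values and replace \(\sigma\) in the formula with s (the sample standard deviation).
  • No: use t-values and s in the formula.**

*Variable must be normally distributed when n < 30.
**Variable must be approximately normally distributed.

Situation #1: Large Samples or Normally Distributed Small Samples

• A population mean is unknown to us, and we wish to estimate it.
• Sample size is ≥ 30, and the population standard deviation is known or unknown.
• OR sample size is < 30, the population standard deviation is known, and the population is normally distributed.
• The sample is a simple random sample.

Confidence Interval for \(\mu\) (Situation #1)

A \((1 - \alpha)\) confidence interval for \(\mu\) is given by

\[ \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \quad (\sigma \text{ known}) \qquad \text{or} \qquad \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} \quad (\sigma \text{ unknown},\ n \ge 30) \]

Consider

The mean paid attendance for a sample of 30 Major League All Star games was $46,970.87, with a standard deviation of $14,358.21. Find a 95% confidence interval for the mean paid attendance at all Major League All Star games.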
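A minimal sketch of the calculation this example asks for, using only Python's standard library. The helper names `z_critical` and `z_interval` are hypothetical (they do not come from the slides), and the sample standard deviation stands in for σ because n = 30.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence_level):
    """Return z_{alpha/2} for a confidence level given as a decimal, e.g. 0.95."""
    alpha = 1.0 - confidence_level
    # The area to the left of the upper critical value is 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_interval(mean, std_dev, n, confidence_level):
    """Return (lower, upper, margin) for mean +/- z_{alpha/2} * std_dev / sqrt(n)."""
    margin = z_critical(confidence_level) * std_dev / sqrt(n)
    return mean - margin, mean + margin, margin


# Values taken from the All Star example above.
low, high, margin = z_interval(46970.87, 14358.21, 30, 0.95)
print(f"z = {z_critical(0.95):.3f}, E = {margin:.2f}, interval = ({low:.2f}, {high:.2f})")
```

Because the code uses the unrounded critical value rather than the table value 1.96, its margin can differ from a hand calculation by a few cents.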
95% Confidence Interval for the Mean Paid Attendance at the Major League All Star Games

\[ \$46{,}970.87 \pm 1.96\left(\frac{\$14{,}358.21}{\sqrt{30}}\right) = \$46{,}970.87 \pm \$5{,}138.02 \]

\[ (\$41{,}832.85 < \mu < \$52{,}108.89) \]

Minimum Sample Size Needed

The minimum sample size needed for an interval estimate of the population mean is given by

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 \]

where E is the margin of error (maximum error of estimate).

Situation #2: Small Samples

• A population mean is unknown to us, and we wish to estimate it.
• Sample size is < 30, and the population standard deviation is unknown.
• The variable is normally or approximately normally distributed.
• The sample is a simple random sample.

Student t Distribution

• Is bell-shaped.
• Is symmetric about the mean.
• The mean, median, and mode are equal to 0 and are located at the center of the distribution.
• Curve never touches the x-axis.
• Variance is greater than 1.
• As sample size increases, the t distribution approaches the standard normal distribution.
• Has n - 1 degrees of freedom.

[Figure: Student t distributions for n = 3 and n = 12 compared with the standard normal distribution, all centered at 0.]

Confidence Interval for \(\mu\) (Situation #2)

A \((1 - \alpha)\) confidence interval for \(\mu\) is given by

\[ \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \]

Consider

The mean salary of a sample of n = 12 commercial airline pilots is $97,334, with a standard deviation of $17,747. Find a 90% confidence interval for the mean salary of all commercial airline pilots.

90% Confidence Interval for the Mean Salary of Commercial Airline Pilots

\[ \$97{,}334 \pm 1.796\left(\frac{\$17{,}747}{\sqrt{12}}\right) = \$97{,}334 \pm \$9{,}201.12 \]

\[ (\$88{,}132.88 < \mu < \$106{,}535.12) \]

Situation #3: Confidence Interval for a Proportion

A confidence interval for a population proportion p is given by

\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \]

where
• \(\hat{p}\) is the sample proportion,
• \(\hat{q} = 1 - \hat{p}\),
• n = sample size,
• \(n\hat{p}\) and \(n\hat{q}\) must both be greater than or equal to 5.

Consider

In a recent survey of 150 households, 54 had central air conditioning. Find the 90% confidence interval for the true proportion of households that have central air conditioning.
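A minimal sketch of the proportion interval this example asks for, using only Python's standard library. The function name `proportion_interval` is a hypothetical helper, and the check that both counts are at least 5 mirrors the condition stated in Situation #3.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence_level):
    """Return (lower, upper, margin) for p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    # Condition from the slides: n * p_hat and n * q_hat must both be >= 5.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n * p_hat and n * q_hat must both be at least 5")
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin, margin


# Values taken from the household survey above: 54 of 150, 90% confidence.
low, high, margin = proportion_interval(54, 150, 0.90)
print(f"p_hat = {54 / 150:.2f}, E = {margin:.4f}, interval = ({low:.4f}, {high:.4f})")
```

The code uses the unrounded critical value instead of the table value 1.645, so the last digit may differ slightly from a hand calculation.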
Here

\[ \hat{p} = \frac{54}{150} = .36, \qquad \hat{q} = 1 - \hat{p} = 1 - .36 = .64, \qquad n = 150 \]

(NOTE: both \(n\hat{p}\) and \(n\hat{q}\) are greater than 5.)

\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = .36 \pm 1.645\sqrt{\frac{(.36)(.64)}{150}} \]

90% C.I. = .36 ± .064, or
90% C.I. = (.296, .424), or
90% C.I. = (.296 < p < .424)

We can be 90% confident that the true proportion, p, of all homes having central air conditioning is between 29.6% and 42.4%.

Minimum Sample Size Needed

The minimum sample size needed for an interval estimate of a population proportion is given by

\[ n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2 \]

where E is the maximum error of estimate (margin of error).
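The two minimum-sample-size formulas from these slides can be sketched as follows. The helper names are hypothetical, and the target margins of error (5,000 and 0.03) are made-up illustration values, not figures from the slides. Results are rounded up, since a fractional observation cannot be sampled.

```python
from math import ceil
from statistics import NormalDist


def z_critical(confidence_level):
    """Return z_{alpha/2} for a confidence level given as a decimal."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)


def min_n_for_mean(sigma, margin, confidence_level):
    """n = (z_{alpha/2} * sigma / E)^2, rounded up to a whole number."""
    return ceil((z_critical(confidence_level) * sigma / margin) ** 2)


def min_n_for_proportion(p_hat, margin, confidence_level):
    """n = p_hat * q_hat * (z_{alpha/2} / E)^2, rounded up to a whole number."""
    q_hat = 1.0 - p_hat
    return ceil(p_hat * q_hat * (z_critical(confidence_level) / margin) ** 2)


# Standard deviation and p_hat reuse the example values; the margins are assumed.
n_mean = min_n_for_mean(sigma=14358.21, margin=5000.0, confidence_level=0.95)
n_prop = min_n_for_proportion(p_hat=0.36, margin=0.03, confidence_level=0.90)
print(n_mean, n_prop)
```

Halving the margin of error roughly quadruples the required sample size, because E appears squared in the denominator of both formulas.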