Section 10.1: Estimating with Confidence
AP Statistics, January 2013

Sample proportions? Make sure the population size is at least 10n so you may use $\sigma_{\hat{p}} = \sqrt{pq/n}$.
- np < 10 or nq < 10? Use Binomial distribution tools.
- np ≥ 10 and nq ≥ 10? Use Normal distribution tools.

Sample means? Make sure the population size is at least 10n so you may use $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
- Is the population distribution normal? Use Normal distribution tools.
- Is the shape of the population distribution unknown or distinctly nonnormal? If n ≥ 30, the Central Limit Theorem applies, so you may use Normal distribution tools. Otherwise, you need other tools.

An introduction to statistical inference
Statistical inference provides methods for drawing conclusions about a population from sample data. In other words, from looking at a sample, how much can we "infer" about the population? We may only make inferences about the population if our samples are unbiased. This happens when we get our data from an SRS or a well-designed experiment.

Example
An SRS of 500 California high school seniors finds that their mean score on the SAT Math is 461. The standard deviation of all California high school seniors on this test is 111. What can you say about the mean of all California high school seniors on this exam?

Example (What we know)
- The data come from an SRS and are therefore unbiased.
- There are approximately 350,000 California high school seniors, and 350,000 > 10 × 500.
- We can estimate $\sigma_{\bar{x}}$ as $\sigma/\sqrt{n} = 111/\sqrt{500} \approx 4.96$.
- The sample mean 461 is one value in the distribution of sample means $\bar{x}$.
- The mean of the distribution of sample means is the same as the population mean.
- Because n > 30, the distribution of sample means is approximately normal (Central Limit Theorem).

Our sample is just one value in a distribution with unknown mean…

Confidence Interval
A level C confidence interval for a parameter has two parts:
- An interval calculated from the data, usually in the form estimate ± margin of error.
- A confidence level C, which gives the long-run proportion of repeated samples in which the interval captures the true parameter value.

Conditions for Confidence Intervals
- The data come from an SRS or a well-designed experiment from the population of interest.
- The sampling distribution of $\bar{x}$ is approximately normal.

Confidence Interval Formulas
$$CI = \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$
$$CI = \left( \bar{x} - z^* \frac{\sigma}{\sqrt{n}},\ \bar{x} + z^* \frac{\sigma}{\sqrt{n}} \right)$$
where $z^*$ is the upper p critical value, with tail area p = (1 − C)/2.

Using the z table…
Confidence level   Tail area   z*
90%                .05         1.645
95%                .025        1.960
99%                .005        2.576

Four Step Process (Inference Toolbox)
Step 1 (Population and parameter)
- Define the population and parameter you are investigating.
Step 2 (Conditions)
- Do we have biased data? If SRS, we're good. Otherwise PWC (proceed with caution).
- Do we have independent sampling? If pop > 10n, we're good. Otherwise PWC.
- Do we have a normal distribution? If pop is normal or n > 30, we're good. Otherwise PWC.
Step 3 (Calculations)
- Find z* based on your confidence level. If you are not given a confidence level, use 95%.
- Calculate the CI.
Step 4 (Interpretation)
- "With ___% confidence, we believe that the true mean is between (lower, upper)."

Confidence interval behavior
$$\text{margin of error} = z^* \frac{\sigma}{\sqrt{n}}$$
To make the margin of error smaller…
- make z* smaller, which means you have lower confidence
- make n bigger, which will cost more

If you know a particular confidence level and ME, you can solve the margin of error formula for your sample size n.

Example
Company management wants a report on screen tensions, which have a standard deviation of 43 mV. How big does the sample have to be to estimate the mean within 5 mV with 95% confidence? You need a sample size of at least 285.
$$ME = z^* \frac{\sigma}{\sqrt{n}}$$
$$1.96 \cdot \frac{43}{\sqrt{n}} \le 5$$
$$\sqrt{n} \ge \frac{1.96 \cdot 43}{5}$$
$$n \ge \left( \frac{1.96 \cdot 43}{5} \right)^2 \approx 284.12$$
Since n must be a whole number, round up to n = 285.

Mantras
"Interpret 80% confidence interval of (454, 467)"
With 80% confidence, we believe that the true mean of California seniors' SAT-M scores is between 454 and 467.
"Interpret 80% confidence"
If we use these methods repeatedly, 80% of the time our confidence interval captures the true mean.

Assignment
Exercises 10.1 to 10.8
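
As a numerical check on the two worked examples above (the 80% interval for the California SAT Math mean and the screen tension sample size), here is a minimal Python sketch. It is not part of the original slides. The function names `z_star`, `z_interval`, and `required_sample_size` are hypothetical, and it assumes the population standard deviation is known. It gets z* from the standard library's `statistics.NormalDist` rather than from the z table, so the critical values carry a few more decimal places than the table entries.

```python
import math
from statistics import NormalDist


def z_star(confidence):
    """Upper critical value z* for a level-C interval (tail area (1 - C) / 2)."""
    tail_area = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail_area)


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval x-bar +/- z* * sigma / sqrt(n), assuming sigma is known."""
    margin = z_star(confidence) * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest whole n whose margin of error is at most `margin`."""
    return math.ceil((z_star(confidence) * sigma / margin) ** 2)


# California SAT Math example: x-bar = 461, sigma = 111, n = 500, C = 80%
low, high = z_interval(461, 111, 500, confidence=0.80)
print(f"80% CI for the SAT Math mean: ({low:.1f}, {high:.1f})")

# Screen tension example: sigma = 43 mV, margin of error 5 mV, C = 95%
print("Screen tension sample size:", required_sample_size(43, 5))
```

Run as written, this should report an interval of roughly (454.6, 467.4), in line with the (454, 467) quoted in the mantra, and a sample size of 285. The sample size is rounded up with `math.ceil` because rounding 284.12 down to 284 would leave the margin of error slightly above 5 mV.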