Chapter 7
1.18.2017

Activity
• You will receive 6 random cards
  – Randomly choose 5 of the six
  – Calculate the median of your 5 cards
  – Put it on the dotplot
• Then trade 2 (randomly chosen) of your cards to another person for 2 of theirs
  – Then trade 2 (randomly chosen) again with a different person
• Now choose 5 cards again, and calculate the median again
• Trade again
  – So you will put 3 total dots on the board

Parameters vs Statistics
• If we want to estimate something about a population, we often take a sample
• Think of political polls: we take a sample of 500 or 1000 people to try to estimate how the entire country is feeling
• The percentage of people in the sample who support a certain candidate is a statistic
• The percentage of people in the population that support a certain candidate is a parameter

An Example
• I take a random sample of 10 AP Statistics students
• I measure their final exam scores. Their average was 58.1%
  – I conclude that the average final exam score for all AP Statistics students was probably around 58%
• The 58.1% for the 10 students is _________
• The 58% estimated score is ____________
• What is the population?
• What is the sample?
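
The distinction between a parameter and a statistic in the example above can be sketched in a few lines of Python. This is only an illustration: the population of scores is invented (the slides give just the sample mean of 58.1%), and the names `population` and `sample`, the population size of 2,000, and the seed are all assumptions.

```python
import random
import statistics

random.seed(7)  # arbitrary seed so the sketch is repeatable

# Hypothetical population: final exam scores for 2,000 students.
# These values are made up; the slides do not give the real scores.
population = [min(100.0, max(0.0, random.gauss(58, 15))) for _ in range(2000)]

# Parameter: a number describing the whole population (usually unknown).
parameter = statistics.mean(population)

# Sample: 10 students chosen at random, without replacement.
sample = random.sample(population, 10)

# Statistic: the same summary, computed from the sample only.
statistic = statistics.mean(sample)

print(f"parameter (population mean):   {parameter:.1f}")
print(f"statistic (sample mean, n=10): {statistic:.1f}")
```

In practice only the statistic is available; the parameter can be printed here only because the whole population was simulated.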
Sampling Variability
• I take a different random sample of 10 students
  – The mean is 60.05
• If I took another different sample, it would probably be different from both of the first 2
  – This is the idea of sampling variability
  – 2 different random samples are likely to produce somewhat different statistics
• If we take a bunch of different random samples, the distribution of the estimates (or “statistics”) will have a center very close to the true value
• Look at the cards: True median is 6

Describing a Sampling Distribution
• Shape: roughly symmetric with a peak at 6
• Center: mean/median of approximately 6
• Spread: Standard deviation of roughly 1.8
• Outliers: No outliers

Population Distribution
• This would be the distribution if we had the entire population
  – We never do
• But with enough samples, the center of the sampling distribution closely approximates the true population value

Unbiased Estimator
• A statistic used to estimate a parameter is unbiased if the mean of the sampling distribution is equal to (or very close to) the true value of the parameter being estimated
• If they are different, then there is bias
  – Why could there be bias in a statistic?
• Bad sampling is the most common reason: the sample is not representative of the population

Variability
• Which gives us more confidence in our conclusions?
  – The mean of your statistic is 54.3 with a standard deviation of 29.5
  – The mean of your statistic is 54.3 with a standard deviation of 8.4
• A “tighter” distribution gives us more confidence that the mean truly is near 54.3
• We would say that there is less variability (which is good)

Bias & Variability
• Which one has low bias with high variability?
• High bias, low variability?
• Low bias, low variability?
• High bias, high variability?
• Which one is best?
  – Low bias, low variability
• Which one is worst?
  – High bias, high variability
• Debates about second best
  – I lean towards low bias, high variability

Question #1 on your Homework
• A random sample of 1000 people who signed a card saying that they intended to quit smoking were contacted 9 months later. 210 of the 1000 sampled individuals (21%) had not smoked over the past six months
  – Population: People who signed a card saying that they would quit smoking
  – Parameter: Proportion who actually quit (or proportion who didn’t actually quit)
  – Sample: Random sample of 1000 people who signed the card
  – Statistic: 0.21
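
The statistic in Question #1 and the idea of sampling variability from earlier in the chapter can be sketched together. The simulation below treats 0.21 as if it were the true parameter, which is an assumption made only for illustration (the real parameter is unknown). The helper `sample_proportion`, the 500 repetitions, and the seed are likewise hypothetical choices.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the sketch is repeatable

# Statistic from the homework: 210 of 1000 sampled signers had not smoked.
n = 1000
p_hat = 210 / n

# Assumption for illustration only: pretend the true parameter is 0.21.
assumed_p = 0.21

def sample_proportion(p, n):
    """Simulate one random sample of size n and return its sample proportion."""
    return sum(random.random() < p for _ in range(n)) / n

# Repeat the sampling many times to see how much the statistic would vary.
estimates = [sample_proportion(assumed_p, n) for _ in range(500)]

print(f"observed statistic: {p_hat:.2f}")
print(f"mean of simulated statistics: {statistics.mean(estimates):.3f}")
print(f"standard deviation of simulated statistics: {statistics.stdev(estimates):.3f}")
```

Under this assumption, the mean of the simulated statistics should sit close to the assumed parameter (low bias), and their standard deviation shows how much one sample of 1000 can differ from the next (variability).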