Survey
Sampling Distributions

OBJECTIVE
Set up a sampling distribution; the Central Limit Theorem (CLT) and its applications.

RELEVANCE
To see how sampling can be used to predict population values.
• The U.S. census is done only once every 10 years because it is impractical to do it more often.
• Therefore, the sample becomes very important.

The Sampling Issue……
• The goal of a survey is to get the same results that would be obtained if every member of the population had answered.
• It is important that every member of the population has an equal chance of being chosen.

Activity……
• Get in groups of 4.
• Find the mean height of your group.
• Now find the mean height of the class.
• Is the class mean the same as your group's mean?

Sample Mean vs. Population Mean……
• Not every sample mean will be the same as the population mean, but if you take good samples the means will be very close.
• $\bar{x}$ = sample mean
• $\mu$ = population mean
• $\mu_{\bar{x}}$ = mean of the sample means

What is a Sampling Distribution?
Sampling Distribution – the distribution of values of a sample statistic obtained from repeated samples, all of the same size and all drawn from the same population.

Example……
• Consider the following set: {0, 2, 4, 6, 8}.
a. Make a list of all possible samples of size 2 that can be drawn from this set.
b. Construct a sampling distribution of the sample means for samples of size 2.
c. Graph the histogram of the population and of the sampling distribution. What do you notice?

a. Sets of 2 from {0, 2, 4, 6, 8} (ordered pairs, drawn with replacement)
(0,0)  (0,2)  (0,4)  (0,6)  (0,8)
(2,0)  (2,2)  (2,4)  (2,6)  (2,8)
(4,0)  (4,2)  (4,4)  (4,6)  (4,8)
(6,0)  (6,2)  (6,4)  (6,6)  (6,8)
(8,0)  (8,2)  (8,4)  (8,6)  (8,8)

b. First find the mean of each sample……
(0,0) 0   (0,2) 1   (0,4) 2   (0,6) 3   (0,8) 4
(2,0) 1   (2,2) 2   (2,4) 3   (2,6) 4   (2,8) 5
(4,0) 2   (4,2) 3   (4,4) 4   (4,6) 5   (4,8) 6
(6,0) 3   (6,2) 4   (6,4) 5   (6,6) 6   (6,8) 7
(8,0) 4   (8,2) 5   (8,4) 6   (8,6) 7   (8,8) 8

Sample Space
• Notice that each of these 25 samples is equally likely to occur.
• Therefore, the probability of each sample is 1/25 = 0.04.
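The enumeration above can be reproduced with a short Python sketch. It assumes the samples are ordered pairs drawn with replacement, as the 25 listed pairs suggest. The function name `sample_mean_counts` is illustrative and does not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sample_mean_counts(population, n):
    """Count how often each sample mean occurs over all ordered
    samples of size n drawn with replacement from `population`."""
    samples = product(population, repeat=n)
    # Fraction keeps the means exact (no floating-point rounding).
    return Counter(Fraction(sum(sample), n) for sample in samples)


counts = sample_mean_counts([0, 2, 4, 6, 8], 2)
total = sum(counts.values())  # number of equally likely samples

# Print each sample mean with its probability, e.g. "4  5/25 = 0.20".
for mean in sorted(counts):
    print(f"{float(mean):g}  {counts[mean]}/{total} = {counts[mean] / total:.2f}")
```

Because every sample has probability 1/25, the probability of a given sample mean is simply the number of samples that produce it, divided by 25.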
The sampling distribution of the sample means (SDSM)

$\bar{x}$    $P(\bar{x})$
0    1/25
1    2/25
2    3/25
3    4/25
4    5/25
5    4/25
6    3/25
7    2/25
8    1/25

Notice it is approximately NORMAL!

Example – You Try……
• Let's say I picked out all the grades for the last quiz that were either 57, 67, 77, 87, or 97 and put them in a pile. Find every possible combination of quiz grades I could get if I picked 2 quizzes from this pile.
• NOTE: There will be 25 possible combinations.

Now let's find the mean for each pair
(57, 57)  (67, 57)  (77, 57)  (87, 57)  (97, 57)
(57, 67)  (67, 67)  (77, 67)  (87, 67)  (97, 67)
(57, 77)  (67, 77)  (77, 77)  (87, 77)  (97, 77)
(57, 87)  (67, 87)  (77, 87)  (87, 87)  (97, 87)
(57, 97)  (67, 97)  (77, 97)  (87, 97)  (97, 97)

There are 25 possible combinations
(57, 57) 57   (57, 67) 62   (57, 77) 67   (57, 87) 72   (57, 97) 77
(67, 57) 62   (67, 67) 67   (67, 77) 72   (67, 87) 77   (67, 97) 82
(77, 57) 67   (77, 67) 72   (77, 77) 77   (77, 87) 82   (77, 97) 87
(87, 57) 72   (87, 67) 77   (87, 77) 82   (87, 87) 87   (87, 97) 92
(97, 57) 77   (97, 67) 82   (97, 77) 87   (97, 87) 92   (97, 97) 97
• Each pair has a 1/25 chance of selection.
• Let's make a chart.

Chart and Graph
$\bar{x}$    $P(\bar{x})$
57    1/25 = 0.04
62    2/25 = 0.08
67    3/25 = 0.12
72    4/25 = 0.16
77    5/25 = 0.20
82    4/25 = 0.16
87    3/25 = 0.12
92    2/25 = 0.08
97    1/25 = 0.04

[Figure: "Probability Distribution of Means", a histogram of Probability (0 to 0.25) against Quiz Pair Means (57 to 97).]

Sampling Distribution of Sample Means (SDSM)

SDSM……
• If all possible random samples, each of size n, are taken from any population with mean $\mu$ and standard deviation $\sigma$, then the SDSM will:
1. Have a sampling distribution mean equal to the population mean: $\mu_{\bar{x}} = \mu$
2. Have a sampling distribution standard deviation equal to the population standard deviation divided by the square root of the sample size.
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

The shape of the distribution……
• If the population has a normal distribution, then the sampling distribution of the sample means will also be normal.
• If the population is NOT normally distributed, then we use the Central Limit Theorem: with a large enough sample size, the sampling distribution is approximately normal.

The CLT……
• Definition – The SDSM will more closely resemble the normal distribution as the sample size increases.
• The CLT can be used to answer questions about sample means in the same manner that the normal distribution can be used to answer questions about individual values.
• **The CLT is used when the sampled population is NOT normal. The sampling distribution will be approximately normal under the right conditions.

The Standard Error of the Mean……
• The symbol used to represent the standard deviation of the sample means, also known as the standard error of the mean, is $\sigma_{\bar{x}}$.

The SDSM follows these rules…….
1. The mean: $\mu_{\bar{x}} = \mu$
2. The standard deviation: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$. This measures the spread. (Note: "n" is the size of each sample.)
3. a. A normal parent population produces a normal sampling distribution.
   b. Use the CLT when the parent population is NOT normal and the sample size is large enough to make the sampling distribution approximately normal.

Let's show how this works using an example…..
• Consider all possible samples of size 2 from {2, 4, 6}. Find the probability distribution of the population with its histogram, then find the sampling distribution of the sample means and draw its histogram.

Probability Distribution of Parent & Histogram……
$x$    $P(x)$
2    1/3
4    1/3
6    1/3

$\mu = \sum [x \, P(x)] = 4$
$\sigma = \sqrt{\sum [x^2 \, P(x)] - \mu^2} \approx 1.63$
• Now, let's do a sampling distribution of sets of 2 from this population we just described.

The sets of 2 and their means……
(2,2) 2   (4,2) 3   (6,2) 4
(2,4) 3   (4,4) 4   (6,4) 5
(2,6) 4   (4,6) 5   (6,6) 6

Sampling Distribution……
$\bar{x}$    $P(\bar{x})$
2    1/9
3    2/9
4    3/9
5    2/9
6    1/9
• Find the mean of the sampling distribution: $\mu_{\bar{x}} = 4$. Note: $\mu_{\bar{x}} = \mu$.
• Find the standard deviation of the sampling distribution: $\sigma_{\bar{x}} \approx 1.15$. Note: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{1.63}{\sqrt{2}} \approx 1.15$.

The Histogram……
• Now, take a look at the shape of the histogram of the sampling distribution. It is approximately normal.

Properties of SDSM – Center, Shape, Spread
• Center: $\mu_{\bar{x}} = \mu$
• Shape: the distribution of the SDSM was approximately normal.
• Spread: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

Sample Question
• A certain population has a mean of 437 and a standard deviation of 63. Many samples of size 49 are randomly selected and the means are calculated.
• A. What value would you expect to find for the mean of all these sample means? 437
• B. What value would you expect to find for the standard deviation of all these sample means? $\dfrac{63}{\sqrt{49}} = 9$
• C. What shape would you expect the distribution of all these sample means to have? Approximately normal

SDSM Applications

Remember……
• Use "ncdf(z, z)" (normalcdf with a lower and an upper z-score) to find area or probability under the curve.
• Change all "real" values to z-scores if the mean is not 0.
• Population Mean = Mean of the Sample Means: $\mu = \mu_{\bar{x}}$
• St. Error of the Mean: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

Why is Sample Size Important?
• If $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$, what happens as the sample size increases?
• Answer: As the sample size increases, the standard deviation of the sample means decreases. This means that the variation is decreasing. Remember, less variation is better.
• Larger sample size - smaller variation
• Smaller sample size - larger variation

Example……Follow the steps
• A normal population has a population mean of 100 and a population standard deviation of 20. If a sample of size 16 is selected, what is the probability that this sample will have a mean value between 90 and 110?
• Draw the normal distribution curve and shade it.
• You need to change 90 and 110 to z-scores.
• Then use normalcdf(z, z) to find the probability.
• The z-score formula will be a little bit different now because the standard deviation of the population must be replaced by the standard error of the mean. You now use $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• Let's change the mean values of 90 and 110.
$z_{90} = \dfrac{90 - 100}{20 / \sqrt{16}} = \dfrac{-10}{5} = -2$
$z_{110} = \dfrac{110 - 100}{20 / \sqrt{16}} = \dfrac{10}{5} = 2$
• Now use normalcdf from where you started shading to where you stopped shading:
• ncdf(-2, 2) = 0.9545

Example……You Try
• Kindergarten children have heights that are approximately normally distributed with a population mean of 39 inches and a population standard deviation of 2 inches. A sample of 25 is taken. What is the probability that this sample will have a mean value between 38.5 inches and 40 inches?

Answer……
$z_{38.5} = \dfrac{38.5 - 39}{2 / \sqrt{25}} = \dfrac{-0.5}{2/5} = -1.25$
$z_{40} = \dfrac{40 - 39}{2 / \sqrt{25}} = \dfrac{1}{2/5} = 2.5$
ncdf(-1.25, 2.5) = 0.8881

Cutoff Example
• If the population mean of a distribution is 39 and the population standard deviation is 2, within what limits does the middle 90% of sample means fall for samples of size 100?
• Hint: This is a cutoff score in the middle. First, you find the z-scores. Next, you substitute them back into the z-score formula.

Answer……
• Find the z-scores for the middle 90%: z = InvNorm(0.5 ± 0.90/2), so z = ±1.64
• Now, plug these into the formula with the new standard deviation for a sample (the standard error):
$\bar{x} = 39 \pm 1.64 \cdot \dfrac{2}{\sqrt{100}}$
$\bar{x} = 38.67$ and $\bar{x} = 39.33$
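The three application examples above can be checked with a short Python sketch. It assumes Python 3.8 or later, where `statistics.NormalDist` is available, and the helper names (`standard_error`, `prob_mean_between`, `middle_limits`) are hypothetical, chosen only for illustration. Rather than converting to z-scores first, it works directly with a normal distribution whose standard deviation is the standard error, which gives the same result.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_between(mu, sigma, n, low, high):
    """P(low < sample mean < high), treating the SDSM as normal."""
    sdsm = NormalDist(mu, standard_error(sigma, n))
    return sdsm.cdf(high) - sdsm.cdf(low)


def middle_limits(mu, sigma, n, middle):
    """Limits that enclose the middle `middle` fraction of sample means."""
    sdsm = NormalDist(mu, standard_error(sigma, n))
    tail = (1 - middle) / 2
    return sdsm.inv_cdf(tail), sdsm.inv_cdf(1 - tail)


# Follow-the-steps example: mu = 100, sigma = 20, n = 16
print(round(prob_mean_between(100, 20, 16, 90, 110), 4))  # 0.9545

# Kindergarten heights: mu = 39, sigma = 2, n = 25
print(round(prob_mean_between(39, 2, 25, 38.5, 40), 4))   # 0.8881

# Cutoff example: middle 90% of sample means for n = 100
lower, upper = middle_limits(39, 2, 100, 0.90)
print(round(lower, 2), round(upper, 2))                   # 38.67 39.33
```

The cutoff calculation here uses the unrounded z-value (about 1.645) rather than the slide's rounded ±1.64; the limits agree to two decimal places.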