Chapter 9.1: Sampling Distributions

ACTIVITY 9: Young Women's Heights

1. Use N(64.5, 2.5) as the distribution.
2. On your calculator, clear L1/list 1.
3. Simulate the heights of 100 randomly selected women and store the heights in L1/list 1.
4. On the TI-83/84, press MATH, choose PRB, then 6:randNorm(μ, σ, n).
5. Complete the command: randNorm(64.5,2.5,100) ENTER
6. Plot a histogram of the 100 heights.
7. Clear any functions in Y=.
8. Turn off STAT PLOTS.
9. Set the window to X[57,72] with scale 2.5 and Y[-10,45] with scale 5.
10. Define PLOT 1 to be a histogram using the heights in list 1.
11. GRAPH.

Is your histogram fairly symmetric or clearly skewed?

Approximately how many heights should be within 3σ of the mean, i.e., within 64.5 ± 3(2.5)? What are these values? Use TRACE to count the number of heights within 3σ.

How many heights should there be within 1σ of the mean? Within 2σ of the mean? Find these counts.

Standard deviation    1σ    2σ    3σ
Counts                __    __    __

Use 1-Var Stats to find the mean, median and standard deviation of your data. Compare your mean to the population mean μ = 64.5. Compare the standard deviation of your data to the standard deviation of the population, σ = 2.5. How do the mean and median for your 100 heights compare? What does it tell you about the shape of a distribution when the mean is close to the median?

Define PLOT 2 to be a boxplot using list 1. Graph it. The boxplot should be plotted above the histogram. Would you say the distribution is nonsymmetric, moderately symmetric, or very symmetric?

POST-IT HISTOGRAM OF CLASS DATA

SUMMARY: What is the approximate shape of the distribution of the xbars? Where is the center of the distribution of xbar? How does this center compare with the mean height of the population of all young women? How does the spread of the distribution of xbar compare with the spread of the original distribution (σ = 2.5)?

Enter the xbar values in list 2. Turn off PLOT 1. Define PLOT 3 to be a boxplot of the xbar data. How do the distributions of X and xbar compare visually?
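For readers without a TI calculator, the activity so far can be approximated in Python. This is a minimal sketch, not part of the original activity: the seed, the helper names simulate_heights and count_within, and the class size of 30 are all assumptions chosen for illustration. It draws 100 heights from N(64.5, 2.5), counts how many fall within 1σ, 2σ and 3σ of the mean, and then imitates a class in which each student reports one xbar.

```python
import random
import statistics

MU, SIGMA, N = 64.5, 2.5, 100  # population N(64.5, 2.5) and sample size from the activity

def simulate_heights(rng, n=N):
    """Draw n heights from N(MU, SIGMA), playing the role of randNorm(64.5,2.5,100)."""
    return [rng.gauss(MU, SIGMA) for _ in range(n)]

def count_within(heights, k):
    """Count heights within k population standard deviations of the population mean."""
    return sum(1 for h in heights if abs(h - MU) <= k * SIGMA)

rng = random.Random(9)  # fixed seed (arbitrary) so the run is repeatable
heights = simulate_heights(rng)

for k in (1, 2, 3):
    print(f"within {k} sigma: {count_within(heights, k)} of {N}")

print("mean  :", round(statistics.mean(heights), 2))
print("median:", round(statistics.median(heights), 2))
print("stdev :", round(statistics.stdev(heights), 2))

# Hypothetical class of 30 students, each contributing one xbar from 100 heights.
CLASS_SIZE = 30
xbars = [statistics.mean(simulate_heights(rng)) for _ in range(CLASS_SIZE)]
print("mean of xbars :", round(statistics.mean(xbars), 2))
print("stdev of xbars:", round(statistics.stdev(xbars), 3))
```

The xbars should cluster near 64.5 with a spread much smaller than that of the individual heights, which is the pattern the class histogram is meant to reveal.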
Use 1-Var Stats to calculate the standard deviation of the xbars. Compare this number with σ/√100 = 2.5/10 = 0.25.

Fill in the blanks with the appropriate function of μ or σ: "The distribution of xbar is approximately normal with mean μ(xbar) = _______ and standard deviation σ(xbar) = _______."

Answer: "The distribution of xbar is approximately normal with mean μ(xbar) = μ and standard deviation σ(xbar) = σ/√n."

In this activity, xbar is a sample statistic while μ is the population parameter.

• Sample statistic: a number that describes a sample. Its value is known, but it can change from sample to sample.
• Population parameter: a fixed number that describes the population. We usually do not know it because we cannot examine the entire population.
• Sampling distribution of a statistic: the distribution of values taken by the statistic in all possible samples of the same size from the same population (like the distribution of the means each of you found in the previous activity).
• Bias: the sampling distribution lets us describe bias directly, rather than only by describing the sampling method.
• Bias concerns the center of the sampling distribution.
• A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
• Variability of a statistic is described by the spread of its sampling distribution. Larger samples give smaller spread. As long as the population is much larger than the sample, the spread of the sampling distribution is approximately the same for any population size.

Figure: sample proportions of viewers who watched Survivor II in samples of size n = 100.
Figure: sampling distribution of the sample proportion from SRSs of size 1000.
Figure: sampling distribution for samples of size 1000, redrawn with an expanded scale to better display the shape.
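The figures captioned here compare sample proportions from samples of size 100 and size 1000. A rough simulation can stand in for them. This is a sketch under stated assumptions: it uses p = 0.37, the population proportion the slides give for this example, treats each respondent as an independent draw (reasonable when the population is much larger than the sample), and the function names, seed and number of repetitions are hypothetical choices.

```python
import random
import statistics

P = 0.37  # population proportion used in the slides' example

def sample_proportion(rng, n, p=P):
    """Proportion of 'yes' responses in one simulated SRS of size n."""
    return sum(1 for _ in range(n) if rng.random() < p) / n

def simulate(rng, n, reps=1000):
    """Return reps simulated sample proportions for samples of size n."""
    return [sample_proportion(rng, n) for _ in range(reps)]

rng = random.Random(37)  # fixed seed (arbitrary) so the run is repeatable
results = {n: simulate(rng, n) for n in (100, 1000)}

for n, props in results.items():
    print(f"n={n:5d}  mean={statistics.mean(props):.3f}  sd={statistics.stdev(props):.4f}")
```

Both means should land close to 0.37 (no bias), while the standard deviation for n = 1000 should be clearly smaller than for n = 100 (less variability).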
Figure: approximate sampling distribution of the sample proportion for SRSs of size 100 with population p = 0.37.
Figure: approximate sampling distribution of the sample proportion for SRSs of size 1000 with population p = 0.37.

Both statistics are unbiased because the means of their distributions equal the true population value p = 0.37. The statistic from the larger sample is less variable.

The desired balance is low bias, low variability.

Describe these sampling distributions.
Figure: four panels labeled "high bias, high variability", "low bias, low variability", "low bias, high variability" and "high bias, low variability".

• Do problems 9.1, 9.2, 9.3, 9.4 and 9.5.

Chapter 9.2: Sample Proportions

• The objective of some statistical applications is to reach a conclusion about a population proportion, p. For example, we may try to estimate an approval rating through a survey, or test a claim about the proportion of defective light bulbs in a shipment based on a random sample. Since p is unknown to us, we must base our conclusion on a sample proportion, p-hat. However, the value of p-hat will vary from sample to sample. The amount of variability depends on the size of the sample.

The Sampling Distribution for Proportions: If we take repeated random samples of size n from a population, the sample proportion, p-hat, will have the following distribution and properties:
(1) p-hat is an unbiased estimate of p, since μ(p-hat) = p.
(2) The sampling variability is σ(p-hat) = √(p(1 − p)/n).
(3) The shape of the distribution of p-hat is approximately normal when np ≥ 10 and n(1 − p) ≥ 10.

Census example

Based on Census data, we know 11% of US adults are Black. Therefore, p = 0.11. We would expect a sample to contain roughly 11% Black representation. Suppose a sample of 1500 adults contains 138 Black individuals. Should we suspect "undercoverage" in the sampling method? Note: p-hat = 138/1500 = 0.092. Is this lower than what would be expected by chance?

If we expect a certain sample result (e.g., 0.11) and we don't get it, this could be due to sampling variability.
(Remember that in repeated samples from the same population, we will get different sample results.) It could also be caused by sampling from a different population than we thought. If the result is far from the expected value, then we think that something other than chance is operating and the result is statistically significant.

• We know it is possible for a sample to contain 9.2% Black representation, but is it likely that this would happen due to natural variation in random sampling?
• (1) Check assumptions: Is np ≥ 10? Is n(1 − p) ≥ 10? Here np = 1500(0.11) = 165 and n(1 − p) = 1500(0.89) = 1335, so both conditions hold.
• (2) Assume the sampling distribution of p-hat is approximately Normal.
• (3) Calculate the probability: P(p-hat ≤ 0.092) = P(z ≤ −2.23) ≈ 0.0129.
• (4) Interpret the results. What does it mean? Only about 1.29% of samples of size 1500 would have 9.2% or less Black representation. Since this is very unlikely, we have reason to suspect possible undercoverage in this sample.

Figure: the normal approximation to the sampling distribution of p-hat.

Do problems 9.12, 9.13 and 9.29 in the book.

Chapter 9.3: Sample Means

When the objective of a statistical application is to reach a conclusion about a population mean, μ, we must consider the sample mean, xbar. However, as we have noted several times, the value of xbar will vary from sample to sample. The amount of variability will depend on the size of our sample.

The Sampling Distribution for Means: If we take repeated random samples of size n from a population, the sample mean, xbar, will have the following distribution and properties:
(1) xbar is an unbiased estimator of μ, and the distribution of the sample means is approximately normal.
(2) The mean of the set of sample means is equal to the mean μ of the population.
(3) The standard deviation of xbar is approximately equal to σ/√n.
(4) xbar becomes less spread out as n increases, because its standard deviation shrinks by a factor of √n.
(5) Averages are less variable than individual observations.
(6) Averages are more normal than individual observations.
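The census calculation in steps (1) to (3) of the Census example can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and it relies on the same Normal approximation the slides assume.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.11, 1500          # population proportion and sample size from the example
p_hat = 138 / n            # observed sample proportion, 0.092

# Step 1: check the conditions for the Normal approximation.
conditions_ok = n * p >= 10 and n * (1 - p) >= 10

# Step 2: standard deviation of p-hat, sqrt(p(1 - p)/n).
sd = sqrt(p * (1 - p) / n)

# Step 3: z-score and lower-tail probability P(p-hat <= 0.092).
z = (p_hat - p) / sd
prob = NormalDist().cdf(z)

print(f"conditions ok: {conditions_ok}")
print(f"sd = {sd:.5f}, z = {z:.3f}, P = {prob:.4f}")
```

The probability comes out near 0.013, in line with the figure quoted in step (3).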
Are you smarter than a 5th grader?

• Let x represent the time it takes a 5th grader to complete a math problem. Suppose the mean and standard deviation are μ = 2 min and σ = 0.8 min, respectively.
• Let xbar be the sample average time for 9 students. Describe the sampling distribution of xbar.
• Answer: xbar has an approximately normal distribution with mean 2 min and standard deviation 0.8/√9 ≈ 0.267 min.

• Suppose we take an SRS of 20 students. Describe the sampling distribution of xbar. Use it to find the probability that xbar is greater than 2.5 min for the sample of 20 students.
• xbar is approximately normal with mean 2 minutes.
• Standard deviation = 0.8/√20 ≈ 0.1789.
• P(xbar > 2.5) = P(z > (2.5 − 2)/0.1789) = P(z > 2.80).
• P = 1 − 0.9974 = 0.0026, which is very small.

The Central Limit Theorem

• As the sample size increases, the distribution of xbar gets closer and closer to a normal distribution. This is true no matter what shape the population distribution has, as long as the population has a finite standard deviation σ. More observations are required if the shape of the population is far from normal.
• When n is large, the sampling distribution of the sample mean xbar is close to the normal distribution N(μ, σ/√n) with mean μ and standard deviation σ/√n.

To demonstrate the Central Limit Theorem:
• Activity: A Penny for Your Thoughts
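Both the 20-student calculation and the Central Limit Theorem can be explored with a short Python sketch. Part 1 repeats the arithmetic from the 5th-grader example. Part 2 is an assumption-laden illustration rather than the penny activity itself: it uses an exponential population with mean 1 (a right-skewed shape chosen here, not taken from the slides), a hypothetical skewness helper, and an arbitrary seed, and it tracks how the skewness of xbar shrinks as n grows.

```python
import random
import statistics
from math import sqrt
from statistics import NormalDist

# Part 1: the 20-student example, treating xbar as N(2, 0.8/sqrt(20)).
mu, sigma, n = 2.0, 0.8, 20
xbar_dist = NormalDist(mu, sigma / sqrt(n))
prob = 1 - xbar_dist.cdf(2.5)  # P(xbar > 2.5)
print(f"sd of xbar = {xbar_dist.stdev:.4f}, P(xbar > 2.5) = {prob:.4f}")

# Part 2: sample means from a right-skewed population (exponential, mean 1).
# The choice of population is an assumption made only for illustration.
def skewness(values):
    """Population skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(2024)  # fixed seed (arbitrary) so the run is repeatable
skew_by_n = {}
for size in (1, 5, 30):
    means = [statistics.fmean([rng.expovariate(1.0) for _ in range(size)])
             for _ in range(5000)]
    skew_by_n[size] = skewness(means)
    print(f"n={size:2d}  skewness of xbar = {skew_by_n[size]:.2f}")
```

A skewness near 0 indicates a symmetric, more normal-looking distribution, so the downward trend across n = 1, 5, 30 is the Central Limit Theorem at work.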