Sampling Distribution

WELCOME to INFERENTIAL STATISTICS

A Sampling Distribution

We are moving from descriptive statistics to inferential statistics. Inferential statistics allow the researcher to come to conclusions about a population on the basis of descriptive statistics about a sample.

For example: Your sample says that a candidate gets support from 47%. Inferential statistics allow you to say that the candidate gets support from 47% of the population with a margin of error of +/- 4%. This means that the support in the population is likely somewhere between 43% and 51%.

Margin of error is taken directly from a sampling distribution. It looks like this:

[Figure: sampling distribution centered on your sample mean of 47%, with 95% of possible sample means falling between 43% and 51%]

Let’s create a sampling distribution of means…

Take a sample of size 1,500 from the US. Record the mean income. Our census said the mean is $30K.

Take another sample of size 1,500 from the US. Record the mean income.

Let’s repeat sampling of size 1,500 from the US. Record the mean incomes.
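The repeated-sampling procedure can be sketched as a small simulation. This is only an illustration under stated assumptions: the population here is a made-up right-skewed (exponential) income distribution with a true mean of $30K, not real census data, and the names `draw_sample_mean` and `simulate_sample_means` are hypothetical helpers introduced for this sketch.

```python
import random
import statistics

# Hypothetical stand-in for the US income distribution: a right-skewed
# (exponential) population whose true mean is $30K. The shape is an
# assumption for illustration only.
POPULATION_MEAN = 30_000
SAMPLE_SIZE = 1_500
NUM_SAMPLES = 500


def draw_sample_mean(rng: random.Random) -> float:
    """Draw one sample of SAMPLE_SIZE incomes and return its mean."""
    incomes = [rng.expovariate(1 / POPULATION_MEAN) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(incomes)


def simulate_sample_means(seed: int = 0) -> list[float]:
    """Repeat the sampling NUM_SAMPLES times and collect the sample means."""
    rng = random.Random(seed)
    return [draw_sample_mean(rng) for _ in range(NUM_SAMPLES)]


sample_means = simulate_sample_means()
center = statistics.fmean(sample_means)   # where the sample means pile up
spread = statistics.stdev(sample_means)   # how far they scatter around it
print(f"mean of sample means: ${center:,.0f}")
print(f"sd of sample means:   ${spread:,.0f}")
```

Even though the individual incomes in this made-up population are strongly skewed, the recorded sample means should cluster tightly around $30K.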
Our census said the mean is $30K. The sample means would stack up in a normal curve centered at $30K: a normal sampling distribution.

Say that the standard deviation of this distribution is $10K. Think back to the empirical rule. What are the odds you would get a sample mean that is more than $20K off?

[Figure: normal sampling distribution centered at $30K, horizontal axis marked from -3z to 3z, with 2.5% marked in each tail]

Since $20K is two standard deviations, about 95% of sample means fall within $20K of $30K. That leaves about 2.5% in each tail, or about 5% in total.

Central Limit Theorem (CLT)

Central Limit Theorem: As sample size increases, the sampling distribution of sample means approaches a normal distribution with a mean the same as the population mean and a standard deviation equal to the standard deviation of the population divided by the square root of n (the sample size): N(μ, σ/√n), with mean μ and sd σ/√n.

Variability in Sampling Distribution

For example, if the variability is low, we can trust our number more than if the variability is high.

• An Example:
• A population’s car values are μ = $12K with σ = $4K.
• Which sampling distribution is for sample size 625 and which is for 2500? What are their s.e.’s (standard errors)?
[Figure: two sampling distributions of the sample mean M, both centered at $12K, each marking the range that holds 95% of M’s on an axis from -3 to 3 standard errors; the dollar values at the tick marks are left as question marks]

For the car-value example (μ = $12K, σ = $4K): √2500 = 50 and √625 = 25, so for n = 2500 the s.e. = $4K/50 = $80, and for n = 625 the s.e. = $4K/25 = $160. The smaller s.e. belongs to n = 2500, so its sampling distribution is the narrower one.

Which sample will be more precise? If you get a particularly bad sample, which sample size will help you be sure that you are closer to the true mean?

So we know, in advance of ever collecting a sample, that if sample size is sufficiently large:

• Repeated samples would pile up in a normal distribution
• The sample means will center on the true population mean
• The standard error will be a function of the population variability and sample size
• The larger the sample size, the more precise, or efficient, a particular sample is
• 95% of all sample means will fall within +/- 2 s.e. of the population mean

What proportion of US teens know that 1492 was the year in which Columbus “discovered” America? A Gallup Poll found that 210 out of a random sample of 501 American teens aged 13 to 17 knew this historically important date.

The sample proportion: \( \hat{p} = 210/501 \approx 0.42 \)

0.42 is the statistic that we use to gain information about the unknown population parameter p. We may say that about 42% of US teens know that Columbus discovered America in 1492.

Sampling distribution of sample proportion

\[ \hat{p} = \frac{\text{count of successes in sample}}{\text{size of the sample}} = \frac{X}{n} \]

The mean of the sampling distribution of \( \hat{p} \) is exactly p.

The standard deviation of the sampling distribution of \( \hat{p} \) is

\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]

Applying to college: normal calculation involving \( \hat{p} \)

A polling organization asks an SRS (simple random sample) of 1500 1st year college students whether they applied for admission to any other college. In fact 35% of all the 1st year students applied to colleges besides the one they are attending.
What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value?

n = 1500, p = 0.35

\[ \mu_{\hat{p}} = p = 0.35 \]

\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.35(1-0.35)}{1500}} \approx 0.0123 \]

Sampling Distribution

Jeremy, out of boredom, decided to find the probability of a male student in BHS being 72 inches tall. Mr. Delton told him that the average height of the 857 male students in BHS is 67 inches with a standard deviation of 3.5 inches. Show a statistical procedure to help Jeremy on his quest to get rid of his boredom.
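Both problems above come down to a normal calculation, which can be sketched in a few lines. Several things here are assumptions rather than part of the slides: the helper `normal_cdf` is a hypothetical name built on the standard library's `math.erf`; the college result is treated as normal with the mean and standard deviation already computed; and for Jeremy's question the heights are assumed to be roughly normal, with "72 inches tall" read as "at least 72 inches" (the probability of exactly 72 inches is zero for a continuous variable). Because Jeremy asks about a single student and not a sample mean, the sketch uses the standard deviation of 3.5 inches directly, not a standard error.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Applying to college: P(0.33 <= p-hat <= 0.37) with n = 1500, p = 0.35.
n, p = 1500, 0.35
sd_p_hat = math.sqrt(p * (1 - p) / n)   # about 0.0123, as in the text
z_college = 0.02 / sd_p_hat             # 2 percentage points in sd units
prob_within_2_points = normal_cdf(z_college) - normal_cdf(-z_college)

# Jeremy's question, under the assumptions named above:
# heights roughly normal, "72 inches tall" read as "at least 72 inches".
mean_height, sd_height = 67.0, 3.5
z_height = (72.0 - mean_height) / sd_height
prob_at_least_72 = 1.0 - normal_cdf(z_height)

print(f"sd of p-hat:        {sd_p_hat:.4f}")
print(f"P(within 2 points): {prob_within_2_points:.3f}")
print(f"z for 72 inches:    {z_height:.2f}")
print(f"P(height >= 72):    {prob_at_least_72:.3f}")
```

Under these assumptions the poll lands within 2 percentage points of the true 35% roughly nine times out of ten, and a height of 72 inches or more sits about 1.4 standard deviations above the mean, which happens a little under 8% of the time. A different reading of Jeremy's question, such as "between 71.5 and 72.5 inches", would change the second number.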