COURSE: JUST 3900 INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE
Test Review: Ch. 7-9
Peer Tutor Slides
Instructor: Mr. Ethan W. Cooper, Lead Tutor
© 2013. PLEASE DO NOT CITE, QUOTE, OR REPRODUCE WITHOUT THE WRITTEN PERMISSION OF THE AUTHOR. FOR PERMISSION OR QUESTIONS, PLEASE EMAIL MR. COOPER AT THE FOLLOWING: [email protected]

Chapter 7: Key Concepts
This chapter is on the distribution of sample means.
Sampling error is the natural discrepancy, or difference, between a sample statistic and its corresponding population parameter.
The distribution of sample means, or sampling distribution, is the collection of sample means for all of the possible random samples of a particular size (n) that can be obtained from a population.

Chapter 7: Key Concepts
The mean of a sampling distribution, or expected value of M, is always equal to the population mean (µ).
The standard error of M (σM) is the standard deviation of the distribution of sample means. It provides a measure of how much distance is expected on average between a sample mean (M) and the population mean (µ).
Standard Error of M: $\sigma_M = \frac{\sigma}{\sqrt{n}}$

Chapter 7: Key Concepts
The central limit theorem states that for any population with mean µ and standard deviation σ, the distribution of sample means for sample size n will have a mean of µ and a standard deviation of $\sigma_M = \frac{\sigma}{\sqrt{n}}$, and will approach a normal distribution as n approaches infinity.
In other words, a sampling distribution has the following characteristics:
A mean of µ. (Expected value of M)
A standard deviation of $\sigma_M = \frac{\sigma}{\sqrt{n}}$. (Standard error of M)
Is normal in shape if n is at least 30, or if the population distribution is normal in shape.

Chapter 7: Practice
Question 1: A population has a mean of µ = 100 and a standard deviation of σ = 15.
a) For samples of size n = 5, what is the expected value and the average difference between M and µ for the distribution of sample means?
b) If the population distribution is not normal, describe the shape of the distribution of sample means based on n = 5.
c) For samples of size n = 45, what is the expected value and the average difference between M and µ for the distribution of sample means?
d) If the population distribution is not normal, describe the shape of the distribution of sample means based on n = 45.

Chapter 7: Practice
Question 1 Answers:
a) Expected Value of M: µ = 100
   Standard Error of M: $\sigma_M = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{5}} = 6.71$
b) The distribution of sample means does not satisfy either criterion for normality, so it would not be a normal distribution. If the population distribution were normal, or if the sample size were at least 30, the sampling distribution would be normal.

Chapter 7: Practice
Question 1 Answers:
c) Expected Value of M: µ = 100
   Standard Error of M: $\sigma_M = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{45}} = 2.24$
   Notice how the expected value stays the same regardless of sample size, while the standard error decreases as sample size increases.
d) Because the sample size is greater than 30, the distribution of sample means is a normal distribution.

Chapter 7: Key Concepts
Notice the relationship between sample size n and standard error σM. As n increases, σM decreases.

Chapter 7: Key Concepts
Again, as sample size increases, standard error decreases.

Chapter 7: Key Concepts
Standard Deviation (σ)    Standard Error (n = 100)
1                         $\sigma_M = \frac{1}{\sqrt{100}} = 0.10$
5                         $\sigma_M = \frac{5}{\sqrt{100}} = 0.50$
10                        $\sigma_M = \frac{10}{\sqrt{100}} = 1.00$
15                        $\sigma_M = \frac{15}{\sqrt{100}} = 1.50$
20                        $\sigma_M = \frac{20}{\sqrt{100}} = 2.00$
As standard deviation increases, so does standard error of M.

Chapter 7: Key Concepts
The relationship between sample size and standard error is an important concept. Understanding this relationship can save you valuable time on the test.

Chapter 7: Practice
Question 2: If we have a sampling distribution with a standard error of σM = 5 and sample size of n = 15, what would the standard error be if we increased our sample size to n = 30?
a) $\sigma_M = 7.12$
b) $\sigma_M = 6.57$
c) $\sigma_M = 3.54$
d) $\sigma_M = 5.64$

Chapter 7: Practice
Question 2 Answer: c) $\sigma_M = 3.54$
$\sigma_M = \frac{\sigma}{\sqrt{n}}$
$5 = \frac{\sigma}{\sqrt{15}}$
$\sigma = 5\sqrt{15} = 19.36$
$\sigma_M = \frac{\sigma}{\sqrt{n}} = \frac{19.36}{\sqrt{30}} = 3.54$
There are two ways to arrive at this answer:
1) Find the standard deviation, then recalculate the standard error for n = 30, or
2) Eliminate all of the other answer choices based on our rule: as sample size increases, standard error decreases. In this case, answer choice c) is the only answer that is less than 5.

Chapter 7: Sampling Distributions and Probability
Very similar to what we did in chapter 6, we can calculate the probability of obtaining a particular sample mean for a sample of size n by calculating the z-score and looking up the probability in the unit normal table.
$z = \frac{M - \mu}{\sigma_M}$

Chapter 7: Practice
Question 3: What is the probability of obtaining a sample mean greater than M = 60 for a random sample of n = 16 scores selected from a normal population with a mean of µ = 65 and a standard deviation of σ = 20? What if we changed our sample size to n = 5?

Chapter 7: Practice
Question 3 Answer (n = 16):
Find σM.
$\sigma_M = \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{16}} = \frac{20}{4} = 5.00$
Find the z-score.
$z = \frac{M - \mu}{\sigma_M} = \frac{60 - 65}{5} = \frac{-5}{5} = -1.00$
Look up z = -1.00 in the unit normal table. (Column B)
p(z > -1.00) = 0.8413 (or 84.13%)

Chapter 7: Practice
Question 3 Answer (n = 5):
Find σM.
$\sigma_M = \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{5}} = \frac{20}{2.24} = 8.94$
Find the z-score.
$z = \frac{M - \mu}{\sigma_M} = \frac{60 - 65}{8.94} = \frac{-5}{8.94} = -0.56$
Look up z = -0.56 in the unit normal table. (Column B)
p(z > -0.56) = 0.7123 (or 71.23%)

Chapter 8: Key Concepts
Chapter 8 covers hypothesis testing. Hypothesis tests allow us to make generalizations from samples to populations about whether our treatment has an effect.
When there appears to be a treatment effect, one of two things has happened:
The discrepancy between µ and M is the result of sampling error, or
Our treatment had an effect.

Chapter 8: Key Concepts
There are 4 steps to hypothesis testing:
1) State the hypothesis.
   We state both the null hypothesis H0, which states that there is no change, and the alternative hypothesis H1, which states that there is a change.
2) Set criteria for a decision.
   Next, we divide our distribution up into two sections:
      1) Sample means close to the null hypothesis.
      2) Sample means very different from the null hypothesis.
   We use the alpha level, or level of significance, to mark the critical region, which is composed of sample values that are very unlikely to be obtained if the null is true.
   These boundaries are generally set at α = 0.05, α = 0.01, or α = 0.001.

Chapter 8: Key Concepts
Critical region for 2-tail test at α = 0.05.
α Level    z-Score
.05        +/- 1.96
.01        +/- 2.58
.001       +/- 3.30

Chapter 8: Key Concepts
Critical region for 1-tail test at α = 0.05.
α Level    z-Score (+ or -, depending on the predicted direction)
.05        1.65
.01        2.33
.001       3.10

Chapter 8: Key Concepts
There are 4 steps to hypothesis testing:
3) Compute the sample statistic.
   $z = \frac{M - \mu}{\sigma_M} = \frac{\text{sample mean} - \text{hypothesized population mean}}{\text{standard error between } M \text{ and } \mu}$
4) Make a decision.
   Does our sample statistic fall in the critical region?
   If yes, reject the null. The treatment has an effect.
   If no, fail to reject the null. The data do not provide sufficient evidence that the treatment has an effect.

Chapter 8: Practice
Question 1: What combination of factors will most likely lead to rejecting the null hypothesis?
a) n = 30; α = 0.05
b) n = 5; α = 0.05
c) n = 30; α = 0.01
d) n = 5; α = 0.01

Chapter 8: Practice
Question 1 Answer: a) n = 30; α = 0.05
A larger sample size leads to a smaller standard error, which in turn gives us a larger z-score, increasing the likelihood that it will fall in the critical region.
The larger alpha level increases the size of the critical region, making it more likely that our z-score will fall in this region.

Chapter 8: Practice
Question 2: On average, what would we expect our z-score to equal if the null hypothesis is true?
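The standard error from Chapter 7 and the four hypothesis-testing steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the function names `standard_error` and `z_test` are hypothetical, the critical value is computed from the standard normal distribution rather than read from the unit normal table, and the one-tailed branch assumes the treatment is expected to increase the mean.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of M: sigma_M = sigma / sqrt(n)."""
    return sigma / sqrt(n)


def z_test(sample_mean, mu, sigma, n, alpha=0.05, two_tailed=True):
    """Steps 2-4 of the z-test; returns (z, critical z, reject H0?)."""
    # Step 2: locate the critical region boundary for the chosen alpha.
    tail_area = alpha / 2 if two_tailed else alpha
    z_critical = NormalDist().inv_cdf(1 - tail_area)
    # Step 3: compute the sample statistic.
    z = (sample_mean - mu) / standard_error(sigma, n)
    # Step 4: decide. The one-tailed case assumes an upper-tail prediction.
    reject = abs(z) >= z_critical if two_tailed else z >= z_critical
    return z, z_critical, reject


# Numbers from Chapter 7, Practice Question 3 (n = 16).
z, z_critical, reject = z_test(sample_mean=60, mu=65, sigma=20, n=16)
print(round(z, 2), round(z_critical, 2), reject)
# Probability of a sample mean greater than M = 60.
print(round(1 - NormalDist().cdf(z), 4))
```

With these inputs the z-score is -1.00, which lies inside the ±1.96 boundaries, so a two-tailed test at α = 0.05 would fail to reject the null.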
Chapter 8: Practice
Question 2 Answer: z = 0, indicating that there was no change in µ and the treatment had no effect.
Remember that our null hypothesis states that µ doesn't change, and the µ for the unit normal table (as with all z-distributions) is 0.

Chapter 8: Practice
Question 3: State the null and alternative hypotheses for a one-tailed test with µ = 50. We expect our treatment to have a positive effect.
a) H0: µ = 50; H1: µ ≠ 50
b) H0: µ = 0; H1: µ ≠ 0
c) H0: µ > 50; H1: µ ≤ 50
d) H0: µ ≤ 50; H1: µ > 50

Chapter 8: Practice
Question 3 Answer: d) H0: µ ≤ 50; H1: µ > 50
I often find it less confusing to state the alternative hypothesis first in a one-tailed test. We know that if there's an effect, µ will get larger: µ > 50. However, if µ doesn't get larger, it will either stay the same or get smaller: µ ≤ 50.

Chapter 8: Key Concepts
Occasionally, hypothesis tests can lead to error. There are two types of error in hypothesis testing: Type I and Type II.
Type I error occurs when a researcher rejects a null hypothesis that is actually true. In these cases, the sample statistic falls in the critical region, not because of a treatment effect, but as a result of sampling error.
Type I error is also known as a false positive. The probability of a Type I error is equal to the alpha level, or level of significance.

Chapter 8: Key Concepts
Type II error, on the other hand, occurs when a researcher fails to reject a false null hypothesis. This means that a treatment effect really exists, but the hypothesis test fails to detect it.
This is often the case with very small treatment effects.

Chapter 8: Key Concepts
                     H0 is true                  H0 is false
Reject H0            Type I error (α)            Correct decision (1 - β)
Fail to reject H0    Correct decision (1 - α)    Type II error (β)

Chapter 8: Practice
Question 4: Which of the following will decrease the risk of a type I error?
a) Increasing the sample size (n)
b) Decreasing the alpha from α = 0.05 to α = 0.01
c) Moving from a one- to a two-tailed test
d) All of the above
e) None of the above
f) Some of the above

Chapter 8: Practice
Question 4 Answer: d) All of the above
Increasing n reduces σM, making it less likely that the effect is due to chance.
Decreasing alpha makes it more difficult to get into the critical region, reducing the chance of getting a type I error. Here, we reduced the probability of a type I error from 5% to 1%.
Just like decreasing alpha, moving from a one- to a two-tailed test pushes the critical boundary farther out (at α = 0.05, from 1.65 to +/- 1.96), making it harder for a sample mean to reach the critical region.

Chapter 8: Practice
Question 5: When are type II errors likely to occur?

Chapter 8: Practice
Question 5 Answer: When the treatment effect is very small.
Remember: Type II errors occur when we fail to reject a false null hypothesis. In other words, our treatment has an effect, but for some reason our z-score did not fall in the critical region.

Chapter 8: Key Concepts
However, hypothesis testing doesn't tell the whole story. It tells us whether there is a significant treatment effect, but not the size of the effect.
To find effect size, we calculate Cohen's d.
$\text{Cohen's } d = \frac{\text{mean difference}}{\text{standard deviation}} = \frac{M_{\text{treatment}} - \mu_{\text{no treatment}}}{\sigma}$
Effect sizes are summarized in the following chart. (Chart not reproduced here.)
Cohen's d is not influenced by sample size.

Chapter 8: Practice
Question 6: A researcher selects a sample from a population with µ = 45 and σ = 8. A treatment is administered to the sample and, after treatment, the sample mean is found to be M = 47. What is the size of the treatment effect?

Chapter 8: Practice
Question 6 Answer:
$\text{Cohen's } d = \frac{M_{\text{treatment}} - \mu_{\text{no treatment}}}{\sigma} = \frac{47 - 45}{8} = \frac{2}{8} = 0.25$
This is a small effect.

Chapter 8: Key Concepts
Power refers to the probability that a hypothesis test will correctly reject a false null hypothesis. That is, power is the likelihood that the test will identify a treatment effect if one really exists.
Like hypothesis testing, calculating power is a 4-step process:
1) Calculate standard error.
   $\sigma_M = \frac{\sigma}{\sqrt{n}}$
2) Locate the boundary of the critical region.
   $M = \mu + (z_{\text{critical}} \times \sigma_M)$

Chapter 8: Key Concepts
3) Calculate the z-score for the difference between the sample mean at the critical region boundary (M, from step 2) and the population mean with the treatment effect (µ).
   $z = \frac{M - \mu}{\sigma_M}$
4) Interpret power of the hypothesis test.
   Find the probability associated with your z-score.

Chapter 8: Practice
Question 7: What is the power of a hypothesis test if the probability of a type II error is β = 0.6046?

Chapter 8: Practice
Question 7 Answer: Power: 1 − β = 1 − 0.6046 = 0.3954 or 39.54%

Chapter 8: Practice
Question 8: What 3 factors increase power?

Chapter 8: Practice
Question 8 Answer:
1) Sample size is increased.
2) Alpha is increased (e.g., from .01 to .05).
3) You go from a 2- to a 1-tail test.

Chapter 9: Practice
Question 1: Which of the following is a fundamental difference between the t statistic and a z-score?
a) The t statistic uses the sample mean in place of the population mean.
b) The t statistic uses the sample variance in place of the population variance.
c) The t statistic computes the standard error by dividing the standard deviation by n-1 instead of dividing by n.
d) All of the above are differences between z and t.

Chapter 9: Practice
Question 1 Answer: b) The t statistic uses the sample variance in place of the population variance.
Without the population standard deviation or variance, we cannot calculate a z-score.

Chapter 9: Practice
Question 2: A sample of n = 25 is selected from a population with a mean of µ = 50. A treatment is administered to the individuals in the sample and, after treatment, the sample has a mean of M = 56 and a standard deviation of s = 5. Is the sample sufficient to conclude that the treatment has a significant effect? (Use a two-tailed test with α = 0.05)

Chapter 9: Practice
Question 2 Answer:
Step 1: State hypotheses
H0: Treatment has no effect. (µ = 50)
H1: Treatment has an effect. (µ ≠ 50)

Chapter 9: Practice
Step 2: Set Criteria for Decision (α = 0.05)
df = n − 1 = 24
t Critical: ± 2.064

Chapter 9: Practice
[Figure: t distribution with df = 24 and α = 0.05; critical regions beyond t = −2.064 and t = +2.064.]

Chapter 9: Practice
Step 3: Compute sample statistic
a) $s_M = \frac{s}{\sqrt{n}} = \frac{5}{\sqrt{25}} = \frac{5}{5} = 1.00$
b) $t = \frac{M - \mu}{s_M} = \frac{56 - 50}{1} = \frac{6}{1} = 6.00$

Chapter 9: Practice
[Figure: the same t distribution (df = 24, α = 0.05), with the sample statistic t = 6.00 falling beyond the upper critical boundary of t = +2.064.]

Chapter 9: Practice
Step 4: Make a decision
For a Two-tailed Test:
If -2.064 < tsample < 2.064, fail to reject H0
If tsample ≤ -2.064 or tsample ≥ 2.064, reject H0
tsample (6.00) > tcritical (2.064)
Thus, we reject the null and conclude that the treatment has an effect.

Chapter 9: Practice
Question 3: A sample of n = 16 is selected from a population. A treatment is administered to the sample and, after treatment, the sample mean is found to be M = 86 with a standard deviation of s = 8. A confidence interval is constructed and the interval spans μlower = 81.738 and μupper = 90.262. How confident are we that our µ falls within this interval?

Chapter 9: Practice
Question 3 Answer:
1) Find the corresponding t-statistics for μupper and μlower.
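$s_M = \frac{s}{\sqrt{n}} = \frac{8}{\sqrt{16}} = \frac{8}{4} = 2.00$
$t_{\text{lower}} = \frac{\mu_{\text{lower}} - M}{s_M} = \frac{81.738 - 86}{2} = \frac{-4.262}{2} = -2.131$
$t_{\text{upper}} = \frac{\mu_{\text{upper}} - M}{s_M} = \frac{90.262 - 86}{2} = \frac{4.262}{2} = 2.131$

Chapter 9: Practice
Question 3 Answer:
[Figure: t distribution with df = 15; unknown percentage in the middle, between t = −2.131 and t = +2.131, and unknown percentages in the lower and upper tails.]

Chapter 9: Practice
Question 3 Answer:
2) Use the t-distribution table to find the alpha level (the combined percentage in both tails). For df = 15, t = ± 2.131 corresponds to a two-tailed α = 0.05, so:
100% - 5% = 95%

Chapter 9: Practice
Question 3 Answer:
[Figure: t distribution with df = 15; the middle 95% lies between t = −2.131 and t = +2.131, with 2.5% in the lower tail and 2.5% in the upper tail.]

Chapter 9: Practice
Question 3 Answer:
We are 95% confident that the true population mean is located within this interval: between μ = 81.738 and μ = 90.262.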
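
The Chapter 9 calculations can be checked with a short Python sketch. It is offered as an illustration under stated assumptions, not as part of the original slides: the function names are hypothetical, and the critical t values (2.064 for df = 24, 2.131 for df = 15) are passed in by hand from a t table because the Python standard library does not provide a t distribution.

```python
from math import sqrt


def estimated_standard_error(s, n):
    """Estimated standard error: s_M = s / sqrt(n), s = sample standard deviation."""
    return s / sqrt(n)


def t_statistic(sample_mean, mu, s, n):
    """t = (M - mu) / s_M."""
    return (sample_mean - mu) / estimated_standard_error(s, n)


def confidence_interval(sample_mean, s, n, t_critical):
    """Interval M +/- t_critical * s_M; t_critical comes from a t table (assumed input)."""
    margin = t_critical * estimated_standard_error(s, n)
    return sample_mean - margin, sample_mean + margin


# Practice Question 2: n = 25, M = 56, s = 5, mu = 50, critical t = 2.064 (df = 24).
t = t_statistic(sample_mean=56, mu=50, s=5, n=25)
reject_null = abs(t) >= 2.064
print(t, reject_null)

# Practice Question 3: n = 16, M = 86, s = 8, critical t = 2.131 (df = 15).
lower, upper = confidence_interval(sample_mean=86, s=8, n=16, t_critical=2.131)
print(round(lower, 3), round(upper, 3))
```

Passing a different critical t (for example, the df = 15 value for 99% confidence from a t table) would widen the interval, which is the usual trade-off between confidence and precision.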