Assumptions and Conditions
• Independence Assumption (each condition must be checked for both groups):
– Randomization Condition: Were the data collected with suitable randomization (representative random samples or a randomized experiment)?
– 10% Condition: We don't usually check this condition for differences of means. Check it only if the population is very small or the sample is extremely large.
• Normal Population Assumption:
– Nearly Normal Condition: This must be checked for both groups. A violation by either one violates the condition.
• Independent Groups Assumption: The two groups we are comparing must be independent of each other. Groups that are not independent are discussed later in the lesson.

Testing the Difference Between Two Means
• The hypothesis test we use is the two-sample t-test for means.
• The conditions for the two-sample t-test for the difference between the means of two independent groups are the same as for the two-sample t-interval.
• We test the hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$, where the hypothesized difference, $\Delta_0$, is almost always 0, using the statistic

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

• When the conditions are met and the null hypothesis is true, this statistic can be closely modeled by a Student's t-model with a number of degrees of freedom given by a special formula. We use that model to obtain a P-value.

Example: The Better Cookie Company claims its chocolate chip cookies have more chips than another chocolate chip cookie. 120 Better Cookies and 100 of the other type of cookie were randomly selected and the number of chips in each cookie was recorded. The results are as follows.

                        Better   Another
Mean number of chips     7.6      6.9
Standard deviation       1.4      1.7

At the 2% level of significance, test the claim that the population of Better Cookies has a higher mean number of chips.

1. Hypothesis
$\mu_1$ = population mean number of chips from Better Cookie Co.
$\mu_2$ = population mean number of chips from another cookie company
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 > \mu_2$

2. Check Assumptions/Conditions
• SRS is stated
• $\sigma$ is unknown, use t-distribution
• Populations are independent
• We will assume an approximately normal distribution

3. Calculate Test

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(7.6 - 6.9) - 0}{\sqrt{\dfrac{1.4^2}{120} + \dfrac{1.7^2}{100}}} = 3.29$$

$$P(\bar{x}_1 - \bar{x}_2 \ge 0.7 \mid \mu_1 - \mu_2 = 0) = 0.0006$$

4. Conclusion
Since the P-value (0.0006) is less than $\alpha = 0.02$, we reject $H_0$. There is sufficient evidence that the population mean number of chips is higher for Better Cookies than for the other company's cookies.
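The cookie calculation can be checked with a short script. This is a minimal sketch, not the method used on the slides: it assumes the "special formula" for the degrees of freedom is the Welch-Satterthwaite approximation, and it obtains the tail probability by numerically integrating the t density with standard-library functions only. The helper names `t_pdf`, `t_upper_tail`, and `two_sample_t` are illustrative, not part of any library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (df may be fractional)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3              # P(0 <= T <= |t|)
    tail = 0.5 - area                 # P(T >= |t|)
    return tail if t >= 0 else 1.0 - tail

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, delta0=0.0):
    """Two-sample t statistic and (assumed) Welch-Satterthwaite df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2 - delta0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics from the cookie example: Better vs. the other brand
t_stat, df = two_sample_t(7.6, 1.4, 120, 6.9, 1.7, 100)
p_value = t_upper_tail(t_stat, df)    # one-sided, because Ha is mu1 > mu2

print(f"t = {t_stat:.2f}, df = {df:.1f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < 0.02 else "fail to reject H0")
```

Because the alternative hypothesis is one-sided, only the upper tail is used. With samples this large the t-model is close to the Normal model, so the exact choice of degrees-of-freedom formula has little effect on the P-value here.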
Paired Data
• Data are paired when the observations are collected in pairs or the observations in one group are naturally related to observations in the other group.
• Paired data arise in a number of ways. Perhaps the most common is to compare subjects with themselves before and after a treatment.
– When pairs arise from an experiment, the pairing is a type of blocking.
– When they arise from an observational study, it is a form of matching.
• If you know the data are paired, you can (and must!) take advantage of it.
– To decide if the data are paired, consider how they were collected and what they mean (check the W's).
– There is no test to determine whether the data are paired.
• Once we know the data are paired, we can examine the pairwise differences.
– Because it is the differences we care about, we treat them as if they were the data and ignore the original two sets of data.
• Now that we have only one set of data to consider, we can return to the simple one-sample t-test.
• Mechanically, a paired t-test is just a one-sample t-test for the mean of the pairwise differences.
– The sample size is the number of pairs.

Assumptions and Conditions
• Paired Data Assumption: The data must be paired.
• Independence Assumption: The differences must be independent of each other. Check the:
– Randomization Condition
• Normal Population Assumption: We need to assume that the population of differences follows a Normal model.
– Nearly Normal Condition: Check this with a histogram or Normal probability plot of the differences.

The Paired t-Test
• When the conditions are met, we are ready to test whether the paired differences differ significantly from zero.
• We test the hypothesis $H_0: \mu_d = \Delta_0$, where the $d$'s are the pairwise differences and $\Delta_0$ is almost always 0.
• We use the statistic

$$t = \frac{\bar{x}_d - \Delta_0}{s_d / \sqrt{n}}$$

where $n$ is the number of pairs.
• When the conditions are met and the null hypothesis is true, this statistic follows a Student's t-model on $n - 1$ degrees of freedom, so we can use that model to obtain a P-value.

Example: A test of abstract reasoning is given to a random sample of students before and after they completed a formal logic course. The results are given below. Do the data suggest that the mean score after the course differs from the mean score before the course? Perform a test at the 5% significance level.

Before           74  83  75  88  84  63  93  84  91  77
After            73  77  70  77  74  67  95  83  84  75
Before – After    1   6   5  11  10  -4  -2   1   7   2

1. Hypothesis
$\mu_d$ = population mean test score difference before and after completing the logic course
$H_0: \mu_d = 0$
$H_a: \mu_d \ne 0$

2. Check Assumptions/Conditions
• SRS is stated
• $\sigma$ is unknown, use t-distribution
• Populations are dependent
• Based on the linearity of the normal probability plot, we have an approximately normal distribution.

3. Calculate Test

$$t = \frac{\bar{x}_d - 0}{s_d / \sqrt{n}} = \frac{3.7 - 0}{4.945 / \sqrt{10}} = 2.366$$

$$\text{P-value} = 2 \cdot P(\bar{x}_d \ge 3.7 \mid \mu_d = 0) = 2(0.021) = 0.042$$

4. Conclusion
Since the P-value (0.042) is less than $\alpha = 0.05$, we reject $H_0$. There is sufficient evidence that the population mean score after the logic course differs from the mean score before it.
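The paired example can be reproduced the same way. The sketch below treats the paired t-test as a one-sample t-test on the Before – After differences, as described above, and doubles the upper-tail probability because the alternative is two-sided. It uses only the standard library. The helpers `t_pdf` and `t_upper_tail` are illustrative names for a simple numerical integration of the t density, not functions from any statistics package.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t]; steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Scores from the abstract reasoning example
before = [74, 83, 75, 88, 84, 63, 93, 84, 91, 77]
after = [73, 77, 70, 77, 74, 67, 95, 83, 84, 75]

# Work only with the pairwise differences (Before - After)
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)                        # sample size is the number of pairs
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)        # sample standard deviation (n - 1 divisor)

t_stat = (mean_d - 0) / (sd_d / math.sqrt(n))
p_value = 2 * t_upper_tail(abs(t_stat), n - 1)   # two-sided, df = n - 1

print(f"mean diff = {mean_d:.1f}, sd = {sd_d:.3f}, t = {t_stat:.3f}")
print(f"two-sided P-value = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Running a two-sample test on the Before and After columns instead would ignore the pairing and use the wrong standard error, which is why the differences are computed first.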