The χ²-test

When a problem involves multiple categories, we use the χ²-test.

Introduction
• We know that when a problem involves only two categories, the z-test is appropriate.
• For instance, in the ESP experiment, we classified each guess only as correct or incorrect.
• Here, “correct” and “incorrect” are the two categories in the problem. We modeled it with a 0-1 box, where 1 represented “correct” and 0 represented “incorrect”. The sum of the 1’s then counts the number of correct guesses.
• However, when a problem involves more than two categories, we have to use the χ²-test.
• For instance, to see whether a die is fair, there are 6 categories. The χ²-test helps to check whether these categories are equally likely.

Example
• A gambler is accused of using a loaded die, but he pleads innocent.
• A record has been kept of the last 60 throws. With a fair die, each face is expected 60 × 1/6 = 10 times:

  Face   Observed frequency   Expected frequency
  1      4                    10
  2      6                    10
  3      17                   10
  4      16                   10
  5      8                    10
  6      9                    10
  Sum    60                   60

• Suppose we focus on only one line of the table, say, the number of 3’s.
• The SE for this number is √(60 × 1/6 × 5/6) ≈ 2.9. The observed number, 17, is then (17 − 10)/2.9 ≈ 2.4 SEs above the expected number.
• But each line has its own difference, and with six lines in the table, a difference this large in one of them could happen even if the die is fair.
• The idea is to combine all these differences into one overall measure of the distance between the observed and expected values.
• The χ²-statistic squares each difference, divides it by the corresponding expected frequency, and takes the sum.
• Here is the formula:

$$\chi^2 = \sum \frac{(\text{observed frequency} - \text{expected frequency})^2}{\text{expected frequency}}$$

• From the formula we see that when the observed frequency is far from the expected frequency, the corresponding term in the sum is large; when the two are close, the term is small.
• So χ² measures the distance between the observed and expected values.
• In our example, there is one term for each line in the table. With the data, the χ²-statistic is:

$$\chi^2 = \frac{(4-10)^2}{10} + \frac{(6-10)^2}{10} + \frac{(17-10)^2}{10} + \frac{(16-10)^2}{10} + \frac{(8-10)^2}{10} + \frac{(9-10)^2}{10} = \frac{142}{10} = 14.2$$
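The arithmetic above can be checked with a short script. This is a minimal sketch in Python: the function name `chi_square_statistic` and the variable names are illustrative choices, not something from the slides. The observed counts are the ones that appear in the χ² computation above.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected, one term per category."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed counts for faces 1..6 over the 60 recorded throws.
observed_die = [4, 6, 17, 16, 8, 9]

# Under the null hypothesis (fair die), each face is expected 60 * 1/6 = 10 times.
n_throws = sum(observed_die)
expected_die = [n_throws / 6] * 6

chi2_die = chi_square_statistic(observed_die, expected_die)
print(f"chi-square statistic: {chi2_die:.1f}")
```

Each term of the sum corresponds to one line of the table, so a category that is far from its expected count (such as the 17 threes) contributes most of the total.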
Example
• Given the χ²-statistic, we need a curve to approximate the probability, that is, the P-value.
• There is not just one χ²-curve but a family of curves: one for each number of degrees of freedom.
• To figure out the degrees of freedom, we use the following formula:
• Degrees of freedom = number of terms in χ² − 1.
• The P-value is the area to the right of the χ²-statistic under the χ²-curve. From the formula for χ², larger values mean the observed frequencies are farther from the expected ones, so the values to the right of the observed χ² are the ones more extreme than what was observed.
• In our example, there are 6 − 1 = 5 degrees of freedom.
• To find the area, one needs to read the χ² table.
• For instance, look at the column for 1% and the row for 5 degrees of freedom.
• It reads 15.09, meaning that the area to the right of 15.09 under the curve for 5 degrees of freedom is about 1%.
• In our example, χ² = 14.2 is slightly less than 15.09, so the P-value is just a bit more than 1%.

χ² approximation
• Using probability theory, we can compute the exact probability of each value of the χ²-statistic.
• The graph shows that the resulting probability histogram indeed follows the χ²-curve.

Difference between z and χ²
• If it matters how many tickets of each kind are in the box, we use the χ²-test (the problem involves multiple categories and we know the contents of the box).
• For instance: testing a fair die, testing a fair coin, etc.
• If the question is about the average or the sum of the box, we use the z-test (this includes counting with two categories, where we only know the average or the sum).
• For instance: testing a fair coin, testing an average of heights, etc.

Another Example
• The χ²-test can also be used to test for independence:
• Are handedness and sex independent?
• Take people aged 25-34 in the U.S. The question is whether the distribution of “handedness” (right-handed, left-handed, ambidextrous) among the men in this population differs from the distribution among the women.
• Here is a probability sample of 2,237 Americans aged 25-34:

                 Men     Women
  Right-handed   934     1,070
  Left-handed    113     92
  Ambidextrous   20      8
  Total          1,067   1,170

• To compare the distribution of handedness for men and women, it is better to look at the data in percentages:

                 Men      Women
  Right-handed   87.5%    91.5%
  Left-handed    10.6%    7.9%
  Ambidextrous   1.9%     0.7%

• To make a χ²-test of the null hypothesis, we have to compare the observed frequencies with the expected frequencies (based on the null).
• Figuring out the expected frequencies takes some effort here:
• The null hypothesis says that handedness and sex are independent, so any difference between the two distributions in the sample is due to chance.
• So the expected percentage of each type of handedness must be the same for each sex, and equal to the percentage of that type of handedness in the whole population. This percentage can be estimated from the sample.
• For instance, the percentage of right-handers is 2,004/2,237 × 100% ≈ 89.6%. Then the expected frequency of right-handed men is 89.6% × 1,067 ≈ 956.
• Therefore, we obtain the following table of observed and expected frequencies:

                 Observed           Expected
                 Men     Women      Men     Women
  Right-handed   934     1,070      956     1,048
  Left-handed    113     92         98      107
  Ambidextrous   20      8          13      15

• There are 3 × 2 = 6 cells, and the χ²-statistic can be computed as:

$$\chi^2 = \frac{(934-956)^2}{956} + \frac{(1{,}070-1{,}048)^2}{1{,}048} + \frac{(113-98)^2}{98} + \frac{(92-107)^2}{107} + \frac{(20-13)^2}{13} + \frac{(8-15)^2}{15} \approx 12$$

• The formula for calculating the degrees of freedom has to be changed:
• When testing independence in an m × n table:
• Degrees of freedom = (m − 1) × (n − 1).
• So in our example, there are only (3 − 1) × (2 − 1) = 2 degrees of freedom.
• From the χ² table, we read off the P-value.
• The P-value is only about 0.2%, which is less than 1%. The result is highly significant.
• So the null hypothesis should be rejected. The observed difference in the sample seems to reflect a real difference in the population, rather than chance variation.
• Therefore, we have the conclusion:
• There is strong evidence that the distribution of handedness among the men in the population is different from the distribution among the women.
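The independence test above can be retraced step by step in code. The following is a minimal sketch in Python using only the standard library. The helper `chi_square_tail` and all variable names are illustrative assumptions, not part of the slides; the helper uses the standard closed-form expressions for the right-tail area of the χ²-curve with a whole number of degrees of freedom, in place of the table lookup. The observed counts are the ones that appear in the χ² computation above.

```python
import math


def chi_square_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom.

    Uses the closed forms that hold for integer df (hypothetical helper,
    standing in for reading the chi-square table).
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail part plus a finite sum of odd powers of sqrt(x).
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(math.sqrt(half)) + 2.0 * density * total


# Observed frequencies: one row per type of handedness, columns are (men, women).
observed_table = [
    (934, 1070),  # right-handed
    (113, 92),    # left-handed
    (20, 8),      # ambidextrous
]

column_totals = [sum(col) for col in zip(*observed_table)]  # men, women
grand_total = sum(column_totals)

chi2_handedness = 0.0
for row in observed_table:
    row_total = sum(row)
    for observed_count, column_total in zip(row, column_totals):
        # Under independence, each sex has the same share of this handedness type.
        expected_count = row_total * column_total / grand_total
        chi2_handedness += (observed_count - expected_count) ** 2 / expected_count

df_handedness = (len(observed_table) - 1) * (len(column_totals) - 1)
p_handedness = chi_square_tail(chi2_handedness, df_handedness)

print(f"chi-square = {chi2_handedness:.1f}, df = {df_handedness}, "
      f"P-value = {100 * p_handedness:.2f}%")
```

This sketch keeps the expected frequencies unrounded, whereas the slides round them to whole numbers, so the statistic differs slightly from the hand computation; both round to 12, and the tail area is a small fraction of 1% either way. The same helper applied to the die example, `chi_square_tail(14.2, 5)`, gives an area a little above 1%, in line with the table lookup.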