Chi-Square Tests
1. Dice. After getting trounced by your little brother in a
children's game, you suspect the die he gave you to
roll may be unfair. To check, you roll it 60 times,
recording the number of times each face appears. Do
these results cast doubt on the die's fairness?
Face    1   2   3   4   5   6
Count  11   7   9  15  12   6
a) If the die is fair, how many times would you
expect each face to show?
b) To see if these results are unusual, will you test
goodness-of-fit, homogeneity, or independence?
c) State your hypotheses.
d) Check the conditions.
e) How many degrees of freedom are there?
f) Find χ² and the P-value.
g) State your conclusion.
2. M&Ms. As noted in an earlier chapter, the
Masterfoods Company says that until very recently
yellow candies made up 20% of its milk chocolate
M&M's, red another 20%, and orange, blue, and green
are each 10%. The rest are brown. On his way home
from work the day he was writing these exercises, one
of the authors bought a bag of plain M&M's. He got
29 yellow ones, 23 red, 12 orange, 14 blue, 8 green,
and 20 brown. Is this sample consistent with the
company's stated proportions? Test an appropriate
hypothesis and state your conclusion.
a) If the M&M's are packaged in the stated
proportions, how many of each color should the
author have expected to get in his bag?
b) To see if his bag was unusual, should he test
goodness-of-fit, homogeneity, or independence?
c) State the hypotheses.
d) Check the conditions.
e) How many degrees of freedom are there?
f) Find χ² and the P-value.
g) State a conclusion.
3. Nuts. A company says its premium mixture of nuts
contains 10% Brazil nuts, 20% cashews, 20%
almonds, and 10% hazelnuts, and the rest are peanuts.
You buy a large can and separate the various kinds of
nuts. Upon weighing them, you find there are 112
grams of Brazil nuts, 183 grams of cashews, 207
grams of almonds, 71 grams of hazelnuts, and 446
grams of peanuts. You wonder whether your mix is
significantly different from what the company
advertises.
a) Explain why the chi-square goodness-of-fit test is
not an appropriate way to find out.
b) What might you do instead of weighing the nuts
in order to use a χ² test?
4. Mileage. A salesman who is on the road visiting
clients thinks that on average he drives the same
distance each day of the week. He keeps track of his
mileage for several weeks and discovers that he
averages 122 miles on Mondays, 203 miles on
Tuesdays, 176 miles on Wednesdays, 181 miles on
Thursdays, and 108 miles on Fridays. He wonders if
this evidence contradicts his belief in a uniform
distribution of miles across the days of the week.
Explain why it is not appropriate to test his hypothesis
using the chi-square goodness-of-fit test.
5. NYPD and race. Census data for New York City
indicate that 29.2% of the under-18 population is
white, 28.2% black, 31.5% Latino, 9.1% Asian, and
2% other ethnicities. The New York Civil Liberties
Union points out that of 26,181 police officers, 64.8%
are white, 14.5% black, 19.1% Hispanic, and 1.4%
Asian. Do the police officers reflect the ethnic
composition of the city's youth? Test an appropriate
hypothesis and state your conclusion.
6. Violence against women. In its study When Men
Murder Women, the Violence Policy Center reported
that 2129 women were murdered by men in 1996. Of
these victims, a weapon could be identified for 2013
of them. Of those for whom a weapon could be
identified, 1139 were killed by guns, 372 by knives or
other cutting instruments, 158 by other weapons, and
344 by personal attack (battery, strangulation, etc.).
The FBI's Uniform Crime Report says that among all
murders nationwide the weapon use rates were as
follows: guns 63.4%, knives 13.1%, other weapons
16.8%, personal attack 6.7%. Is there evidence that
violence against women involves different weapons
than other violent attacks in the United States?
7. Fruit flies. Offspring of certain fruit flies may have
yellow or ebony bodies and normal wings or short
wings. Genetic theory predicts that these traits will
appear in the ratio 9:3:3:1 (9 yellow, normal: 3
yellow, short: 3 ebony, normal: 1 ebony, short). A
researcher checks 100 such flies and finds the
distribution of the traits to be 59, 20, 11, and 10,
respectively.
a) Are the results this researcher observed consistent
with the theoretical distribution predicted by the
genetic model?
b) If the researcher had examined 200 flies and
counted exactly twice as many in each category—
118, 40, 22, 20—what conclusion would he have
reached?
c) Why is there a discrepancy between the two
conclusions?
8. Pi. Many people know the mathematical constant π is
approximately 3.14. But that's not exact. To be more
precise, here are 20 decimal places:
3.14159265358979323846. Still not exact, though. In
fact, the actual value is irrational, a decimal that goes
on forever without any repeating pattern. But notice
that there are no 0's and only one 7 in the 20 decimal
places above. Does that pattern persist, or do all the
digits show up with equal frequency? The table shows
the number of times each digit appears in the first
million digits. Test the hypothesis that the digits 0
through 9 are uniformly distributed in the decimal
representation of π.
The first million digits of π
Digit   Count
0       99959
1       99758
2      100026
3      100229
4      100230
5      100359
6       99548
7       99800
8       99985
9      100106
9. Titanic. Here is a table showing who survived the
sinking of the Titanic based on whether they were
crew members, or passengers booked in first, second,
or third-class staterooms:
       Crew  First  Second  Third  Total
Alive   212    202     118    178    710
Dead    673    123     167    528   1491
Total   885    325     285    706   2201
a) If we draw an individual at random from this
table, what's the probability that we will draw a
member of the crew?
b) What's the probability of randomly selecting a
third-class passenger who survived?
c) What's the probability of a randomly selected
passenger surviving, given that the passenger was
a first-class passenger?
d) If someone's chances of surviving were the same
regardless of their status on the ship, how many
members of the crew would you expect to have
lived?
e) State the null and alternative hypotheses we
would test here.
f) Give the degrees of freedom for the test.
g) The chi-square value for the table is 187.8, and
the corresponding P-value is barely greater than 0.
State your conclusions about the hypotheses.
10. NYPD and gender. The table below shows the rank
attained by male and female officers in the New York
City Police Department. Do these data indicate that
men and women are equitably represented at all levels
of the department?
Rank            Male  Female
Officer       21,900   4,281
Detective      4,058     806
Sergeant       3,898     415
Lieutenant     1,333      89
Captain          359      12
Higher ranks     218      10
a) What's the probability that a police officer
selected at random from the NYPD is a female?
b) What's the probability that a police officer
selected at random is a detective?
c) Assuming no bias in promotions, how many
female detectives would you expect the NYPD to
have?
d) To see if there is evidence of differences in ranks
attained by males and females, will you test
goodness-of-fit, homogeneity, or independence?
e) State the hypotheses.
f) Test the conditions.
g) How many degrees of freedom are there?
h) Find χ² and the P-value.
i) State your conclusion.
j) If you concluded that the distributions are not the
same, analyze the differences using the
standardized residuals of your calculations.
11. Cranberry juice. It's common folk wisdom that
drinking cranberry juice can help prevent urinary tract
infections in women. In 2001 the British Medical
Journal reported the results of a Finnish study in which
three groups of 50 women were monitored for these
infections over 6 months. One group drank cranberry
juice daily, another group drank a lactobacillus drink,
and the third drank neither of those beverages, serving
as a control group. In the control group, 18 women
developed at least one infection compared with 20 of
those who consumed the lactobacillus drink and only
8 of those who drank cranberry juice. Does this study
provide supporting evidence for the value of cranberry
juice in warding off urinary tract infections?
a) Is this a survey, a retrospective study, a
prospective study, or an experiment? Explain.
b) Will you test goodness-of-fit, homogeneity, or
independence?
c) State the hypotheses.
d) Test the conditions.
e) How many degrees of freedom are there?
f) Find χ² and the P-value.
g) State your conclusion.
h) If you concluded that the groups are not the same,
analyze the differences using the standardized
residuals of your calculations.
12. Cars. A random survey of autos parked in student and
staff lots at a large university classified the brands by
country of origin, as seen in the table. Are there
differences in the national origins of cars driven by
students and staff?
              Driver
Origin     Student  Staff
American       107    105
European        33     12
Asian           55     47
a) Is this a test of independence or homogeneity?
b) Write appropriate hypotheses.
c) Check the necessary assumptions and conditions.
d) Find the P-value of your test.
e) State your conclusion and analysis.
13. Montana. A 1992 poll conducted by the University of
Montana classified respondents by gender and political
party, as shown in the table. We wonder if there is
evidence of an association between gender and party
affiliation.
         Democrat  Republican  Independent
Male           36          45           24
Female         48          33           16
a) Is this a test of homogeneity or independence?
b) Write an appropriate hypothesis.
c) Are the conditions for inference satisfied?
d) Find the P-value for your test.
e) State a complete conclusion.
15. Montana revisited. The poll described in Exercise 13
also investigated the respondents' party affiliations
based on what area of the state they lived in. Test an
appropriate hypothesis about this table, and state your
conclusions.
           Democrat  Republican  Independent
West             39          17           12
Northeast        15          30           12
Southeast        30          31           16
16. Working parents. In July 1991 and again in April
2001 the Gallup Poll asked random samples of 1015
adults about their opinions on working parents. The
table summarizes responses to the question
"Considering the needs of both parents and children,
which of the following do you see as the ideal family
in today's society?"
                                       1991  2001
Both work full time                     142   131
One works full time, other part time    274   244
One works, other works at home          152   173
One works, other stays home for kids    396   416
No opinion                               51    51
a) Is this a survey, a retrospective study, a
prospective study, or an experiment? Explain.
b) Will you test goodness-of-fit, homogeneity, or
independence?
c) Based on these results, do you think there was a
change in people's attitudes during the 10 years
between these polls?
17. Grades. Two different professors teach an
introductory Statistics course. The table shows the
distribution of final grades they reported. We wonder
whether one of these professors is an "easier" grader.
     Prof. Alpha  Prof. Beta
A              3           9
B             11          12
C             14           8
D              9           2
F              3           1
a) Will you test goodness-of-fit, homogeneity, or
independence?
b) Write appropriate null hypotheses.
c) Find the expected counts for each cell, and
explain why the chi-square procedures are not
appropriate for this table.
19. Grades again. In some situations where the expected
cell counts are too small, as in the case of the grades
given by Professors Alpha and Beta in Exercise 17,
we can complete an analysis anyway. We can often
proceed after combining cells in some way that both
makes sense and produces a table in which the
conditions are satisfied. Here we create a new table
displaying the same data, but calling D's and F's
"Below C", as shown.
         Prof. Alpha  Prof. Beta
A                  3           9
B                 11          12
C                 14           8
Below C           12           3
a) Find the expected counts for each cell in this new
table, and explain why a chi-square procedure is
now appropriate.
b) With this change in the table, what has happened
to the number of degrees of freedom?
c) Test your hypothesis about the two professors,
and state an appropriate conclusion.
Answers
1. a) 10
b) Goodness-of-fit
c) H0: The die is fair (all faces have p = 1/6). HA:
The die is not fair.
d) Count data; rolls are random and independent;
expected frequencies all bigger than 5.
e) 5
f) χ² = 5.600, P-value = 0.3471
g) Because the P-value is high, do not reject H0. The
data show no evidence that the die is unfair.
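The arithmetic behind answer 1 can be sketched in a few lines of Python. This is an illustrative check, not part of the original answer key: the helper name chi2_sf is hypothetical, and it uses the standard finite-sum formulas for the chi-square upper tail with integer degrees of freedom, so only the standard library is needed. It should land close to the statistic and P-value quoted in part f).

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df.

    Hypothetical helper for illustration; uses the closed-form finite sums
    that exist when df is a whole number.
    """
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*Z(chi) * sum_{r>=1} chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    z = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # standard normal density at chi
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * z * total

observed = [11, 7, 9, 15, 12, 6]          # counts for faces 1..6 in 60 rolls
n = sum(observed)
expected = [n / 6] * 6                    # a fair die gives 10 per face
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                    # categories minus one
p_value = chi2_sf(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, P-value = {p_value:.4f}")
```

The same helper works for any goodness-of-fit exercise here once the observed counts and expected counts are swapped in.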
2. a) Yellow and red: 21.2, orange, blue and green: 10.6,
brown: 31.8
b) Goodness-of-fit
c) H0: The distribution is as specified by the company.
HA: The distribution is not as specified.
d) Count data; Bag may not be a random sample, but
most likely representative; Expected counts are all
bigger than 5.
e) 5
f) χ² = 9.315, P-value = 0.0972
g) Because the P-value is high, do not reject H0. These
data do not provide evidence that the distribution is
other than specified.
3. a) Weights are quantitative, not counts.
b) Count the number of each kind of nut, assuming the
company's percentages are based on counts rather than
weights.
4. Data are averages, not counts.
5. H0: The police force represents the population (29.2%
white, 28.2% black, etc.). HA: The police force is not
representative of the population. χ² = 16516.88, df =
4, P-value < 0.0001. Because the P-value is so low, we
reject H0. These data show that the police force is not
representative of the population. In particular, there
are too many white officers in relation to their
membership in the community.
6. H0: Murders among women have the same causes as all
murders (63.4% guns, etc.). HA: Murders among
women have different causes than all murders. χ² =
479.508, df = 3, P-value < 0.0001. Because the
P-value is so low, we reject H0. Women's murders do
not follow the same pattern of cause as all murders
nationwide. Women are much less likely to be killed
by other weapons and more likely to be killed by
personal attack.
7. a) χ² = 5.671, df = 3, P-value = 0.1288. With a P-value
this high, we fail to reject H0. Yes, these data are
consistent with those predicted by genetic theory.
b) χ² = 11.342, df = 3, P-value = 0.0100. Because of
the low P-value, we reject H0. These data provide
evidence that the distribution is not as specified by
genetic theory.
c) With small samples, many more data sets will be
consistent with the null hypothesis. With larger
samples, small discrepancies will show evidence
against the null hypothesis.
8. H0: Digits are all equally likely (all occur with
frequency 1/10). HA: Digits are not all equally likely.
χ² = 5.509, df = 9, P-value = 0.7879. Because the
P-value is large, we do not reject H0. These data provide
no evidence that the digits in pi are not all equally
likely.
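The statistics quoted in answers 7 and 8 can be reproduced with a short sketch. The function name chi_square_statistic is an illustrative choice, not something from the worksheet. It also shows why the two fruit-fly conclusions differ: doubling every count doubles the statistic while the degrees of freedom stay at 3.

```python
def chi_square_statistic(observed, proportions):
    """Sum of (observed - expected)^2 / expected for hypothesized proportions."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, proportions))

# Exercise 7: the 9:3:3:1 genetic model
ratio = [9 / 16, 3 / 16, 3 / 16, 1 / 16]
flies_100 = chi_square_statistic([59, 20, 11, 10], ratio)
flies_200 = chi_square_statistic([118, 40, 22, 20], ratio)

# Exercise 8: digits 0-9 in the first million digits of pi, uniform model
pi_counts = [99959, 99758, 100026, 100229, 100230,
             100359, 99548, 99800, 99985, 100106]
pi_stat = chi_square_statistic(pi_counts, [0.1] * 10)

print(f"100 flies: {flies_100:.3f}")
print(f"200 flies: {flies_200:.3f}")
print(f"pi digits: {pi_stat:.3f}")
```

With a million digits, even tiny departures from uniformity would inflate the statistic, so the small value for pi is notable.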
9. a) 40.2% b) 8.1% c) 62.2% d) 285.48
e) H0: Survival was independent of status on the ship.
HA: Survival depended on the status.
f) 3
g) We reject the null hypothesis. Survival depended
on status. We can see that first-class passengers were
more likely to survive than any other class.
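Parts a) through d) and f) of answer 9 come straight from the table's margins. A minimal sketch, with variable names chosen here for illustration:

```python
# Titanic counts by status, copied from the exercise table
alive = {"Crew": 212, "First": 202, "Second": 118, "Third": 178}
dead = {"Crew": 673, "First": 123, "Second": 167, "Third": 528}

col_total = {k: alive[k] + dead[k] for k in alive}
total_alive = sum(alive.values())
grand_total = total_alive + sum(dead.values())

p_crew = col_total["Crew"] / grand_total               # a) marginal probability
p_third_and_alive = alive["Third"] / grand_total       # b) joint probability
p_alive_given_first = alive["First"] / col_total["First"]  # c) conditional probability

# d) Under independence: (row total * column total) / grand total
expected_crew_alive = total_alive * col_total["Crew"] / grand_total

# f) Degrees of freedom: (rows - 1) * (columns - 1)
df = (2 - 1) * (len(alive) - 1)

print(f"{p_crew:.1%} {p_third_and_alive:.1%} {p_alive_given_first:.1%}")
print(f"expected crew survivors: {expected_crew_alive:.2f}, df = {df}")
```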
10. a) 15.0%
b) 13.0%
c) 729.6
d) Independence
e) H0: Rank is independent of gender. HA: Rank and
gender are not independent.
f) Count data; not a random sample, but all NYPD
officers; expected counts all greater than 5.
g) 5
h) χ² = 290.131, P-value < 0.0001
i) Because the p-value is so low, we reject H0. Gender
and rank in the NYPD are not independent.
j) Standardized residuals are
                 Male   Female
Officer       -2.3434   5.5747
Detective     -1.1759   2.7973
Sergeant       3.8429  -9.1421
Lieutenant     3.5824  -8.5223
Captain        2.4617  -5.8563
Higher ranks   1.7412  -4.1423
Women are overrepresented at the lower ranks and
underrepresented at every rank from sergeant up.
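The standardized residuals in part j) are (observed − expected) / √expected for each cell, and their squares add up to the χ² statistic. A sketch of that calculation, with variable names chosen here for illustration:

```python
import math

ranks = ["Officer", "Detective", "Sergeant", "Lieutenant", "Captain", "Higher ranks"]
male = [21900, 4058, 3898, 1333, 359, 218]
female = [4281, 806, 415, 89, 12, 10]

col_totals = [sum(male), sum(female)]
grand_total = sum(col_totals)

residuals = []  # one [male, female] pair per rank
for m, f in zip(male, female):
    row_total = m + f
    row = []
    for obs, col_total in zip((m, f), col_totals):
        exp = row_total * col_total / grand_total  # expected count under independence
        row.append((obs - exp) / math.sqrt(exp))
    residuals.append(row)

# The chi-square statistic is the sum of the squared residuals
chi_square = sum(r ** 2 for row in residuals for r in row)

for rank, (rm, rf) in zip(ranks, residuals):
    print(f"{rank:<13}{rm:>9.4f}{rf:>9.4f}")
print(f"chi-square = {chi_square:.3f}")
```

Cells with the largest absolute residuals (here the female sergeant and lieutenant cells) contribute most to the statistic.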
11. a) Experiment — actively imposed treatments
(different drinks)
b) Homogeneity
c) H0: The rate of urinary tract infection is the same
for all three groups. HA: The rate of urinary tract
infection is different among the groups.
d) Count data; random assignment to treatments; all
expected frequencies larger than 5.
e) 2
f) χ² = 7.776, P-value = 0.020
g) With a P-value this low, we reject H0. These data
provide reasonably strong evidence there is a
difference in urinary tract infection rates between
cranberry juice drinkers, lactobacillus drinkers, and
the control group.
h) The standardized residuals are
              Cranberry  Lactobacillus    Control
Infection      -1.87276       1.191759   0.681005
No Infection   1.245505       -0.79259   -0.45291
From the standardized residuals (and the sign of the
residuals), it appears those who drank cranberry juice
were less likely to develop urinary tract infections;
those who drank lactobacillus were more likely to
have infections.
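The test in answer 11 can be checked with a short sketch. It relies on two simplifications that hold for this table only: every group has 50 women, so all three groups share the same expected counts, and with 2 degrees of freedom the chi-square upper tail is exactly exp(−χ²/2). Variable names are illustrative.

```python
import math

groups = ["Cranberry", "Lactobacillus", "Control"]
infected = [8, 20, 18]
group_size = 50

not_infected = [group_size - x for x in infected]
n = group_size * len(groups)

# Under H0 every group shares the pooled infection rate
pooled_rate = sum(infected) / n
expected_inf = group_size * pooled_rate
expected_no = group_size - expected_inf

stat = (sum((o - expected_inf) ** 2 / expected_inf for o in infected)
        + sum((o - expected_no) ** 2 / expected_no for o in not_infected))
df = (2 - 1) * (len(groups) - 1)

# Exact tail probability for df = 2 only; other df need a different formula
p_value = math.exp(-stat / 2)

print(f"chi-square = {stat:.3f}, df = {df}, P-value = {p_value:.3f}")
```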
12. a) Homogeneity
b) H0: The distribution of car origin is the same for
students and staff. HA: The distribution of car origin is
different for students and staff.
c) Count data; random survey of cars in lots (probably
can't generalize to other universities); expected
frequencies greater than 5.
d) χ² = 7.828, df = 2, P-value = 0.020.
e) With a P-value this low, we reject H0. The
distribution of car origins differs between students and
staff. Examination of the residuals shows that students
are more likely to drive European cars than staff and
less likely than staff to drive American cars.
13. a) Independence
b) H0: Political affiliation is independent of gender.
HA: There is a relationship between political affiliation
and gender.
c) Count data; probably a random sample, but can't
extend results to other states; all expected frequencies
greater than 5.
d) χ² = 4.851, df = 2, P-value = 0.0884
e) Because of the high P-value, we do not reject H0.
These data do not provide evidence of a relationship
between political affiliation and gender.
15. H0: Political affiliation is independent of region. HA:
There is a relationship between political affiliation and
region. χ² = 13.849, df = 4, P-value = 0.0078. With a
P-value this low, we reject H0. Political affiliation and
region are related. Examination of the residuals shows
that those in the West are more likely to be Democrat
than Republican; those in the Northeast are more
likely to be Republican than Democrat.
16. a) Survey
b) Homogeneity
c) χ² = 4.030, df = 4, P-value = 0.4019. Because the
P-value is so high, we fail to reject H0. These data do
not show evidence of a change in attitudes about the
ideal family between 1991 and 2001.
17. a) Homogeneity
b) H0: The grade distribution is the same for both
professors. HA: The grade distributions are different.
c)
   Dr. Alpha  Dr. Beta
A      6.667     5.333
B     12.778    10.222
C     12.222     9.778
D      6.111     4.889
F      2.222     1.778
Three cells have expected frequencies less than 5.
19. a) Expected counts for the new table:
         Dr. Alpha  Dr. Beta
A            6.667     5.333
B           12.778    10.222
C           12.222     9.778
Below C      8.333     6.667
All expected frequencies are now larger than 5.
b) Decreased from 4 to 3.
c) χ² = 9.306, P-value = 0.0255. Because the P-value
is so low, we reject H0. The grade distributions for the
two professors are different. Dr. Alpha gives fewer
A's and more grades below C than Dr. Beta.
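The effect of merging the D and F rows (Exercises 17 and 19) can be sketched as follows. The helper names expected_counts and small_cells are illustrative, and the P-value line uses a closed form that is valid only for 3 degrees of freedom.

```python
import math

def expected_counts(table):
    """Expected counts (row total * column total / grand total) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def small_cells(table, threshold=5):
    """Number of cells whose expected count falls below the threshold."""
    return sum(e < threshold for row in expected_counts(table) for e in row)

# Rows are grades A, B, C, D, F; columns are (Alpha, Beta)
original = [[3, 9], [11, 12], [14, 8], [9, 2], [3, 1]]
# Merge the D and F rows into a single "Below C" row
combined = original[:3] + [[d + f for d, f in zip(original[3], original[4])]]

exp = expected_counts(combined)
stat = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(combined, exp)
           for o, e in zip(obs_row, exp_row))
df = (len(combined) - 1) * (len(combined[0]) - 1)

# Upper-tail probability, closed form valid only for df = 3
p_value = (math.erfc(math.sqrt(stat / 2))
           + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))

print(f"cells below 5: {small_cells(original)} before, {small_cells(combined)} after")
print(f"chi-square = {stat:.3f}, df = {df}, P-value = {p_value:.4f}")
```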