Chapter 2 Solutions
Page 12 of 28
2.34
Yes, a stem-and-leaf plot provides sufficient information to determine whether a dataset contains an outlier.
Because all individual values are shown, it is possible to see whether any values are inconsistent with
the bulk of the data.
2.35
The answers will differ for each student.
a. You may be most interested in knowing the average value because it would provide some information
about what kind of salary to expect. You may also like to know the spread because the average value
would be less important if the annual salary of employees varied widely.
b. Each summary has interest here. The maximum would give information about whether an A is possible
with each instructor. For a generally average student, the average might have the most interest. The spread
would give information about whether most students performed about the same or whether there was great
variability among students.
c. The average may be the most informative about personal life expectancy, although the maximum and
the spread would also be useful and interesting general information.
2.36
The answers will differ for each student.
a. Most may prefer the data value to be near the average. If the data value were an outlier, the number of
children would be excessive, not something most would want (although some may prefer this).
b. An outlier on the high side seems preferable. Most would want to make more money than everyone else
does.
c. An outlier on the high side would be preferable because that would be great gas mileage. But, an outlier
on the low side is not desirable.
d. An outlier on the low side would be desirable because most would like to live in a town with a really
low crime rate. In reality, there may not be outliers on the low side because many small towns have low
crime rates, so a single low rate may not stand apart from other values. In that case, the average would be
desirable.
2.37
a. Mean = 74.33; median = 74.
b. Mean = 25; median = 7.
c. Mean = 27.5; median = 30.
2.38
100 is a large value compared to the rest of the values; it causes the mean to increase, while not affecting
the median.
2.39
a. 225 − 123 = 102 lbs.
b. 190 − 155 = 35 lbs.
c. 50%.
2.40
a. 12 letters.
b. 13 letters.
c. The IQR for males is 17 − 10 = 7, while for females it is 15 − 10 = 5. The IQR is larger for males.
d. 23 − 6 = 17 letters.
e. 23 − 6 = 17 letters.
2.41
a. (122 + 123)/2 = 122.5.
b. Q1 = 114 and Q3 = 129.5.
c. IQR = Q3 − Q1 = 129.5 − 114 = 15.5.
d. 1.5 × IQR = 23.25. A number will be considered an outlier if it is either below Q1 − 23.25 =
114 − 23.25 = 90.75 or above Q3 + 23.25 = 129.5 + 23.25 = 152.75. No values fit this criterion, so
there are no outliers.
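The 1.5 × IQR rule applied in Exercise 2.41 can be sketched in a few lines of Python. This is only an illustration that starts from the quartiles reported in part (b); the function name `iqr_fences` is an assumption made here and does not come from the text.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported in Exercise 2.41(b)
lower, upper = iqr_fences(114, 129.5)
print(lower, upper)

def is_outlier(value, q1=114, q3=129.5):
    """True if value falls outside the fences built from q1 and q3."""
    lo, hi = iqr_fences(q1, q3)
    return value < lo or value > hi
```

Any data value can then be passed to `is_outlier` to see whether it falls outside the fences.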
Chapter 2 Solutions
Page 19 of 28
2.76
a. (100+120)/2 = 110 million.
b. 35 million.
c. 55 million.
d. Group 1, range = (200−2) = 198 million. Group 2, range = (95−8) = 87 million. Group 3 had a range
of (300-5) = 295 million. Therefore, Group 3 had the largest range and Group 2 had the smallest range.
2.77
a. Q1 = 35; Q3 = 150.
b. Q1 = 14; Q3 = 45.5.
c. Q1 = 23; Q3 = 150.
d. [Figure for Exercise 2.77d]
2.78
a. Telephone exchange is a categorical variable.
b. Number of telephones is a quantitative variable.
c. Dollar amount of last month’s phone bill is a quantitative variable.
d. Long distance phone company used is a categorical variable.
2.79
a. [Figure for Exercise 2.79]
b. Range = maximum−minimum = 1949−1559 = 390 mm.
The range often spans a distance of about 6 standard deviations (see p.44 of the text), so the estimated
standard deviation is Range/6 =390/6 = 65 mm.
c. The approximation is appropriate if it is assumed that the heights follow a bell-shaped distribution.
d. The interval mean ± 3 standard deviations should cover about 99.7% of the values. This interval is
1732.5 mm ± (3 × 68.8 mm), which is 1732.5 mm ± 206.4 mm. The interval spans from 1526.1 mm to
1938.9 mm so it does not cover the maximum value of 1949 mm.
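The arithmetic in parts (b) and (d) of Exercise 2.79 can be reproduced with a short Python sketch. The helper names `sd_from_range` and `three_sd_interval` are hypothetical, chosen here for illustration; the numbers are the ones quoted in the solution.

```python
def sd_from_range(minimum, maximum, spread_in_sds=6):
    """Rough standard deviation estimate: the range divided by about 6."""
    return (maximum - minimum) / spread_in_sds

def three_sd_interval(mean, sd):
    """Interval mean +/- 3 standard deviations."""
    return mean - 3 * sd, mean + 3 * sd

sd_estimate = sd_from_range(1559, 1949)      # part (b), in mm
low, high = three_sd_interval(1732.5, 68.8)  # part (d), in mm
covers_max = low <= 1949 <= high             # does the interval reach the maximum?
print(sd_estimate, round(low, 1), round(high, 1), covers_max)
```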
2.80
a. Yes, a variable can be both explanatory and categorical. The phrase "explanatory variable" means that
the variable might influence a response variable, and there is no restriction concerning whether the
explanatory variable is categorical or quantitative (or ordinal). For an example, consider Example 2.2 in
which type of nighttime lighting is a categorical explanatory variable.
b. No, a variable cannot be both continuous and ordinal. The term "ordinal variable" is used when the raw
data are ordered categories while "continuous" means that all values in an interval are possible.
Supplemental note: Some might ask whether the term "ordinal" could apply to a continuous variable
because the raw data can be used to order the sample observations. A restriction of ordinal numbers,
however, is that all possible values can be counted. For a continuous variable, all possible values
in an interval cannot be counted: there are always infinitely many values between any two points in
the interval, so it is impossible to determine what "exact" value is second, third, and so on.
c. Yes, a variable can be both quantitative and a response variable. Generally, there is no restriction
concerning whether a response variable is quantitative, categorical, or ordinal. For an example, consider
Example 2.5 in which the quantitative response variable is right handspan (and the explanatory variable is
gender).
d. No, a variable cannot be both categorical and bell-shaped. It's not appropriate to assign any shape to the
distribution of a categorical variable because possible values are category labels that have no ordering.
Simply changing the order of listing the categories can change the "shape" of a bar chart.
e. Yes, a variable can be both bell-shaped and a response variable. An example is verbal SAT, which is
designed to have a bell-shaped distribution, and would be a response variable in a comparison of the
verbal SAT scores of females and males.
2.81
a. This will differ for each student.
b. Kind of coin (penny, nickel, etc.) is an ordinal variable because the kinds can be ordered by their
monetary value.
c. The total monetary value of the coins is a quantitative variable.
d. “Kind of coin” is the data of interest when answering the question “Which coin occurred most often?”
“Total monetary value of the coins” is the data of interest in answering the question “What is the average
amount of change per student?”
2.82
This will differ for each student.
2.83
This will differ for each student.
2.84
a. The heights of all of the children in an elementary school will have a larger standard deviation because
there will be a wide variety of different aged students included. This will lead to a wide variety of heights.
b. The systolic blood pressure for 30 people who visit a health clinic in one day will have a larger standard
deviation because there is more variability from person to person as far as blood pressure is concerned than
for measurements made on the same person from day to day.
c. The SAT scores for the students in an honors class will have a larger standard deviation. Even though it
is an honors class and most students will do well on the SAT, the range of SAT scores is likely to be much
larger than the range of scores on an English final examination, which can span at most 0 to 100. This
larger range will lead to a larger standard deviation.
2.85
a. The mean will be larger than the median. While most households may have between 0 and 4 or so
children, there will be some households with large numbers of children, so the distribution will be skewed
to the right.
b. The mean will be larger than the median. People like Bill Gates will create large outliers. And, generally
income data tends to be skewed to the right because high incomes can become quite high but incomes can't
be any lower than 0.
c. If all of the high school students are included, the mean will be higher than the median. This is because
many high school students are too young to work or do not want to work, resulting in many students with
$0 income earned in a job outside the home. There is even a chance the median could be 0!
d. The mean is 10.33 cents. Calculate this assuming there is one of each type of coin. The calculation is
(1+5+25)/3 = 31/3 = 10.33. The exact number of each type of coin doesn't matter. As long as there are
equal numbers of each type, the mean will be 10.33 cents. The median is the middle amount so it will be 5
cents. The mean is higher than the median because the monetary amounts are skewed to the right.
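As a quick check of part (d), the following Python sketch computes the mean and median for one coin of each kind, and then for four of each, to illustrate that equal counts leave both summaries unchanged. The choice of four copies is arbitrary and made here only for illustration.

```python
from statistics import mean, median

# One penny, one nickel, and one quarter (values in cents), as in part (d)
coins = [1, 5, 25]
print(round(mean(coins), 2), median(coins))

# Equal numbers of each kind (here four of each) give the same mean and median
many = coins * 4
print(round(mean(many), 2), median(many))
```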
Chapter 7 Solutions
Page 1 of 19
CHAPTER 7
EXERCISE SOLUTIONS
7.1
Random Circumstance: Flight arrival time for a randomly selected flight on one of the top ten U.S.
airlines during that time period.
• Flight arrives on time (or early) with probability .761
• Flight arrives late with probability .239
7.2
1000/125000 =1/125, or .008
7.3
1/16 = 0.0625. After four students have been selected, sixteen remain as candidates, each with an
equal chance to be picked.
7.4
a. Yes.
b. Yes.
c. Yes.
d. No. A probability cannot be greater than 1.
e. No. A probability cannot be negative.
7.5
a. Relative frequency probability; the proportion of times the outcome occurs in the long run.
b. Personal probability.
c. Relative frequency probability; proportion of a “large” random sample that falls into the
category of interest.
7.6
a. A car dealer has noticed that 1/25 (or .04) of new car buyers will return their cars for warranty
work within the first month.
b. A car dealer has noticed that 4% of new car buyers return their cars for warranty work within
the first month.
c. A car dealer has noticed that the probability is 1/25 (or .04) that a new car buyer will return the
car for warranty work within the first month.
7.7
a. The relative frequency interpretation of probability applies here. The probability was most
likely determined by observing the number of Americans injured by lightning during a number of
years and dividing this by the average population in those years.
b. The personal interpretation of probability applies here. The probability was determined from
the neighbor’s previous experience with tomato plants and her knowledge of the soil, sunlight and
other conditions where her plants are grown.
c. The relative frequency interpretation of probability applies here. The probability was
determined by observing many, many properly cared for tomato plants, counting the number of
plants that produced tomatoes, and dividing by the total number of plants observed.
d. The relative frequency interpretation of probability applies here. The probability was
determined by observing many U.S. couples and noting the proportion of couples in which the
husband outlived the wife.
7.8
Random Circumstance 1: Song on the radio when first turned on
• Robin’s favorite song is playing
• Robin’s favorite song is not playing
Random Circumstance 2: Color of traffic light when Robin approaches the main intersection
• Traffic light is green when Robin arrives
• Traffic light is red or yellow when Robin arrives
Random Circumstance 3: Nearest available parking space
• Robin finds an empty parking space in front of the building
• Robin does not find an empty parking space in front of the building
7.9
For the first circumstance, Robin could repeatedly note whether or not her favorite song is playing
when she first turns on the radio. The probability that her favorite song is playing is the number of
times her favorite song is playing divided by the total number of days she did this.
For the second circumstance, Robin could repeatedly note whether or not the traffic light was
green when she arrived, and divide the number of times it was green by the total number of days
she did this.
For the third circumstance, Robin could repeatedly note whether or not she quickly found an
empty parking spot in front of the building, and divide the number of times she found a good
parking spot by the total number of times she did this.
7.10
a. The probability of a 6 = 1/6.
b. The probability of a 1 or 2 is 2/6 = 1/3.
c. The probability of an even number is the probability of a 2, 4, or 6. This probability = 3/6 =1/2.
7.11
An individual could determine his or her probability of winning this game by playing it a large
number of times and recording how many games he or she won out of the total number of games
played.
7.12
This will differ for each student.
7.13
This will differ for each student. One possible example of a situation in which a probability
statement makes sense, but for which the relative frequency interpretation could not apply is the
event that the Yankees win the World Series. The probability that the Yankees win the World
Series changes from year to year. The relative frequency method cannot be used to determine a
probability that is always changing.
7.14
No, this does not mean that Alicia will be called on to answer the first question exactly once
during the semester. In the long run, if this statistics class had many meetings, the proportion of
times that Alicia would be called on for the first question is 1/50. This semester she may be called
on 0, 1, 2, or even more times during the 50 class meetings.
7.15
John's reasoning is not correct. In the long run, if he repeatedly plays the lottery, the proportion of
times he would win is 1/1000. This does not mean that he will definitely win once every 1000
times he tries.
7.16
Of the children who slept in darkness, the number with myopia or high myopia is 15 + 2 = 17.
So, the probability = 17/172 = .0988 that a randomly selected child who slept in darkness would
develop some degree of myopia.
7.17
a. The approximate probability = 22/190 = .1158 that a randomly selected person will pick the
number 3.
b. The approximate probability = (2+6)/190 = 8/190 = .0421 that a randomly selected person will
pick either 1 or 10.
c. The approximate probability = (2+22+18+56+14)/190 = 112/190 = .5895 that a randomly
selected person will pick an odd number.
7.18
a. BY, BS, BA, YS, YA, SA.
b. 1/6.
7.19
a. Yes, because they don’t contain any of the same outcomes (simple events). Part of the
definition of A^c is that it does not contain any of the same simple events as A.
b. No, they are dependent events. If A occurs then A^c cannot occur. If A does not occur, then A^c
must occur.
7.20
a. Yes. The outcome for one coin does not affect the probabilities for the other coin.
b. No. The outcomes for the two coins apply to separate random circumstances (the outcome for
each coin is one random circumstance) and complementary events are defined only for the same
random circumstance.
c. No. A particular outcome of the nickel, for instance, doesn’t exclude any outcome of the penny
from occurring when both coins are flipped.
7.21
a. No. Mutually exclusive events with positive probability can never be independent. Because A
and B are mutually exclusive events, P(A and B) = 0. If A and B were independent, it would be
true that P(A and B) = P(A)P(B) = (1/2)(1/3) = 1/6, which is not the case in this problem. A
different “proof” is that if we knew that A had occurred, then we would know that B could not
have occurred because they are mutually exclusive. This makes them dependent events as P(B|A)
= 0 is not equal to P(B) = 1/3.
b. No. Probabilities for complementary events must add to 1 but P(A) + P(B) = (1/2) + (1/3) =
5/6 ≠ 1.
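The two checks in Exercise 7.21 can be written out with exact fractions. The sketch below uses Python's standard `fractions` module; the variable names are chosen here for illustration.

```python
from fractions import Fraction

p_a = Fraction(1, 2)
p_b = Fraction(1, 3)
p_a_and_b = Fraction(0)  # mutually exclusive: A and B cannot both occur

# Independence would require P(A and B) = P(A)P(B)
independent = p_a_and_b == p_a * p_b
# Complementary events would require P(A) + P(B) = 1
complementary = p_a + p_b == 1
print(p_a * p_b, independent, p_a + p_b, complementary)
```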
7.22
a. Yes. The probability for each event is between 0 and 1, and over all possible outcomes, the sum
of the probabilities equals 1.
b. No. The sum of the probabilities exceeds 1.
c. Yes. The sum of the probabilities is less than 1, and each individual probability is between 0
and 1.
7.23
a. {(J,J), (J,L), (L,J), (L,L)}
b. Yes, each simple event has probability = .25, calculated as (.5)(.5). On each flip, each person
has probability =.5 that they will pay.
c. P{(L,L)} = ¼.
7.24
a. Yes, B and C are independent because the outcome of the drawing one week is unrelated to the
outcome the next week.
b. Now B and C are not independent. If Vanessa wins in week 1, her card from that week will no
longer be part of the drawing the next week, thus changing her probability of winning in week 2.
7.25
a. Yes, these are disjoint events because the red die cannot be a 3 and a 6 at the same time.
b. No, these are not disjoint events. The red die can be 3 in the same toss that the green die is a 6.
c. No, these are not disjoint events. Both A and B occur when the red die is a 3 and the green die
is a 1.
d. Yes, these are disjoint events. The dice cannot possibly sum to 4 if the red die is a 4.
7.26
a. No, they are dependent. For instance, if the red die is known to be a 3, the red die cannot
possibly be a 6.
b. Yes, they are independent. Knowing whether the red die is a 3 or not does not alter the
probability the green die is a 6.
c. No, they are dependent. If the dice are known to sum to 4, the probability the red die is a 3 is
affected because the red die could then not possibly be 4, 5, or 6.
d. No, they are dependent. If the dice sum to 4, for instance, the red die could not possibly be a 4.
7.27
A = event that a woman is between the ages of 20 and 24.
B = event that a woman is between the ages of 40 and 44.
C = event a woman can bear a child.
P(C|A) = .90. P(C|B) = .37.
7.28
Observing the relative frequency was the method used to find the 90 percent chance and 37
percent chance. Researchers probably repeatedly observed women of various age groups trying to
bear children and recorded the proportion who were able to do so.
7.29
Age and fertility status are dependent. The probability of being fertile changes with age.
7.30
a. For each of C1, C2, and C3, the unconditional probability is 1/50. Prior to any
selections, each of the 50 students has the same chance to be picked for any question.
b. P(C3|C1) = 0. If Alicia is picked for the first question, her name is taken out of the bag so she
cannot possibly be picked for the third question.
c. No, C1 and C3 are not independent events. They are dependent because P(C3) = 1/50, but
P(C3|C1) = 0. The probability for C3 changes if it is known that C1 has happened.
7.31
On any given day, 3 of the 50 students are chosen to answer questions. If the drawing is fair, each
of the 50 students has the same probability to be picked for any one of the three questions. So,
prior to any draws, the probability is 3/50 that Alicia will be one of the 3 students picked.
7.32
a. A^c = flipping either 0, 1, or 2 heads in the 3 tosses of the coins. Put another way, it is the event
that at least one tail occurs in the three flips.
b. 7/8. This can be found as 1− P(A) = 1−(1/8) = 7/8.
7.33
a. Not getting the same number on both dice.
b. 5/6.
7.34
a. ½.
b. ½.
c. (½)(½) = ¼.
d. ½ + ½ − ¼ = ¾.
7.35
a. No, they are not independent. P(A in both classes) = .50, which does not equal
P(A in English) × P(A in history) = (.70)(.60) = .42, as it would for independent events.
b. P(A in either English or history) = P(A in English) + P(A in history) − P(A in both classes) =
.70 + .60 − .50 = .80.
7.36
a. P(A) = .55; P(A^c) = .45; P(B|A) = .80; P(B|A^c) = .10.
b. P(A and B) = P(A)P(B|A) = (.55)(.80) = .44. This is the probability of being a Republican and
voting for Candidate X.
c. P(A^c and B) = P(A^c)P(B|A^c) = (.45)(.10) = .045. This is the probability of being a
non-Republican and voting for Candidate X.
d. P(B) = P(A and B) + P(A^c and B) = .44 + .045 = .485.
e. Candidate X received 48.5% of the votes.
7.37
a. With replacement, because each digit can be 0, 1, ..., 9 even if that number has been used.
b. Without replacement, because 3 different students need to be chosen.
c. Without replacement, because 5 different people are needed.
d. Without replacement, because 2 different teams are needed.
e. With replacement, because if you happen to be unlucky, the police officer could stop you on more
than just one day.
7.38
a. Probability = 11/12 that the first stranger does not share your birth month.
b. Probability = 11/12 that the second stranger does not share your birth month.
c. Probability = (11/12)(11/12) = 121/144 = .84 that neither shares your birth month. Use the
multiplication rule for two independent events (Rule 3b).
d. P(at least one) = 1 − P(neither) = 1 − .84 = .16, or 23/144.
The event that at least one of the two shares your birth month is the complement of the event that
neither does.
7.39
a. P(first stranger shares your birth month) = 1/12.
b. P(second stranger shares your birth month) = 1/12.
c. P(both share your birth month) = (1/12)(1/12) = 1/144.
Use the multiplication rule for two independent events (Rule 3b).
d. A = first stranger shares your birth month; B = second stranger shares your birth month
P(either A or B) = P(A) + P(B) − P(A and B)
P(either A or B) = (1/12) + (1/12) − (1/144) = 23/144 = .16.
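Exercises 7.38(d) and 7.39(d) reach the same probability by two routes, the complement rule and the addition rule. A minimal Python sketch with exact fractions shows that they agree; it assumes, as the solutions do, that the two strangers' birth months are independent and that each month is equally likely.

```python
from fractions import Fraction

p_share = Fraction(1, 12)  # one stranger shares your birth month

# Complement rule (7.38d): 1 - P(neither shares)
via_complement = 1 - (1 - p_share) ** 2
# Addition rule (7.39d): P(A) + P(B) - P(A and B), with A and B independent
via_addition = p_share + p_share - p_share * p_share

print(via_complement, via_addition, round(float(via_addition), 2))
```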
7.40
P(one of each) = 1− P(two of same kind) = 1 − .5002 = .4998. The event that she has one of each
sex is the complement of the event that she has two children of the same sex.
7.41
a. These probabilities were determined by observing the relative frequency. The travel planner
observed a large number of those specific flights and recorded the proportion of those flights that
arrived on time for the ship.
b. Whether Harold’s plane is on time is probably not independent of whether Maude’s plane is on
time. Bad weather conditions cause many flight delays. If there is bad weather and Harold’s
plane is delayed, there is probably a higher chance that Maude’s plane will be delayed.
c. P(both arrive on time) = (.8)(.9) = .72
Use the multiplication rule for two independent events (Rule 3b).
d. Each of the pair can be on time or late, so there are four mutually exclusive options, listed with
their probabilities in the table below. The outcomes that result in one of them cruising alone are
marked with an asterisk. The outcomes are disjoint, so P(one cruises alone) = .18 + .08 = .26.

                         Maude is on time (.9)    Maude is late (.1)
Harold is on time (.8)   (.8)(.9) = .72           (.8)(.1) = .08 *
Harold is late (.2)      (.2)(.9) = .18 *         (.2)(.1) = .02
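The four joint probabilities in part (d) can be generated rather than multiplied by hand. The sketch below assumes independence between the two flights, as part (c) does (part (b) notes this may not hold in practice), and uses exact fractions to avoid rounding noise; the dictionary layout is an illustrative choice.

```python
from fractions import Fraction

harold = {"on time": Fraction(8, 10), "late": Fraction(2, 10)}
maude = {"on time": Fraction(9, 10), "late": Fraction(1, 10)}

# Joint probability of each (Harold, Maude) outcome under independence
joint = {(h, m): ph * pm for h, ph in harold.items() for m, pm in maude.items()}

# One of them cruises alone when exactly one flight is on time
p_alone = sum(p for (h, m), p in joint.items() if h != m)

for outcome, p in joint.items():
    print(outcome, float(p))
print("P(one cruises alone) =", float(p_alone))
```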
7.42
a. The issue involved is whether any of the traits of driving a red pickup truck, being a smoker
and having blond hair are related. It seems reasonable to assume the three traits are independent,
but we can't know for certain. Perhaps people who drive red vehicles or people who drive pickup
trucks are more likely to be risk takers, as are people who smoke.
b. P(red pickup truck and smokes and blond)= (1/50)(.30)(.20) = .0012. Use the multiplication
rule for independent events (Rule 3b extension).
c. Number fitting the description of the criminal = 10,000(.0012) = 12. The value of the
proportion (.0012) was determined in part (b).
d. Probability = 11/12 = .917 that the driver arrested by the police is innocent. There are 12
vehicle owners who fit the description. Assuming the description is accurate, one of these 12 is
guilty and the other eleven are innocent.
e. The answer to part (d) suggests an argument against the prosecutor's reasoning. While the
evidence narrows the possibilities down to only twelve people, eleven of these twelve people are
innocent. Conditional on the given evidence (and only the given evidence), the probability is high
(.917) that the arrested person is innocent.
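The chain of calculations in parts (b) through (d) of Exercise 7.42 can be sketched as follows. The code assumes, as the solution does, that the three traits are independent and that exactly one of the matching vehicle owners is guilty; the variable names are illustrative.

```python
from fractions import Fraction

# Part (b): multiplication rule for three independent traits
p_match = Fraction(1, 50) * Fraction(30, 100) * Fraction(20, 100)

# Part (c): expected number of matches among 10,000 vehicle owners
n_match = 10_000 * p_match

# Part (d): all but one of the matching owners are innocent
p_innocent = (n_match - 1) / n_match

print(float(p_match), n_match, round(float(p_innocent), 3))
```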
7.43
a. P(both are friends of president) = (10/40)(10/40) = 1/16 = .0625.
The probability of picking a friend is the same for each draw because sampling is with
replacement.
b. P(both are friends of president) = (10/40)(9/39) = .0577.
Note that the second probability (9/39) is the conditional probability of a friend given that the first
selection was a friend.
c. P(neither are friends of president) = (30/40)(30/40) = .5625.
The probability of not picking a friend is the same for each draw because sampling is with
replacement.
d. P(neither are friends of president) = (30/40)(29/39) = .5577
Note that the second probability (29/39) is the conditional probability the second selection is not a
friend given the first selection is not a friend.
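The four parts of Exercise 7.43 differ only in the group being counted and in whether sampling is with replacement, so one small function covers them all. This is a sketch; the function name `p_two_draws` and its parameters are assumptions introduced here.

```python
from fractions import Fraction

def p_two_draws(k, n, replace):
    """P(both of two draws come from a group of k people among n)."""
    first = Fraction(k, n)
    # Without replacement, one group member is gone before the second draw
    second = Fraction(k, n) if replace else Fraction(k - 1, n - 1)
    return first * second

both_friends_with = p_two_draws(10, 40, replace=True)      # part (a)
both_friends_without = p_two_draws(10, 40, replace=False)  # part (b)
no_friends_with = p_two_draws(30, 40, replace=True)        # part (c)
no_friends_without = p_two_draws(30, 40, replace=False)    # part (d)

for p in (both_friends_with, both_friends_without,
          no_friends_with, no_friends_without):
    print(p, round(float(p), 4))
```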