MATH-138 In-class Practice Problems
1. Suppose a basketball player scored the following number of points in his last 15
games: 4, 4, 3, 4, 7, 16, 12, 23, 15, 8, 5, 18, 8, 29, 21.
Fill in the following frequency (and relative frequency) distribution.
Bin      Frequency   Relative Frequency
1-6      5           33%
7-12     4           27%
13-18    3           20%
19-24    2           13%
25-30    1           7%
Total    15          100%
2.
a. What percentage of games did the player score 12 points or less? 60%
b. What percentage of games did the player score between 7 and 18 points (inclusive i.e.
7<=points<=18)? 47%
3. If you were to draw a histogram from your frequency distribution (from Question 1),
would it be skewed to the right or left? That is, is this distribution skewed right or left?
Right
4. Calculate the following statistics from the basketball scores: Mean, Median, Quartile 1,
Quartile 3, Minimum, Maximum, Range, IQR, and Standard Deviation.
Mean= 11.8
Median= 8
Standard Deviation = 8.2
Minimum = 3
Q1 = 4
Q3 = 18
Maximum = 29
Range = 26
IQR = 14
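The statistics in Question 4 can be checked with a short script. This is a minimal sketch using Python's standard library; the variable names are illustrative. Quartile conventions differ between textbooks and software, and the "exclusive" method is chosen here only because it reproduces the worksheet's Q1 and Q3 for this data set.

```python
import statistics

# Points scored in the last 15 games (from Question 1)
points = [4, 4, 3, 4, 7, 16, 12, 23, 15, 8, 5, 18, 8, 29, 21]

mean = statistics.mean(points)
median = statistics.median(points)
stdev = statistics.stdev(points)  # sample standard deviation (divides by n - 1)

# Quartiles; the method is an assumption that matches the worksheet's answers
q1, _, q3 = statistics.quantiles(points, n=4, method="exclusive")

data_range = max(points) - min(points)
iqr = q3 - q1

print(f"mean={mean:.1f} median={median} sd={stdev:.1f}")
print(f"min={min(points)} Q1={q1} Q3={q3} max={max(points)}")
print(f"range={data_range} IQR={iqr}")
```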
5. Construct a boxplot for the above-mentioned basketball scores.
6. Use the above-mentioned basketball scores to calculate the z-scores for the 3 lowest-scoring and 3 highest-scoring games.
Lowest 3 scoring games: 3 -> -1.07; 4 -> -0.95; 4 -> -0.95
Highest 3 scoring games: 21 -> 1.12; 23 -> 1.37; 29 -> 2.10
7. A college student received a score of 78 on her Math exam and a score of 86 on her
French exam. The overall results on the French exam had a mean of 82 and a standard
deviation of 8, while the math exam had a mean of 54 and a standard deviation 12. On
which exam did she do relatively better?
Math z-score: 2.0
French z-score: 0.5
She did relatively better on her math exam.
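The comparison in Question 7 comes down to one formula, z = (value - mean) / standard deviation. A minimal sketch (the function name z_score is made up for illustration):

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

math_z = z_score(78, mean=54, sd=12)
french_z = z_score(86, mean=82, sd=8)

# The larger z-score is the relatively better result
better = "math" if math_z > french_z else "French"
print(math_z, french_z, better)
```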
8. Who is relatively taller:
A. A non-basketball playing man who is 75 inches tall (assume non-basketball
playing men have a mean height of 71.5 inches tall and a standard deviation of 2.1
inches).
B. A male basketball player who is 85 inches tall (assume male basketball players
have a mean height of 80 inches and a standard deviation of 3.3)
The non-basketball playing man is relatively taller (his z-score is 1.67 vs. the
basketball player’s z-score of 1.52).
9. Assume verbal SAT scores have a mean of 500 and a std. dev. of 100. What is the z-score of somebody who scores 500 on the verbal portion of the SAT?
0
10. Assume IQ scores have a mean of 100 and a std. dev. of 16. Albert Einstein
reportedly had an IQ of 160. What is the z-score of his IQ?
3.75
Assume IQ scores have a normal model distribution with a mean of 100 and a std. dev. of
16. Use this information to answer Questions 11-14.
11. Find the following percentages:
a. % of people with 84<=IQ<=116 68%
b. % of people with IQ>=100 50%
c. % of people 68<=IQ<=132 95%
d. % of people who are “geniuses” (a genius is someone with an IQ>=132) 2.5%
e. % of people 84<=IQ<=132 81.5%
12. Find the following percentage:
a. % of people with IQ<=125 94.1%
b. % of people with 90<=IQ<=110 46.8%
c. % of people with 110<=IQ<=120 16.0%
13. What value (IQ score) separates the bottom/lower/not-so-smart 10% of the population
from the top/upper/smarter 90%? 79
14. What value (IQ score) separates the top/smarter 35% of the population from the
bottom/not-so-smart 65%? 106
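Questions 12-14 need a normal table or a calculator. As a sketch of one way to check them, Python's statistics.NormalDist gives both areas (cdf) and cutoffs (inv_cdf). The variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)

# Question 12: areas under the curve, as percentages
pct_below_125 = 100 * iq.cdf(125)
pct_90_to_110 = 100 * (iq.cdf(110) - iq.cdf(90))
pct_110_to_120 = 100 * (iq.cdf(120) - iq.cdf(110))

# Questions 13-14: cutoffs for a given area to the LEFT
cutoff_bottom_10 = iq.inv_cdf(0.10)  # bottom 10% lies below this IQ
cutoff_top_35 = iq.inv_cdf(0.65)     # top 35% means 65% lies below

print(round(pct_below_125, 1), round(pct_90_to_110, 1), round(pct_110_to_120, 1))
print(round(cutoff_bottom_10), round(cutoff_top_35))
```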
Suppose the IQ scores of people who smoke copious amounts of pot are normally
distributed with mean=90 and standard deviation=20 (assume IQ scores are continuous
and not necessarily integers). Suppose the IQ scores of people who don’t smoke copious
amounts of pot are normally distributed with mean=100 and standard deviation=16. Use
this information to answer Questions 15-18.
15. Find the following percentages:
a. A former psychological classification of mental retardation labeled someone with an
IQ score between 50 and 68 as a “moron”. What percentage of copious pot smokers are
“morons”? 11.3%
b. What percentage of non-copious pot smokers are NOT “morons”? 97.8%
c. What percentage of copious pot smokers have an IQ score within 1 standard deviation
of the mean? 68%
d. What percentage of non-copious pot smokers have IQ scores within 2 standard
deviations of the mean? 95%
16. Find the following percentages:
a. The percentage of non-copious pot smokers who have an IQ score higher than the
mean of copious pot smokers? 73.4%
b. The percentage of non-copious pot smokers who are geniuses (a “genius” is somebody
with an IQ of 132 or higher)? 2.3% (2.5% is an acceptable answer)
17. What is the IQ score of the following people (round to the nearest integer):
a. A copious pot smoker who is smarter than 90% of all the other copious pot smokers
116
b. A non-copious pot smoker who is dumber than 90% of all the other non-copious pot
smokers 79
18. What IQ score will separate the smarter half of the copious pot smokers from the
dumber half? 90
19. The following data represents movie budgets vs. gross revenue (in million $) for 7
movies. Create a scatterplot to see if r should be calculated. If so, what is r (Triola 2008)?
Budget   62   90   50   35   200   100   90
Gross    65   64   48   57   601   146   47
There appears to be a positive, linear relationship. The value of r, 0.93, confirms
this.
20. The following data represents supermodel heights (inches) vs. weights (pounds) for 9
supermodels. Create a scatterplot to see if r should be calculated. If so, what is r (Triola
2008)?
Height   70    70.5   68    65    70    70    70    70    71
Weight   117   119    105   115   119   127   113   123   115
The scatterplot does not show a linear relationship. Therefore, r does not need to be
calculated. If you were to calculate r, you would see that it equals 0.36.
21.
a. Estimate the regression equation for Question #19 (use budget as your independent,
“x” variable)
Predicted Revenue = (3.47*Budget)-164.14
b. Interpret the slope and intercept in the context of this problem
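The model predicts that for each $1 million increase in the movie budget, revenue
will increase by $3.47 million. The intercept (-$164.14 million) is the predicted
revenue for a movie with a $0 budget, which has no practical meaning here.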
c. How much gross revenue does the regression line predict a movie with a $95 million
budget will make?
$165.5 million
d. What is the residual for the movie in the data that had a $100 million budget?
-$37 million
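The correlation from Question 19 and the regression line from Question 21 can be reproduced from the usual sums of squares. This is a minimal sketch with illustrative variable names. Note that it keeps the slope and intercept unrounded, so the $95 million prediction comes out near $165.7 million rather than the $165.5 million obtained from the rounded equation.

```python
import math

budget = [62, 90, 50, 35, 200, 100, 90]   # x, in million $
gross = [65, 64, 48, 57, 601, 146, 47]    # y, in million $

n = len(budget)
x_bar = sum(budget) / n
y_bar = sum(gross) / n

# Sums of squares and cross-products around the means
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(budget, gross))
sxx = sum((x - x_bar) ** 2 for x in budget)
syy = sum((y - y_bar) ** 2 for y in gross)

r = sxy / math.sqrt(sxx * syy)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

predicted_95 = slope * 95 + intercept
residual_100 = 146 - (slope * 100 + intercept)  # actual minus predicted

print(f"r={r:.2f} slope={slope:.2f} intercept={intercept:.2f}")
print(f"prediction at 95: {predicted_95:.1f}, residual at 100: {residual_100:.0f}")
```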
22. A basketball player makes 80% of her free throws. Suppose she wakes up every
morning and starts shooting free throws. On an “average” morning, on which free throw
will she have her first miss? Perform 20 trials.
For this simulation we will generate random #’s from 1-5. Call any # from 1-4 a
“made shot” and a 5 a “missed shot”. For each trial, generate #’s until she “misses”
a shot. Note: the below trials were gotten using my calculator. If/when you repeat
this experiment, you will get different results (since we are dealing with random
#’s).
Trial 1: 5 (She missed on her first shot)
Trial 2: 1, 3, 3, 4, 1, 2, 5 (She missed on her seventh shot)
Trial 3: 2, 4, 5 (She missed on her third shot)
Trial 4: 2, 2, 1, 5 (She missed on her fourth shot)
Trial 5: 1, 1, 3, 5 (She missed on her fourth shot)
Trial 6: 5 (She missed on her first shot)
Trial 7: 2, 2, 1, 1, 4, 1, 3, 2, 5 (She missed on her ninth shot)
You would want to do at least 30 trials and then get the average # of shots it took
her to miss. In the above example (with 7 trials), on average, she missed on her 29/7
= 4.1, i.e. about her 4th shot.
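The same simulation can be run on a computer with far more trials. This sketch mirrors the setup above (random #'s from 1-5, with 5 meaning a miss); the function name and the seed are arbitrary choices. For comparison, probability theory gives a long-run average of 1/0.20 = 5 shots for the first miss.

```python
import random

def first_miss(rng):
    """Generate #'s from 1-5 until a 5 (a miss); return which shot was missed."""
    shot = 1
    while rng.randint(1, 5) != 5:  # 1-4 is a made shot, so keep shooting
        shot += 1
    return shot

rng = random.Random(138)  # arbitrary seed so the run can be repeated
trials = [first_miss(rng) for _ in range(10_000)]
average_first_miss = sum(trials) / len(trials)

print(f"average shot of first miss: {average_first_miss:.2f}")
```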
23. You are going to take a quiz with 5 multiple choice questions. You estimate that you
have an 80% chance of getting any question right. What are your chances of getting them
all right? Perform 20 trials.
For this simulation we will generate random #’s from 1-5. Call any # from 1-4 a
“correctly answered question” and a 5 an “incorrectly answered question”. For each
trial, generate 5 #’s (corresponding to the 5 multiple choice questions). Note: the
below trials were gotten using my calculator. If/when you repeat this experiment,
you will get different results (since we are dealing with random #’s).
Trial 1: 5, 4, 2, 5, 4 (not all correct)
Trial 2: 2, 2, 5, 5, 1 (not all correct)
Trial 3: 1, 4, 5, 4, 2 (not all correct)
Trial 4: 1, 5, 1, 2, 1 (not all correct)
Trial 5: 2, 5, 2, 3, 4 (not all correct)
Trial 6: 1, 1, 1, 4, 4 (ALL correct)
Trial 7: 5, 2, 1, 3, 5 (not all correct)
Trial 8: 4, 3, 4, 5, 4 (not all correct)
Trial 9: 4, 3, 2, 2, 4 (ALL correct)
Trial 10: 5, 1, 3, 3, 2 (not all correct)
You would want to do at least 30 trials and then get the percentage of times that all
questions were answered correctly. In the above example (with 10 trials), all
questions were answered correctly 2/10 = 20% of the time.
24. You are going to take a quiz with 10 multiple choice questions, where each question
has 4 answer choices. You have not studied and you need to guess on each question.
What are your chances of passing the quiz (i.e. getting at least 6 out of 10 questions
correct on the quiz)? Perform 20 trials.
For this simulation we will generate random #’s from 1-4. Call any # from 1-3 an
“incorrectly answered question” and a 4 a “correctly answered question”. For each
trial, generate 10 #’s (corresponding to the 10 multiple choice questions). Note: the
below trials were gotten using my calculator. If/when you repeat this experiment,
you will get different results (since we are dealing with random #’s).
Trial 1: 4, 4, 2, 2, 1, 3, 3, 4, 4, 2 (4 correct questions . . . FAIL)
Trial 2: 4, 3, 2, 2, 4, 4, 2, 3, 2, 2 (3 correct questions . . . FAIL)
Trial 3: 1, 1, 2, 3, 1, 3, 1, 3, 2, 1 (0 correct questions . . . FAIL)
Trial 4: 4, 4, 2, 3, 1, 1, 3, 1, 3, 4 (3 correct questions . . . FAIL)
Trial 5: 4, 2, 1, 2, 3, 2, 4, 3, 1, 2 (2 correct questions . . . FAIL)
You would want to do at least 30 trials and then get the percentage of times that at
least 6 questions were answered correctly. In the above example (with 5 trials), the
student passed the quiz 0/5 = 0% of the time.
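With only 5 trials it is no surprise that the student never passed. As a rough check, the sketch below repeats the Question 24 simulation many times and compares it with the exact binomial calculation (n=10, p=0.25). The function names, seed, and number of trials are illustrative choices.

```python
import math
import random

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Exact chance of at least 6 correct guesses out of 10
exact_pass = sum(binom_pmf(k, 10, 0.25) for k in range(6, 11))

def passes_quiz(rng):
    """Generate 10 #'s from 1-4; a 4 is a correct answer. Pass with 6 or more."""
    correct = sum(1 for _ in range(10) if rng.randint(1, 4) == 4)
    return correct >= 6

rng = random.Random(138)  # arbitrary seed
n_trials = 20_000
simulated_pass = sum(passes_quiz(rng) for _ in range(n_trials)) / n_trials

print(f"exact: {exact_pass:.4f}  simulated: {simulated_pass:.4f}")
```

Both numbers should come out at roughly 2%, so guessing is a poor strategy.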
25. Suppose we wanted to study how many credit hours HCC credit students are taking
this semester. How would we get a simple random sample (SRS) of HCC students?
Stratified sample? Cluster sample? Systematic sample? Convenience sample? Census?
SRS: Assign every student a unique #, and generate some random #’s.
Strat: Divide students into day and evening, and do a simple random sample on
each group.
Clust: Go to a randomly chosen building, and do a census of all students in the
building.
Sys: List students alphabetically, and sample every third student on list.
Conv: Offer free drinks/food to HCC students and then ask them to tell you how
many credit hours they are taking.
Census: Consult HCC database and get information on every student.
26. Which of the following (if any) are NOT valid probability values?
a. 0.40
b. -0.20 (NOT valid: a probability cannot be negative)
c. 1.00
d. 0.99999
27. Which of the following (if any) are NOT valid probability values?
a. 0.00
b. 0.67
c. 0.80
d. 1.14 (NOT valid: a probability cannot be greater than 1)
Suppose I want to perform the random procedure of rolling a fair die (rolling it once).
Use this procedure to answer Questions 28-34.
28. What is the sample space, S, for the above-mentioned procedure? S={1,2,3,4,5,6}
Suppose the following events:
a. Rolling a 1
b. Rolling anything but 6
c. Rolling something less than 2
d. Rolling something between 2 and 5, inclusive (i.e. including 2 and 5)
29. What are the probabilities for the four events listed above?
a. 1/6
b. 5/6
c. 1/6
d. 4/6
30. What are the complements of the four events listed above?
a. Rolling something between 2 and 6, inclusive
b. { 6 }
c. { 2, 3, 4, 5, 6 }
d. Rolling a 1 or a 6
31. What are the probabilities of the four complements?
a. 5/6
b. 1/6
c. 5/6
d. 2/6
32. State whether each of the following pairs of events (from above) are mutually
exclusive or not:
a. Events a and b no
b. Events a and c no
c. Events b and d no
d. Events c and d yes
33. For the above die-rolling example, give an example of an impossible event.
Rolling a seven
34. For the above die-rolling example, give an example of a certain event.
Rolling a # between 1 and 6, inclusive
The table below describes a standard deck of cards. Use the table to answer Questions 35-36. Note that I am counting aces as face cards.
                 Clubs (black)   Spades (black)   Hearts (red)   Diamonds (red)
Face Cards       4               4                4              4
Non-Face Cards   9               9                9              9
35. State whether each of the following pairs of events are mutually exclusive or not:
a. Black cards and red cards Yes
b. Black cards and diamonds Yes
c. Black cards and spades No
d. Diamonds and face cards No
e. Face cards and non-face cards Yes
f. Non-face cards and red cards No
36. Suppose I want to perform the random procedure of picking a card out of a standard
deck of cards. If ONE card is drawn, what are the following probabilities (please answer
using un-simplified fractions):
a. P(club) 13/52
b. P(not a heart) 39/52
c. P(face card) 16/52
d. P(red) 26/52
e. P(not a non-face card) 16/52
f. P(not black) 26/52
g. P(black or red) 52/52
37. One tie – dotted, striped, or solid – is selected at random, and then a shirt – white or
brown – is selected at random. What is the probability that a dotted tie AND white shirt
are selected?
a. 1/6 (correct answer: (1/3)*(1/2) = 1/6)
b. 1/2
c. 1/3
d. 3
e. None of these
38. What is the probability that you roll a die 4 times and get zero “6’s”? 0.482
39. What is the probability that you roll a die 4 times and get at least one “6”? 0.518
40. What is the probability that someone who has 3 children has exactly one girl (assume
no twins, triplets, or hermaphrodites)? 3/8
41. What is the probability that you flip a coin twice and get 2 tails? 1/4
42. What is the probability that you flip a coin 10 times and the seventh flip is heads? 1/2
The table below describes a standard deck of cards. Use the table to answer Question 43.
Note that I am counting aces as face cards.
                 Clubs (black)   Spades (black)   Hearts (red)   Diamonds (red)
Face Cards       4               4                4              4
Non-Face Cards   9               9                9              9
43. Suppose I want to perform the random procedure of picking a card out of a standard
deck of cards. If ONE card is drawn, what are the following probabilities (please answer
using un-simplified fractions):
a. P(face card and black) 8/52
b. P(red or non-face card) 44/52
c. P(face card or not black) 34/52
d. P(black and red) 0
e. P(club or face card) 25/52
f. Given the card is black, what is P(club)? 13/26
g. Given the card is black, what is P(face card)? 8/26
h. Given the card is a non-face card, what is P(face card)? 0
Use the data below to answer Question 44. This synthetic sample data (i.e. I made it up)
shows 1,000 people who either smoked or didn’t smoke, and who either died of lung
cancer or some other cause of death. Suppose I randomly sample one person from this
data.
                         Smoker   Non-Smoker
Lung Cancer Death        50       80
Non-Lung Cancer Death    150      720
44.
a. What is P(Smoker)? 200/1000
b. What is P(Lung Cancer Death)? 130/1000
c. What is P(Smoker given Lung Cancer Death)? 50/130
d. What is P(Non-Lung Cancer Death)? 870/1000
e. What is P(Non-Smoker)? 800/1000
f. What is P(Non-Lung Cancer Death given Non-Smoker)? 720/800
g. Are smoking and lung cancer death independent? No (P(Lung Cancer Death given Smoker) = 50/200 = 0.25, but P(Lung Cancer Death) = 130/1000 = 0.13)
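The answers to Question 44 all come from counting cells in the table. A minimal sketch, with made-up labels for the rows and columns, that uses exact fractions and checks independence by comparing P(Smoker and Lung Cancer Death) with P(Smoker)*P(Lung Cancer Death):

```python
from fractions import Fraction

# Counts from the synthetic table: (smoking status, cause of death) -> people
counts = {
    ("smoker", "lung"): 50,
    ("nonsmoker", "lung"): 80,
    ("smoker", "other"): 150,
    ("nonsmoker", "other"): 720,
}
total = sum(counts.values())

def prob(smoke=None, death=None):
    """Probability of the cells matching the given labels (None matches any)."""
    n = sum(c for (s, d), c in counts.items()
            if (smoke is None or s == smoke) and (death is None or d == death))
    return Fraction(n, total)

p_smoker = prob(smoke="smoker")
p_lung = prob(death="lung")
p_smoker_given_lung = prob("smoker", "lung") / p_lung

# Independent only if the joint probability equals the product
independent = prob("smoker", "lung") == p_smoker * p_lung

print(p_smoker, p_lung, p_smoker_given_lung, independent)
```

Fractions print in lowest terms, so 200/1000 shows as 1/5 and 50/130 as 5/13.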
45. Given P(A)=0.25, P(B)=0.60, and P(A and B)=0.10, find:
a. P(A or B) 0.75
b. P(B|A) 0.40
c. Are A and B independent (yes or no)? No
d. Are A and B mutually exclusive (yes or no)? No
46. One tie – dotted, striped, or solid – is selected at random, and then a shirt – white or
brown – is selected at random. What is the probability that a striped tie OR brown shirt is
selected?
a. 1/2
b. 2/3 (correct answer: 1/3 + 1/2 - 1/6 = 2/3)
c. 1/6
d. 5/6
e. None of these
47. Suppose TWO fair dice are rolled. Find the following probabilities:
a. P(both dice are 1) 1/36
b. P(the sum of the dice is 6) 5/36
c. P(at least one of the dice is 4) 11/36
d. P(only ONE of the dice is 4) 10/36
48. Suppose TWO fair dice are rolled. Let E be the event of getting a “triple” (i.e. one die
is three times the other die) and let F be the event of getting a “sum of 6” (i.e. the two
dice add up to 6). Which one of the following statements is true: P(E)>P(F), P(E)=P(F),
or P(E)<P(F)? P(E)<P(F), since P(E)=4/36 and P(F)=5/36
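Two-dice questions such as 47 and 48 can be checked by listing all 36 equally likely outcomes and counting. A minimal sketch (the helper name prob is illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Fraction of outcomes for which event(roll) is True."""
    return Fraction(sum(1 for roll in rolls if event(roll)), len(rolls))

p_both_ones = prob(lambda r: r == (1, 1))
p_sum_six = prob(lambda r: sum(r) == 6)                 # event F in Question 48
p_at_least_one_four = prob(lambda r: 4 in r)
p_exactly_one_four = prob(lambda r: r.count(4) == 1)
p_triple = prob(lambda r: r[0] == 3 * r[1] or r[1] == 3 * r[0])  # event E

print(p_both_ones, p_sum_six, p_at_least_one_four, p_exactly_one_four, p_triple)
```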
49. Suppose a dresser drawer contains 20 individual socks where each sock is either
white or black (there is at least one of each color). Suppose you are blindfolded and you
start taking out socks from the drawer one by one. What is the MINIMUM number of
socks that you need to take out in order to GUARANTEE that you will have some
matching socks (i.e. 2 black socks OR 2 white socks). Three
50. For parts a-c below, state whether the pairs of events (events A and B) are dependent
or independent:
a. P(A)=0.60, P(B)=0.40, P(A and B)=0.24 Ind.
b. P(A)=0.90, P(B)=0.30, P(A and B)=0.18 Dep.
c. P(A)=0.50, P(B)=0.70, P(A and B)=0.25 Dep.
51. Suppose a jar contains 40 red marbles, 40 blue marbles and 20 green marbles (100
marbles total). If TWO marbles are drawn WITHOUT REPLACEMENT from the jar
(that is, one marble is drawn and NOT put back into the jar, and then another marble is
drawn), what are the following probabilities?
a. P(both are green) (i.e. the first marble is green AND the second marble is green)
(20/100)*(19/99)
b. P(neither are green) (80/100)*(79/99)
c. P(first marble is red) 40/100
d. P(first marble is red, second marble is blue) (40/100)*(40/99)
e. P(both marbles are neither red nor green) (40/100)*(39/99)
f. P(first marble is red, second marble is green) (40/100)*(20/99)
g. Given the first marble is red, what is P(second marble is red)? 39/99
h. Given the first marble is green, what is P(second marble is blue)? 40/99
52. If TWO cards are drawn WITH REPLACEMENT from a standard deck of cards (that
is, the first card is put back into the deck (and the deck is shuffled) before the second card
is drawn), what are the following probabilities?
a. P(both are black) (i.e. the first card is black AND the second card is black)
(26/52)*(26/52)
b. P(first card drawn is red, second card drawn is black) (26/52)*(26/52)
c. P(both cards are neither red nor face cards) (18/52)*(18/52)
d. P(first card drawn is a red face card, second card drawn is red) (8/52)*(26/52)
e. Given the first card drawn is a red card, what is P(second card is red)? 26/52
f. Given the first card drawn is a club face card, what is P(second card is a diamond face
card)? 4/52
53. Give the probabilities for Questions #52a-f assuming the two cards are drawn
WITHOUT REPLACEMENT (that is, one card is drawn and NOT put back into the
deck, and then another card is drawn).
a. (26/52)*(25/51)
b. (26/52)*(26/51)
c. (18/52)*(17/51)
d. (8/52)*(25/51)
e. 25/51
f. 4/51
54. In Question #52, does the probability of the second draw depend on the first draw?
No
55. In Question #53, does the probability of the second draw depend on the first draw?
Yes
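Questions 52-55 contrast drawing with and without replacement. The sketch below works part (a), P(both are black), both ways; the function name is made up for illustration. Without replacement the second factor changes because one black card is already gone.

```python
from fractions import Fraction

def both_black(replace):
    """P(first card black AND second card black) for a 52-card deck."""
    first = Fraction(26, 52)
    if replace:
        second = Fraction(26, 52)   # deck is restored, so nothing changes
    else:
        second = Fraction(25, 51)   # one black card and one card overall removed
    return first * second

with_replacement = both_black(replace=True)
without_replacement = both_black(replace=False)

print(with_replacement, without_replacement)
```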
56. A large department store has 500 employees. There are 350 females and 200 of them
are under the age of 25. There are 75 males under 25. If one employee is randomly
selected, what are the following probabilities:
a. P(under 25 or female) 425/500
b. P(over 25 or female) 425/500
c. P(male or over 25) 300/500
57. There are 6 green hats, 4 blue hats and 3 red hats in a box. You randomly select one
hat. What are the following probabilities:
a. P(blue or red) 7/13
b. P(not green) 7/13
c. P(green or blue or red) 1
58. In a class of 50 students, 18 take chorus, 26 take band, and 2 take both. Answer the
following questions:
a. How MANY are only in chorus? 16
b. How many are only in band? 24
c. How many take neither? 8
d. How many take either band or chorus (but NOT both)? 40
59. Does the table below represent a valid probability distribution?
x      -3     -1.56   2      5.7    10,002
P(x)   0.20   0.10    0.05   0.56   0.09
Yes
60. Does the table below represent a valid probability distribution?
X      4       6      8      9
P(x)   -0.50   0.60   0.50   0.40
No
61. Does the table below represent a valid probability distribution?
X      0      1
P(x)   0.45   0.65
No
Suppose a random procedure that yields the following outcomes and probabilities. Use
this table to answer Questions 62-63.
X      80     100    150    200    250
P(x)   0.24   0.22   0.31   0.18   0.05
62. Find the mean (expected value) and standard deviation of this distribution.
Mean=136.2; Std. Dev.=49.9
63. What are the following probabilities:
a. P(X<=150) 0.77
b. P(X=200) 0.18
c. P(X<70) 0
d. P(X=100 or X=200) 0.40
e. P(X=100 and X=250) 0
f. P(X<200 or X>80) 1
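The mean and standard deviation in Question 62 follow from mean = sum of x*P(x) and variance = sum of (x - mean)^2 * P(x). A minimal sketch, with an illustrative function name, that also rejects tables that are not valid probability distributions:

```python
import math

def mean_and_sd(values, probs):
    """Expected value and standard deviation of a discrete distribution."""
    if any(p < 0 or p > 1 for p in probs) or abs(sum(probs) - 1) > 1e-9:
        raise ValueError("not a valid probability distribution")
    mu = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, math.sqrt(variance)

values = [80, 100, 150, 200, 250]
probs = [0.24, 0.22, 0.31, 0.18, 0.05]

mu, sd = mean_and_sd(values, probs)
p_at_most_150 = sum(p for x, p in zip(values, probs) if x <= 150)

print(f"mean={mu:.1f} sd={sd:.1f} P(X<=150)={p_at_most_150:.2f}")
```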
64. Suppose I have a distribution where one third of the time the value equals -1, one
third of the time the value equals 0, and one third of the time the value equals 2. Is this a
valid probability distribution?
Yes
65. Suppose I have a distribution where one half of the time the value equals 0.4, and two
thirds of the time the value equals 0.6. Is this a valid probability distribution?
No
Use the following table to answer Questions 66-67:
X      0     1     2     3     10
P(x)   0.0   0.3   0.3   0.3   0.1
66. Why is this probability distribution valid? The probabilities sum to 1, and each
individual probability is between 0 and 1, inclusive.
67. Find the expected value and standard deviation of this distribution.
Mean=2.8; Std. dev.=2.5
68. A carnival game offers a $100 cash prize for anyone who can break a balloon by
throwing a dart at it. It costs $5 to play. You estimate that you have a 10% chance of
hitting the balloon on any throw. Find your expected winnings.
$5
69. (De Veaux et al. 2009) A commuter must pass through 5 traffic lights on her way to
work and will have to stop at each one that is red. She estimates the probability model for
the number of red lights she hits as shown below.
X=# of red   0      1      2      3      4      5
P(x)         0.05   0.25   0.35   0.15   0.15   0.05
How many red lights should she expect to hit each day? 2.25 red lights
70. An insurance policy has the following pay offs. If you die, your survivor gets
$10,000. If you become disabled, you get $5000. Otherwise, you receive nothing. The
policy costs $50 a year. Based on past data, the probability a person dies is .01 and the
probability the person becomes disabled is .02. Find the expected value from your point
of view.
$150
71. (De Veaux et al. 2009) You roll a die. If it comes up 6, you win $100. If not, you get
to roll again. If you get a 6 the second time, you win $50. If not, you lose. Create the
probability model and find the expected amount you’ll win.
$23.61
72. A game costs $5 to play. You draw a card from a deck of cards. If you draw the ace
of hearts, you win $100. For any other ace, you get $10 and for any other heart you get
$5. If you draw anything else, you lose. Find the average winnings or losses for this
game.
-$1.35
73. Suppose you visit Las Vegas and decide to play roulette. If you bet $5 that the
outcome is a number between 1-12 (including 1 and 12), you have a 26/38 probability of
losing your $5 bet, and you have a 12/38 probability of making a net gain of $10
(equaling the $15 prize minus your $5 bet). Only considering NET winnings/losses, what
is your expected value of betting on a number between 1-12 (round to the nearest cent)?
-$0.26
74. A man buys a racehorse for $20,000 and enters it in two races. He plans to sell the
horse afterwards hoping to make a profit. If the horse wins both races, it will sell for
$100,000. If it wins only one race, it will be worth $50,000. If it loses both races, it will
be worth $10,000. The man believes there is a 20% chance that the horse will win the first race
and a 30% chance that it will win the second race. Assuming the two races are
independent events, find the man’s expected profit.
$10,600
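The racehorse answer combines the multiplication rule for independent events with an expected value. A minimal sketch of the arithmetic (variable names are illustrative):

```python
p_first, p_second = 0.20, 0.30   # chances of winning race 1 and race 2

# Independent races: multiply the probabilities along each path
p_both = p_first * p_second
p_one = p_first * (1 - p_second) + (1 - p_first) * p_second
p_none = (1 - p_first) * (1 - p_second)

expected_sale = 100_000 * p_both + 50_000 * p_one + 10_000 * p_none
expected_profit = expected_sale - 20_000   # subtract the purchase price

print(f"P(both)={p_both:.2f} P(one)={p_one:.2f} P(none)={p_none:.2f}")
print(f"expected profit = ${expected_profit:,.0f}")
```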
75. Suppose the following binomial probability situation: A certain statistics class has 15
students, and the probability that a given student will pass the class is 0.8. Find the
following probabilities:
a. P(everybody passes) 0.035
b. P(at least 10 students pass) 0.939
c. P(4 students fail) 0.188
d. P(11 or 12 students pass) 0.438
e. P(at most 2 students fail) 0.398
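The binomial answers in Question 75 can be reproduced with the formula P(X=k) = C(n,k) * p^k * (1-p)^(n-k). This sketch uses illustrative helper names; the key step is translating each phrase into a count of students who pass (for example, "4 students fail" means 11 pass).

```python
import math

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_range(lo, hi, n, p):
    """P(lo <= X <= hi), both ends included."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

n, p = 15, 0.8   # 15 students, each passes with probability 0.8

p_all = binom_pmf(15, n, p)
p_at_least_10 = binom_range(10, 15, n, p)
p_4_fail = binom_pmf(11, n, p)               # 4 fail means 11 pass
p_11_or_12 = binom_range(11, 12, n, p)
p_at_most_2_fail = binom_range(13, 15, n, p)  # 0, 1, or 2 fail means 13-15 pass

for value in (p_all, p_at_least_10, p_4_fail, p_11_or_12, p_at_most_2_fail):
    print(round(value, 3))
```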
76. Suppose the following binomial probability situation: You draw a card out of a
shuffled deck of cards 10 times (replacing the card after each draw and re-shuffling) and
count the number of red cards you draw (note there are 26 red cards and 52 cards total).
Find the following probabilities (3 decimal places):
a. P(6 red cards) 0.205
b. P(3 black cards) 0.117
c. P(at most 5 red cards) 0.623
d. P(more than 7 black cards) 0.055
77. A moving target at a police academy target range can be hit 80% of the time by a
particular individual. Suppose the person takes three shots at the target. What is the
probability that:
a. There are exactly two hits? 0.384
b. There are hits on all three? 0.512
c. There is only one hit? 0.096
d. There are misses on all three? 0.008
e. There is at least one hit? 0.992
78. A quality control inspector has drawn a sample of 13 light bulbs from a recent
production lot. If the number of defective bulbs is 2 or less, the lot passes inspection.
Suppose 10% of the bulbs in the lot are defective. What is the probability that the lot will
pass inspection? 0.866
79. Suppose the following binomial probability situation: Suppose Dr. Coldren was a
single male. Further suppose that there was a week in the distant past (Sunday-Saturday)
where he asked a different supermodel for a date (for that evening) each day of the week.
Suppose the probability that any given supermodel said “yes” was 0.20. Assume a
supermodel agreeing to a date was a “success”, and not agreeing to a date was a “failure”
(meaning I stayed home alone for the evening). Find the following probabilities:
a. P(Dr. Coldren stayed home alone all week) 0.210
b. P(Dr. Coldren stayed home alone at least one evening) 0.9999872
c. P(Dr. Coldren had a date with a supermodel every evening of the week) 0.0000128
d. P(Dr. Coldren was home alone an odd number of evenings) 0.514
80. Public health statistics indicate that 26.4% of American adults smoke. Describe the
sampling distribution for a sample of 50 adults.
Under certain assumptions, the sampling distribution of the sample proportions will
be normally distributed with mean p=0.264 and std. dev. = sqrt((pq/n)) = sqrt(
(0.264*0.736) / 50 ) = 0.062.
81. Assume that 30% of the students at a certain community college wear contact lenses
and we randomly pick 100 students to see what percentage of them wear contacts.
Describe this sampling distribution. What is the probability that more than one third of
them wear contacts?
Under certain assumptions, the sampling distribution of the sample proportions will
be normally distributed with mean p=0.30 and std. dev. = sqrt((pq/n)) = sqrt(
(0.30*0.70) / 100 ) = 0.0458.
Probability = 0.233
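The Question 81 answer can be checked by treating the sample proportion as approximately normal with mean p and standard deviation sqrt(pq/n). A minimal sketch, assuming the normal approximation is appropriate here (variable names are illustrative):

```python
import math
from statistics import NormalDist

p, n = 0.30, 100
se = math.sqrt(p * (1 - p) / n)          # std. dev. of the sample proportion

p_hat = NormalDist(mu=p, sigma=se)       # approximate sampling distribution
prob_more_than_third = 1 - p_hat.cdf(1 / 3)

print(f"std. dev. = {se:.4f}, P(p-hat > 1/3) = {prob_more_than_third:.3f}")
```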
82. It is believed that 4% of children have a gene that may be linked to juvenile diabetes.
Researchers hoping to track 20 of these children for several years test 732 newborns for
the presence of this gene. What’s the probability they find enough subjects for the study?
0.96
83. A restaurateur anticipates serving 180 people on a Friday evening and believes that
about 20% of the patrons will order the steak special. How many of those specials should
he plan on ordering in order to be 95% sure (i.e. only a 5% chance of running out of
food) of having enough steaks on hand to meet customer demand?
45 Steaks
84. A college’s data about the incoming freshmen indicates that the mean of their high
school GPAs is 3.4 with a standard deviation of 0.35. The distribution is normal. The
students are randomly assigned to freshmen writing seminars in groups of 25.
a. Find the probability a given student has a GPA greater than 3.5. 0.39
b. Find the probability that one of the groups has an average GPA greater than 3.5. 0.08
85. Ithaca, New York gets an average of 35.4” of rain each year with a standard deviation
of 4.2”. Assume the Normal model applies to their yearly rainfall.
a. What percentage of years does Ithaca get more than 40” of rainfall? About 14% (probability 0.14)
b. What rainfall amount separates the “driest” 20% of years from the “wettest” 80%? 31.9
Inches
c. Suppose you live in Ithaca for four consecutive years. What is the probability that
those four years average less than 30” of rain? 0.005
86. Suppose the weights of men are normally distributed with a population mean of 180
pounds and a population standard deviation of 20 pounds. Suppose a crew of 10 men are
about to board a fishing boat. Further suppose the boat can safely carry 10-person crews
weighing less than 1900 pounds total (i.e. safely carry 10-person crews where the average
crew member weighs less than 190 pounds). Suppose the above-mentioned 10 male crew
members were randomly sampled from the overall population of men. Use this
information to answer the following:
a. What is the probability that any one of the crew members weighs more than 190
pounds? 0.31
b. What is the probability that the entire crew weighs more than 1,900 pounds – and
hence a catastrophe is likely to occur? Hint: In other words, what is the probability that
the AVERAGE weight of the crew members is more than 190 pounds? 0.06
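Question 86 shows why averages vary less than individuals: the crew's average weight has standard deviation 20/sqrt(10), not 20. A minimal sketch of both parts (variable names are illustrative):

```python
import math
from statistics import NormalDist

mu, sigma, crew_size = 180, 20, 10

one_man = NormalDist(mu, sigma)
crew_average = NormalDist(mu, sigma / math.sqrt(crew_size))  # smaller spread

p_one_over_190 = 1 - one_man.cdf(190)
p_average_over_190 = 1 - crew_average.cdf(190)

print(f"one man: {p_one_over_190:.2f}, crew average: {p_average_over_190:.2f}")
```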
87. A poll found that 50% of a random sample of 1012 American adults said that they
believe in ghosts.
a. Find the margin of error for this poll if we want 90% confidence in our estimate of
American adults who believe in ghosts. E=0.02585
b. Explain what a “90% confidence interval” means and find the interval. We can be
90% confident that the true population proportion (i.e. the percentage of American
adults who believe in ghosts) is contained in the following CI: (0.474,0.526)
c. If we want to be 99% confident, will the margin of error be larger or smaller? Larger
d. Find that margin of error. E=0.04049
e. In general, will smaller margins of error involve greater or less confidence in the
interval? Less
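The margins of error in Question 87 come from E = z* * sqrt(p-hat * q-hat / n), where z* depends on the confidence level. A minimal sketch, with an illustrative function name, that looks up z* instead of reading it from a table:

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence):
    """Margin of error for a one-proportion z-interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 0.50, 1012

e90 = margin_of_error(p_hat, n, 0.90)
e99 = margin_of_error(p_hat, n, 0.99)
interval_90 = (p_hat - e90, p_hat + e90)

print(f"E(90%)={e90:.5f}  E(99%)={e99:.5f}")
print(f"90% CI: ({interval_90[0]:.3f}, {interval_90[1]:.3f})")
```

More confidence needs a larger z*, which is why the 99% margin is wider than the 90% margin.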
88. (De Veaux et al. 2009) Direct mail advertisers send solicitations to thousands of
potential customers in the hope that some will buy the company’s product. The response
rate usually is quite low. Suppose a company wants to test the response to a new flyer and
sends it to 1000 people randomly selected from their mailing list of over 200,000 people.
They get 123 orders from the recipients.
a. Create a 90% confidence interval for the percentage of people the company contacts
who may buy something. (10.6%, 14.0%)
b. Explain what the interval means. We can be 90% confident that the true proportion
of the company’s 200,000 customers who will actually purchase an item is in the
above interval.
c. The company must decide whether to now do a mass mailing. The mailing won’t be
cost effective unless it produces at least a 5% return. What does your confidence interval
suggest? They should do the mass mailing.
89. A national health organization warns that 30% of the middle school students
nationwide have been drunk. Concerned, a local health agency randomly and
anonymously surveys 110 of the 1212 middle school students in its city. Only 21 of them
reported having been drunk.
a. What proportion of the sample reported having been drunk? 21/110 = 0.191
b. Does this mean that this city’s youth are not drinking as much as the national data
would indicate? Not necessarily – we need to build a confidence interval.
c. Create a 95% confidence interval for the proportion of the city’s middle school
students who have been drunk. (11.7%, 26.4%)
d. Is there any reason to believe that the national level of 30% is not true of the middle
school students in this city? Yes – even the upper bound of the above confidence
interval is below 30%.
90. In preparing a report on the economy, we need to estimate the percentage of
businesses that plan to hire additional employees in the next 60 days.
a. How many randomly selected employers must we contact in order to create an estimate
in which we are 98% confident with a margin of error of 5%? 542
b. Suppose we want to reduce the margin of error to 3%. What sample size will suffice?
1504
c. Why might it not be worth the effort to try to get an interval with a margin of error of
only 1%? Because it would take a sample size of 13,530. That is a mighty big (i.e.
expensive) number.
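The three sample sizes above can be reproduced with a short sketch. It assumes the usual conservative formula n = (z*)^2 (0.5)(0.5) / ME^2, rounded up to the next whole employer; the function name and the p_guess parameter are our own labels.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin` (hypothetical helper)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # p_guess = 0.5 is the worst case, so the result is conservative.
    return ceil(z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2)


sizes = [sample_size_for_proportion(m, 0.98) for m in (0.05, 0.03, 0.01)]
print(sizes)  # [542, 1504, 13530]
```

Cutting the margin of error by a factor of 5 (from 5% to 1%) multiplies the required sample size by about 25, which is why part c is so expensive.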
91. Write the null and the alternative hypotheses for the following:
a. In the 1950’s only about 40% of high school graduates went on to college. Has the
percentage changed?
H0: p=0.4
HA: p≠0.4
b. 20% of the cars of a certain model have needed costly transmission work after being
driven between 50,000 and 100,000 miles. The manufacturer hopes that the redesign of
the transmission has solved this problem.
H0: p=0.20
HA: p<0.20
c. We field test a new flavor of soft drink, planning to market it only if we are sure that at
least 60% of the people like the flavor.
H0: p=0.60
HA: p>0.60
d. The drug Lipitor is meant to lower cholesterol. Is there evidence to support the claim
that over 1.9% of the users experience flu-like symptoms as a side effect?
H0: p=0.019
HA: p>0.019
e. According to the US Department of Health, 16.3% of Americans did not have health
insurance coverage in 1998. A politician claims that this percentage has decreased since
1998.
H0: p=0.163
HA: p<0.163
f. During the past forty years, the monthly rate of return for a particular item has been 4.2
percent. A store analyst claims that it is different.
H0: p=0.042
HA: p≠0.042
92. In the 1980’s it was generally believed that autism affected about 6% of the nation’s
children. Some people believe that the increase in the number of chemicals in the
environment has led to an increase in the incidence of autism. A recent study examined
384 children and found that 46 of them showed signs of some form of autism. Is there
strong evidence that the level of autism has increased (Let alpha=0.05)? Write the
hypotheses, check the assumptions, draw the curve, find the pertinent statistics and
critical values, find the p value, state your conclusion, etc.
H0: p=0.06
HA: p>0.06
Test statistic = z = 4.93
P-value = 0.0000004 (i.e. a really small #)
Conclusion: Since the P-value is less than alpha, we can reject the hypothesis that
the true population proportion of kids with autism is 6%. The statistical evidence
indicates that the true rate of autism has probably increased.
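The test statistic and P-value above can be checked with a minimal sketch of the one-proportion z-test. It assumes the standard error is computed from the null value p0; the function name and the alternative argument are our own conventions.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative):
    """alternative is 'greater', 'less', or 'two-sided' (hypothetical helper)."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # SE uses the null value p0
    lower_tail = NormalDist().cdf(z)
    if alternative == "greater":
        p_value = 1 - lower_tail
    elif alternative == "less":
        p_value = lower_tail
    else:
        p_value = 2 * min(lower_tail, 1 - lower_tail)
    return z, p_value


z, p_value = one_proportion_z_test(46, 384, 0.06, "greater")
print(round(z, 2), p_value < 0.05)  # 4.93 True
```

Calling the same function with (138, 240, 0.50, "greater") or (28, 150, 0.249, "less") reproduces the home field and exercise problems.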
93. During the 2000 season, the home team won 138 of the 240 regular season games. Is
this strong evidence of a home field advantage? (Let alpha=0.05)
H0: p=0.50
HA: p>0.50
Test statistic = z = 2.32
P-value = 0.01
Conclusion: Since the P-value is less than alpha, we can reject the hypothesis that
there is no home field advantage. Therefore, the statistical evidence suggests that
there IS a home field advantage.
94. A personal trainer wanted to know whether the proportion of males 30 to 44 years old
who do not exercise has decreased from 24.9%, the proportion in 1998. He randomly
selects 150 males in that age group and finds that 28 of them do not exercise. Is there
significant evidence that the proportion of males in this age group that do not exercise has
decreased (Let alpha=0.05)?
H0: p=0.249
HA: p<0.249
Test statistic = z = -1.77
P-value = 0.039
Conclusion: Since the P-value is smaller than alpha, we reject the hypothesis that
the true percentage of men who don’t exercise is 24.9%. Thus, the statistical
evidence suggests that men of this age group are exercising more (i.e. fewer are
NEVER exercising).
95. A survey of 430 randomly selected adults found that 21% of the 222 men and 18% of
the 208 women had purchased books online. Is there evidence that men are more likely to
make online purchases of books? Use an alpha level of 0.05.
H0: p1=p2
HA: p1>p2
Note that in this problem p1 refers to the men.
Test statistic = z = 0.88
P-value = 0.19
Conclusion: Since the P-value is larger than alpha, we cannot reject the hypothesis
that the true percentages of men and women who purchase books online are equal.
The statistical evidence does not show that men are more likely to make
online book purchases.
96. Would being part of a support group that meets regularly help people who are
wearing the nicotine patch actually quit smoking? A county health department tries an
experiment using several hundred volunteers who are planning to use the patch. The
subjects were randomly divided into two groups. People in Group 1 were given the patch
and attended a weekly discussion meeting with counselors and others trying to quit.
People in Group 2 also used the patch but did not participate in the counseling groups.
After six months 46 of the 143 smokers in Group 1 and 30 of the 151 smokers in Group 2
had successfully stopped smoking. Do these results suggest that such support groups
could be an effective way to help people stop smoking? Use an alpha level of 0.05.
H0: p1=p2
HA: p1>p2
Note that in this problem p1 refers to Group 1 (i.e. the group who get counseling).
Test statistic = z = 2.41
P-value = 0.008
Conclusion: Since the P-value is smaller than alpha, we reject the hypothesis that
the true percentage of people who quit smoking is the same in both groups. The
statistical evidence suggests that the counseling is beneficial to quitting smoking.
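Here is a minimal sketch of the two-proportion z-test used above. It assumes the pooled standard error, which is the usual choice when the null hypothesis is p1=p2; the function name and the alternative argument are our own.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative):
    """alternative is 'greater' (p1 > p2) or 'two-sided' (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    upper_tail = 1 - NormalDist().cdf(z)
    if alternative == "greater":
        p_value = upper_tail
    else:
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    return z, p_value


z, p_value = two_proportion_z_test(46, 143, 30, 151, "greater")
print(round(z, 2), round(p_value, 3))  # 2.41 0.008
```

With the "two-sided" option, the same function handles problems where the alternative is p1≠p2.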
97. When games were sampled from throughout a season, it was found that the home
team won 127 of 198 professional basketball games, and the home team won 57 of 99
professional football games. Based on these results, does there appear to be a significant
difference between the proportions of home wins for the two sports? What can we
conclude about home-field advantage for these two sports? Perform a hypothesis test with
alpha=0.05 (Triola 2008).
H0: p1=p2
HA: p1≠p2
In this problem p1 will refer to basketball.
Test statistic = z = 1.10
P-value = 0.27
Conclusion: Since the P-value is larger than alpha, we cannot reject the hypothesis
that the true percentage of home games won in basketball vs. football is the same.
The statistical evidence suggests that there doesn’t appear to be MORE of a
home-field advantage for one sport over the other. NOTE THAT WE DID NOT TEST
WHETHER EITHER OF THE SPORTS HAS A HOME-FIELD ADVANTAGE.
We just tested whether one sport has MORE of a home-field advantage.
98. A gender selection methodology called “XSORT” yielded the following results for
parents who WANTED a girl: 295 out of 325 babies born using the method were girls.
For those parents who wanted a boy (the “YSORT” method was used for these parents),
39 out of 51 babies were boys. Perform a hypothesis test with alpha=0.05 for the
difference between the proportions of boys and girls being born using these gender
selection methodologies (Triola 2008).
H0: p1=p2
HA: p1≠p2
In this problem p1 will refer to girls.
Test statistic = z = 3.01
P-value = 0.003
Conclusion: Since the P-value is smaller than alpha, we reject the hypothesis that
the two population proportions are the same. The statistical evidence suggests that
the XSORT (girl) methodology is superior to the YSORT (boy) methodology. NOTE
THAT WE DIDN’T TEST WHETHER EITHER OF THEM IS EFFECTIVE . . .
WE JUST COMPARED THE TWO METHODS.
99. During an angiogram, heart problems can be examined via a small tube (a catheter)
threaded into the heart from a vein in the patient’s leg. It’s important that the company
that manufactures the catheter maintain a diameter of 2.00 mm. Each day, quality
control makes several measurements to test the 2.00 mm standard. What would Type I
and II errors be?
H0: µ=2
HA (2 sided): µ≠2
HA (1 sided): µ<2 or µ>2
Type I: Everything is really okay, but production is stopped (this costs $$$)
Type II: The catheters are faulty, but production continues and patients die.
100. Suppose the elapsed time of airline itineraries between Washington, D.C. and
Boston is normally distributed with an unknown population mean and an unknown
population standard deviation. Further suppose that a sample of size 25 (therefore, n=25
and degrees of freedom=24) was taken and the following statistics were obtained from the
sample: sample mean (ybar) =135 and sample standard deviation (s) = 40. Construct
confidence intervals around the sample mean corresponding to the following confidence
levels (express the lower and upper bounds of the intervals as INTEGERS):
a. 80% (124, 146)
b. 90% (121, 149)
c. 95% (118, 152)
d. 98% (115, 155)
e. 99% (113, 157)
Hint: Your intervals should get wider and wider and should all be centered around 135.
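The five intervals can be reproduced with a short sketch of the one-sample t-interval, ybar plus or minus t* times s/sqrt(n). Python's standard library has no t-distribution, so the critical values below are copied from a standard t-table for 24 degrees of freedom; treat them, and the helper name, as our own assumptions.

```python
from math import sqrt

# Two-sided critical values t* for df = 24, taken from a standard t-table
# (assumed inputs, not computed here).
T_STAR_DF24 = {0.80: 1.318, 0.90: 1.711, 0.95: 2.064, 0.98: 2.492, 0.99: 2.797}


def t_interval(ybar, s, n, t_star):
    """Return the t-interval with bounds rounded to integers (hypothetical helper)."""
    margin = t_star * s / sqrt(n)
    return round(ybar - margin), round(ybar + margin)


intervals = {level: t_interval(135, 40, 25, t) for level, t in T_STAR_DF24.items()}
for level, bounds in intervals.items():
    print(level, bounds)
```

Each interval is centered at 135, and the margin grows with the confidence level because t* grows.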
101. Suppose the elapsed time of airline itineraries between Washington, D.C. and
Boston is normally distributed with an unknown population mean and unknown
population standard deviation. Suppose we randomly sample 25 itineraries and the
sample average is calculated to be 135 minutes and the sample standard deviation is
calculated to be 40. Further suppose that we want to test the hypothesis that the true
population mean (μ) equals 150 minutes. Conduct 2-sided hypothesis tests with the
following alpha levels:
a. 0.20 Reject
b. 0.10 Reject
c. 0.05 Don’t Reject
d. 0.02 Don’t Reject
e. 0.01 Don’t Reject
H0: µ=150
HA: µ≠150
Test statistic (t) = -1.875
P-value = 0.0730
Conclusions: See above
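The five decisions can be checked by comparing |t| with the two-sided critical value for each alpha. This is a minimal sketch: the critical values for 24 degrees of freedom are copied from a standard t-table (an assumption on our part), and the names are our own.

```python
from math import sqrt

# Two-sided critical values for df = 24 from a standard t-table (assumed inputs).
CRITICAL_T_DF24 = {0.20: 1.318, 0.10: 1.711, 0.05: 2.064, 0.02: 2.492, 0.01: 2.797}


def one_sample_t(ybar, mu0, s, n):
    """t statistic for H0: mu = mu0 (hypothetical helper)."""
    return (ybar - mu0) / (s / sqrt(n))


t = one_sample_t(135, 150, 40, 25)
# Reject H0 when |t| exceeds the critical value for that alpha.
decisions = {alpha: abs(t) > crit for alpha, crit in CRITICAL_T_DF24.items()}
print(t, decisions)
```

This agrees with the P-value approach: 0.0730 is below 0.20 and 0.10 but above 0.05, 0.02, and 0.01.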
102. (De Veaux et al. 2009) Hoping to lure more shoppers downtown, a city builds a new
public parking garage. The city plans to pay for the structure through parking fees.
During a two-month period (44 weekdays), daily fees collected averaged $126 with a
standard deviation of $15. If a consultant claimed that the average daily income would be
$130, should we reject her claim using alpha=0.10 (perform a 2-sided test)?
H0: µ=130
HA: µ≠130
Test statistic (t): -1.77
P-value: 0.08
Conclusion: Since the P-value (0.08) is less than alpha (0.10), we reject the null. The
statistical evidence suggests that the consultant’s claim is false.
103. In 1998, the Nabisco Company announced a “1000 Chips Challenge” claiming that
every 18-ounce bag of Chips Ahoy contained at least 1000 chocolate chips. Below are the
counts of chips in selected bags.
1219 1132 1214 1191
1087 1270 1200 1295
1419 1135 1121 1325
1345 1244 1258 1356
Perform a one-sided test (HA: µ>1000). What does this evidence say about Nabisco’s
claim (let alpha=0.05)?
H0: µ=1000
HA: µ>1000
Test statistic (t) = 10.1
P-Value = tiny #
Conclusion: We can reject the null hypothesis. It appears that Nabisco’s claim is
valid.
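When only raw data are given, the sample mean and standard deviation have to be computed first. Here is a minimal sketch using Python's statistics module; the function name is our own, and the P-value is left to a t-table or calculator with 15 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

chips = [1219, 1132, 1214, 1191, 1087, 1270, 1200, 1295,
         1419, 1135, 1121, 1325, 1345, 1244, 1258, 1356]


def t_from_sample(data, mu0):
    """One-sample t statistic from raw data (hypothetical helper)."""
    n = len(data)
    # stdev() is the sample standard deviation (divides by n - 1).
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))


t = t_from_sample(chips, 1000)
print(round(t, 1))  # 10.1
```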
104. When consumers apply for credit, their credit is rated using FICO scores. A random
sample of credit ratings is obtained, and the FICO scores are summarized with these
statistics: n=25, ybar=680, s=22. Use an alpha of 0.01 and do a 1-sided hypothesis test to
test the claim that the mean credit score (of the general population) is less than 700
(Triola 2008).
H0: µ=700
HA: µ<700
Test statistic (t) = -4.54
P-Value = 0.00006
Conclusion: We reject the null hypothesis. There is enough statistical evidence to
conclude that the mean is probably less than 700.
105. Different cereals are randomly selected, and the sugar content is obtained for each
cereal, with the results given below for Cheerios, Harmony, Smart Start, Cocoa Puffs,
Lucky Charms, Corn Flakes, Fruit Loops, Wheaties, Cap’n Crunch, Frosted Flakes,
Apple Jacks, Bran Flakes, Special K, Rice Krispies, Corn Pops, and Trix. Use an alpha of
0.05 to test the claim of a cereal lobbyist that the mean of all cereals is LESS than 0.3 g
(Triola 2008).
0.03 0.44 0.24 0.39
0.30 0.48 0.47 0.17
0.43 0.13 0.07 0.09
0.47 0.45 0.13 0.43
H0: µ=0.3
HA: µ<0.3
Test statistic (t): -0.12
P-value: 0.45
Conclusion: Fail to reject the null. The statistical analysis does not back up the
lobbyist’s claim.
106. A study was conducted to assess the effects that occur when children are exposed to
cocaine before birth. 190 children born to cocaine users had a mean score of 7.3 (with a
standard deviation of 3.0) on a certain aptitude test. 186 children not exposed to cocaine
had a mean score of 8.2 with a standard deviation of 3.0. Use an alpha of 0.05 to test the
claim that cocaine use is harmful to children’s aptitude (Triola 2008).
H0: µ1=µ2
HA: µ1<µ2
Test statistic (t) = -2.91
P-value = 0.002
Conclusion: Reject the null. The statistical evidence suggests that prenatal cocaine
exposure is associated with lower aptitude scores. Moral: Don’t use cocaine when you
are pregnant.
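The two-sample t statistic above can be checked from the summary statistics alone. This minimal sketch uses the unpooled standard error. Because both samples are large, it approximates the one-sided P-value with the normal distribution rather than the t-distribution; that shortcut, and the function name, are our own assumptions.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic with unpooled standard error (hypothetical helper)."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / se


t = two_sample_t(7.3, 3.0, 190, 8.2, 3.0, 186)
# Large-sample approximation to the lower-tail P-value for HA: mu1 < mu2.
approx_p = NormalDist().cdf(t)
print(round(t, 2), round(approx_p, 3))  # -2.91 0.002
```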
107. Use the following data (representing hospital admissions from motor vehicle
crashes) and an alpha of 0.05 to test the claim that Friday the 13ths are unlucky (Triola
2008):
Friday the 6th                        Friday the 13th
(immediately preceding the 13th)
9                                     13
6                                     12
11                                    14
11                                    10
3                                     4
5                                     12
H0: µd=0
HA: µd>0 (where d = Friday the 13th minus Friday the 6th)
Test Statistic (t) = 2.71
P-Value = 0.02
Conclusion: Since the P-Value is less than alpha, we can reject the null. The
statistical evidence appears to show that Friday the 13th is indeed unlucky (more
data is probably needed to conclusively show that Friday the 13ths are unlucky).
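A paired t-test is a one-sample t-test on the differences. The sketch below takes each difference as the Friday the 13th count minus the Friday the 6th count, so a positive t points toward more admissions on the 13th. The function name is our own.

```python
from math import sqrt
from statistics import mean, stdev

friday_6th = [9, 6, 11, 11, 3, 5]
friday_13th = [13, 12, 14, 10, 4, 12]


def paired_t(first, second):
    """t statistic for H0: mean difference = 0, with d = second - first."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


t = paired_t(friday_6th, friday_13th)
print(round(t, 2))  # 2.71
```

The degrees of freedom are the number of pairs minus 1, here 5.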
108. A study was conducted to investigate the effectiveness of hypnotism in reducing
pain. Results for randomly selected subjects are given below. The measurements
represent a pain scale (where higher #’s indicate more pain). Use an alpha of 0.05 to test
the claim that hypnosis lowers pain.
Before Hypnosis   After Hypnosis
6.6               6.8
6.5               2.4
9.0               7.4
10.3              8.5
11.3              8.1
8.1               6.1
6.3               3.4
11.6              2.0
H0: µd=0
HA: µd>0 (where d = Before minus After)
Test Statistic (t) = 3.04
P-Value = 0.009
Conclusion: Since the P-Value is less than alpha, we can reject the null. The
statistical evidence appears to show that the hypnosis treatment is effective in
reducing pain.
109. To test the effectiveness of a drug to relieve asthma, a group of subjects was
randomly given a drug and placebo on two different occasions. After 1 hour an asthmatic
relief index was obtained for each subject, with these results:
Use 0.05 for alpha. Is the drug effective (Hint: low numbers are good!)?
Subject   Drug   Placebo
1         28     32
2         31     33
3         17     19
4         22     26
5         12     17
6         32     30
7         24     26
8         18     19
9         25     25
H0: µd=0
HA: µd>0 (where d = Placebo minus Drug)
Test Statistic (t) = 2.75
P-Value = 0.01
Conclusion: Since the P-Value is less than alpha, we can reject the null. The
statistical evidence appears to confirm that the drug is effective.
110. Here is a table showing who survived the sinking of the Titanic based on whether
they were crew members or passengers booked in first, second or third-class staterooms:
          Alive   Dead   Total
Crew      212     673    885
First     203     122    325
Second    118     167    285
Third     178     528    706
Total     711     1490   2201
Determine if surviving was independent of cabin status (use alpha=0.01).
H0: Cabin class and survivorship are independent.
HA: Cabin class and survivorship are dependent.
Test Statistic (X²) = 190.4
P-Value = Tiny, tiny #
Conclusion: Since the P-value is smaller than alpha, we reject the null hypothesis.
Therefore, we can conclude that there is an association (dependence) between cabin
class and survivorship.
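The chi-square statistic for a test of independence can be computed directly from the table of observed counts (without the Total row and column). This is a minimal sketch with our own function name. Since the standard library has no chi-square distribution, the decision compares the statistic with a critical value copied from a standard chi-square table (df=3, alpha=0.01), which is an assumed input.

```python
def chi_square_independence(table):
    """Return (statistic, df) for a table of observed counts (hypothetical helper)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: Crew, First, Second, Third; columns: Alive, Dead.
titanic = [[212, 673], [203, 122], [118, 167], [178, 528]]
statistic, df = chi_square_independence(titanic)
CRITICAL_DF3_ALPHA_01 = 11.345  # from a chi-square table (assumed input)
print(round(statistic, 1), df, statistic > CRITICAL_DF3_ALPHA_01)  # 190.4 3 True
```

The same function applied to a 2-by-2 table, such as [[17, 16], [83, 184]], gives a statistic with 1 degree of freedom.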
111. Use the following data to do a test of independence to see if left-handedness is
independent of gender (use alpha=0.05):
               Male   Female
Left-Handed    17     16
Right-Handed   83     184
H0: “Handedness” and gender are independent.
HA: Handedness and gender are dependent.
Test Statistic (X²) = 5.52
P-Value = 0.019
Conclusion: Since the P-value is smaller than alpha, we reject the null hypothesis.
Therefore, we can conclude that there is an association (dependence) between
handedness and gender.
112. Use the following data to do a test of independence to see if height is independent of
gender (use alpha=0.05):
        Male   Female
Short   3      17
Tall    25     2
H0: Height and gender are independent.
HA: Height and gender are dependent.
Test Statistic (X²) = 28.72
P-Value = Small #
Conclusion: Since the P-value is smaller than alpha, we reject the null hypothesis.
Therefore, we can conclude that there is an association (dependence) between height
and gender.
113. A die is filled with a lead weight and then rolled 200 times with the following
results:
1: 27
2: 31
3: 42
4: 40
5: 28
6: 32
Use an alpha of 0.05 to test the claim that the outcomes are not equally likely (Triola
2008).
H0: The die rolls are evenly distributed over 1-6.
HA: The die rolls are not evenly distributed over 1-6 (i.e. the die is not “fair”).
Test Statistic (X²) = 5.86
P-Value = 0.32
Conclusion: Since the P-value is greater than alpha, we cannot reject the null
hypothesis. Therefore, there is not enough statistical evidence to support the claim
that the die is not fair.
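For a goodness-of-fit test against equally likely outcomes, every expected count is the total divided by the number of categories. Here is a minimal sketch under that assumption. The function name is our own, and the critical value (df=5, alpha=0.05) is copied from a standard chi-square table rather than computed.

```python
def chi_square_uniform(observed):
    """Chi-square statistic against equally likely categories (hypothetical helper)."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


die_rolls = [27, 31, 42, 40, 28, 32]
statistic = chi_square_uniform(die_rolls)
CRITICAL_DF5_ALPHA_05 = 11.070  # from a chi-square table (assumed input)
reject = statistic > CRITICAL_DF5_ALPHA_05
print(round(statistic, 2), reject)  # 5.86 False
```

The degrees of freedom are the number of categories minus 1, so the day-of-week data would use df=6 and the flavor data df=4.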
114. The following data lists automobile fatalities by day of week:
Sun: 132
Mon: 98
Tue: 95
Wed: 98
Thu: 105
Fri: 133
Sat: 158
Use an alpha of 0.05 to test the claim that the outcomes are not uniformly spread across
the days of the week (Triola 2008).
H0: Automobile fatalities are evenly distributed over all seven days of the week.
HA: Automobile fatalities are not evenly distributed over all seven days of the week.
Test Statistic (X²) = 30.02
P-Value = Small #
Conclusion: Since the P-value is less than alpha, we can reject the null hypothesis.
Therefore, the statistical evidence suggests that driving fatalities are not evenly
spread throughout the days of the week.
115. The following data lists the birth months of Oscar-winning actors:
Jan: 9
Feb: 5
Mar: 7
Apr: 14
May: 8
Jun: 1
Jul: 7
Aug: 6
Sep: 4
Oct: 5
Nov: 1
Dec: 9
Use an alpha of 0.05 to test the claim that the outcomes are not uniformly spread across
the months (Triola 2008).
H0: Actor birth months are uniformly spread out over all 12 months.
HA: Actor birth months are not uniformly spread out over all 12 months.
Test Statistic (X²) = 22.54
P-Value = 0.02
Conclusion: Since the P-value is less than alpha, we can reject the null hypothesis.
Therefore, the statistical evidence suggests that actor birth months are NOT evenly
spread out throughout the year.
116. You are planning to open an old-time soda fountain and your partner claims that the
public will not prefer any flavor over another. The flavors you serve are cherry,
strawberry, orange, lime and grape. After several customers, you stop and take a look at
how sales are going and here are the results. The following numbers of people ordered
the flavor shown. Cherry 35, Strawberry 32, Orange 29, Lime 26 and Grape 25. Test to
see if there was a preference at the 0.05 significance level.
H0: The customers have no preference for any flavor.
HA: The customers have flavor preferences.
Test Statistic (X²) = 2.35
P-Value = 0.67
Conclusion: Since the P-value is larger than alpha, we cannot reject the null
hypothesis. Therefore, there is no statistical evidence that customers prefer some
flavors more than others.