MATH-138 In-class Practice Problems

1. Suppose a basketball player scored the following number of points in his last 15 games: 4, 4, 3, 4, 7, 16, 12, 23, 15, 8, 5, 18, 8, 29, 21. Fill in the following frequency (and relative frequency) distribution.

   Bin                  1-6   7-12   13-18   19-24   25-30   Total
   Frequency            5     4      3       2       1       15
   Relative Frequency   33%   27%    20%     13%     7%      100%

2. a. What percentage of games did the player score 12 points or less? Answer: 60%
   b. What percentage of games did the player score between 7 and 18 points (inclusive, i.e. 7<=points<=18)? Answer: 47%

3. If you were to draw a histogram from your frequency distribution (from Question 1), would it be skewed to the right or left? That is, is this distribution skewed right or left? Answer: Right

4. Calculate the following statistics from the basketball scores: Mean, Median, Quartile 1, Quartile 3, Minimum, Maximum, Range, IQR, and Standard Deviation.
   Answer: Mean = 11.8; Median = 8; Standard Deviation = 8.2; Minimum = 3; Q1 = 4; Q3 = 18; Maximum = 29; Range = 26; IQR = 14

5. Construct a boxplot for the above-mentioned basketball scores.

6. Use the above-mentioned basketball scores to calculate the z-scores for the 3 lowest-scoring and 3 highest-scoring games.
   Lowest 3 scoring games: 3 -> -1.07; 4 -> -0.95; 4 -> -0.95
   Highest 3 scoring games: 21 -> 1.12; 23 -> 1.37; 29 -> 2.10

7. A college student received a score of 78 on her Math exam and a score of 86 on her French exam. The overall results on the French exam had a mean of 82 and a standard deviation of 8, while the Math exam had a mean of 54 and a standard deviation of 12. On which exam did she do relatively better?
   Answer: Math z-score: 2.0; French z-score: 0.5. She did relatively better on her Math exam.

8. Who is relatively taller?
   A. A non-basketball playing man who is 75 inches tall (assume non-basketball playing men have a mean height of 71.5 inches and a standard deviation of 2.1 inches).
   B. A male basketball player who is 85 inches tall (assume male basketball players have a mean height of 80 inches and a standard deviation of 3.3 inches).
   Answer: The non-basketball playing man is relatively taller (his z-score is 1.67 vs. the basketball player’s z-score of 1.51).

9. Assume verbal SAT scores have a mean of 500 and a std. dev. of 100. What is the z-score of somebody who scores 500 on the verbal portion of the SAT? Answer: 0

10. Assume IQ scores have a mean of 100 and a std. dev. of 16. Albert Einstein reportedly had an IQ of 160. What is the z-score of his IQ? Answer: 3.75

Assume IQ scores have a normal model distribution with a mean of 100 and a std. dev. of 16. Use this information to answer Questions 11-14.

11. Find the following percentages:
   a. % of people with 84<=IQ<=116 Answer: 68%
   b. % of people with IQ>=100 Answer: 50%
   c. % of people with 68<=IQ<=132 Answer: 95%
   d. % of people who are “geniuses” (a genius is someone with an IQ>=132) Answer: 2.5%
   e. % of people with 84<=IQ<=132 Answer: 81.5%

12. Find the following percentages:
   a. % of people with IQ<=125 Answer: 94.1%
   b. % of people with 90<=IQ<=110 Answer: 46.8%
   c. % of people with 110<=IQ<=120 Answer: 16.0%

13. What value (IQ score) separates the bottom/lower/not-so-smart 10% of the population from the top/upper/smarter 90%? Answer: 79

14. What value (IQ score) separates the top/smarter 35% of the population from the bottom/not-so-smart 65%? Answer: 106

Suppose the IQ scores of people who smoke copious amounts of pot are normally distributed with mean=90 and standard deviation=20 (assume IQ scores are continuous and not necessarily integers). Suppose the IQ scores of people who don’t smoke copious amounts of pot are normally distributed with mean=100 and standard deviation=16. Use this information to answer Questions 15-18.

15. Find the following percentages:
   a. A former psychological classification of mental retardation labeled someone with an IQ score between 50 and 68 as a “moron”. What percentage of copious pot smokers are “morons”? Answer: 11.3%
   b. What percentage of non-copious pot smokers are NOT “morons”? Answer: 97.8%
   c. What percentage of copious pot smokers have an IQ score within 1 standard deviation of the mean? Answer: 68%
   d. What percentage of non-copious pot smokers have IQ scores within 2 standard deviations of the mean? Answer: 95%

16. Find the following percentages:
   a. The percentage of non-copious pot smokers who have an IQ score higher than the mean of copious pot smokers? Answer: 73.4%
   b. The percentage of non-copious pot smokers who are geniuses (a “genius” is somebody with an IQ of 132 or higher)? Answer: 2.3% (2.5% is an acceptable answer)

17. What is the IQ score of the following people (round to the nearest integer):
   a. A copious pot smoker who is smarter than 90% of all the other copious pot smokers Answer: 116
   b. A non-copious pot smoker who is dumber than 90% of all the other non-copious pot smokers Answer: 79

18. What IQ score will separate the smarter half of the copious pot smokers from the dumber half? Answer: 90

19. The following data represents movie budgets vs. gross revenue (in million $) for 7 movies. Create a scatterplot to see if r should be calculated. If so, what is r (Triola 2008)?

   Budget   62   90   50   35   200   100   90
   Gross    65   64   48   57   601   146   47

   Answer: There appears to be a positive, linear relationship. The value of r, 0.93, confirms this.

20. The following data represents supermodel heights (inches) vs. weights (pounds) for 9 supermodels. Create a scatterplot to see if r should be calculated. If so, what is r (Triola 2008)?

   Height   70    70.5   68    65    70    70    70    70    71
   Weight   117   119    105   115   119   127   113   123   115

   Answer: The scatterplot does not show a linear relationship. Therefore, r does not need to be calculated. If you were to calculate r, you would see that it equals 0.36.

21. a. Estimate the regression equation for Question #19 (use budget as your independent, “x” variable). Answer: Predicted Revenue = (3.47*Budget) - 164.14
   b. Interpret the slope and intercept in the context of this problem. Answer: The model predicts that for each $1 million increase in the movie budget, revenue will increase by $3.47 million.
   c. How much gross revenue does the regression line predict a movie with a $95 million budget will make? Answer: $165.5 million
   d. What is the residual for the movie in the data that had a $100 million budget? Answer: -$37 million

22. A basketball player makes 80% of her free throws. Suppose she wakes up every morning and starts shooting free throws. On an “average” morning, on which free throw will she have her first miss? Perform 20 trials.
   For this simulation we will generate random #’s from 1-5. Call any # from 1-4 a “made shot” and a 5 a “missed shot”. For each trial, generate #’s until she “misses” a shot. Note: the trials below were generated using my calculator. If/when you repeat this experiment, you will get different results (since we are dealing with random #’s).
   Trial 1: 5 (She missed on her first shot)
   Trial 2: 1, 3, 3, 4, 1, 2, 5 (She missed on her seventh shot)
   Trial 3: 2, 4, 5 (She missed on her third shot)
   Trial 4: 2, 2, 1, 5 (She missed on her fourth shot)
   Trial 5: 1, 1, 3, 5 (She missed on her fourth shot)
   Trial 6: 5 (She missed on her first shot)
   Trial 7: 2, 2, 1, 1, 4, 1, 3, 2, 5 (She missed on her ninth shot)
   You would want to do at least 30 trials and then get the average # of shots it took her to miss. In the above example (with 7 trials), on average, she missed on about her 4th shot (29/7 ≈ 4.1).

23. You are going to take a quiz with 5 multiple choice questions. You estimate that you have an 80% chance of getting any question right. What are your chances of getting them all right? Perform 20 trials.
   For this simulation we will generate random #’s from 1-5. Call any # from 1-4 a “correctly answered question” and a 5 an “incorrectly answered question”. For each trial, generate 5 #’s (corresponding to the 5 multiple choice questions). Note: the trials below were generated using my calculator. If/when you repeat this experiment, you will get different results (since we are dealing with random #’s).
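The simulation set-up described above can be sketched in Python. This is only an illustration, not the calculator procedure used for the worksheet's trials: the function name `all_correct_rate`, the seed, and the number of trials are assumptions chosen for the example. If each question is answered correctly with probability 0.8 independently of the others, the exact chance of a perfect quiz is 0.8^5 ≈ 0.328, which serves as a reference value for the simulated percentage.

```python
import random


def all_correct_rate(trials, questions=5, seed=0):
    """Estimate P(all questions correct) by simulation.

    Each question is a random integer from 1 to 5:
    1-4 counts as a correct answer, 5 as an incorrect one.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    perfect_quizzes = 0
    for _ in range(trials):
        draws = [rng.randint(1, 5) for _ in range(questions)]
        if all(d != 5 for d in draws):
            perfect_quizzes += 1
    return perfect_quizzes / trials


estimate = all_correct_rate(100_000)
exact = 0.8 ** 5  # exact value under the independence assumption
print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

With only a handful of trials the simulated percentage can land far from the exact value, which is why a larger number of trials is recommended.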
   Trial 1: 5, 4, 2, 5, 4 (not all correct)
   Trial 2: 2, 2, 5, 5, 1 (not all correct)
   Trial 3: 1, 4, 5, 4, 2 (not all correct)
   Trial 4: 1, 5, 1, 2, 1 (not all correct)
   Trial 5: 2, 5, 2, 3, 4 (not all correct)
   Trial 6: 1, 1, 1, 4, 4 (ALL correct)
   Trial 7: 5, 2, 1, 3, 5 (not all correct)
   Trial 8: 4, 3, 4, 5, 4 (not all correct)
   Trial 9: 4, 3, 2, 2, 4 (ALL correct)
   Trial 10: 5, 1, 3, 3, 2 (not all correct)
   You would want to do at least 30 trials and then get the percentage of times that all questions were answered correctly. In the above example (with 10 trials), all questions were answered correctly 2/10 = 20% of the time.

24. You are going to take a quiz with 10 multiple choice questions, where each question has 4 answer choices. You have not studied and you need to guess on each question. What are your chances of passing the quiz (i.e. getting at least 6 out of 10 questions correct on the quiz)? Perform 20 trials.
   For this simulation we will generate random #’s from 1-4. Call any # from 1-3 an “incorrectly answered question” and a 4 a “correctly answered question”. For each trial, generate 10 #’s (corresponding to the 10 multiple choice questions). Note: the trials below were generated using my calculator. If/when you repeat this experiment, you will get different results (since we are dealing with random #’s).
   Trial 1: 4, 4, 2, 2, 1, 3, 3, 4, 4, 2 (4 correct questions . . . FAIL)
   Trial 2: 4, 3, 2, 2, 4, 4, 2, 3, 2, 2 (3 correct questions . . . FAIL)
   Trial 3: 1, 1, 2, 3, 1, 3, 1, 3, 2, 1 (0 correct questions . . . FAIL)
   Trial 4: 4, 4, 2, 3, 1, 1, 3, 1, 3, 4 (3 correct questions . . . FAIL)
   Trial 5: 4, 2, 1, 2, 3, 2, 4, 3, 1, 2 (2 correct questions . . . FAIL)
   You would want to do at least 30 trials and then get the percentage of times that at least 6 questions were answered correctly. In the above example (with 5 trials), the student passed the quiz 0/5 = 0% of the time.

25. Suppose we wanted to study how many credit hours HCC credit students are taking this semester.
How would we get a simple random sample (SRS) of HCC students? Stratified sample? Cluster sample? Systematic sample? Convenience sample? Census?
   SRS: Assign every student a unique #, and generate some random #’s.
   Stratified: Divide students into day and evening, and do a simple random sample on each group.
   Cluster: Go to a randomly chosen building, and do a census of all students in the building.
   Systematic: List students alphabetically, and sample every third student on the list.
   Convenience: Offer free drinks/food to HCC students and then ask them to tell you how many credit hours they are taking.
   Census: Consult the HCC database and get information on every student.

26. Which of the following (if any) are NOT valid probability values?
   a. 0.40
   b. -0.20
   c. 1.00
   d. 0.99999
   Answer: b (a probability cannot be negative)

27. Which of the following (if any) are NOT valid probability values?
   a. 0.00
   b. 0.67
   c. 0.80
   d. 1.14
   Answer: d (a probability cannot exceed 1)

Suppose I want to perform the random procedure of rolling a fair die (rolling it once). Use this procedure to answer Questions 28-34.

28. What is the sample space, S, for the above-mentioned procedure? Answer: S={1,2,3,4,5,6}

Suppose the following events:
   a. Rolling a 1
   b. Rolling anything but 6
   c. Rolling something less than 2
   d. Rolling something between 2 and 5, inclusive (i.e. including 2 and 5)

29. What are the probabilities for the four events listed above?
   a. 1/6
   b. 5/6
   c. 1/6
   d. 4/6

30. What are the complements of the four events listed above?
   a. Rolling something between 2 and 6, inclusive
   b. { 6 }
   c. { 2, 3, 4, 5, 6 }
   d. Rolling a 1 or a 6

31. What are the probabilities of the four complements?
   a. 5/6
   b. 1/6
   c. 5/6
   d. 2/6

32. State whether each of the following pairs of events (from above) are mutually exclusive or not:
   a. Events a and b Answer: No
   b. Events a and c Answer: No
   c. Events b and d Answer: No
   d. Events c and d Answer: Yes

33. For the above die-rolling example, give an example of an impossible event. Answer: Rolling a seven

34. For the above die-rolling example, give an example of a certain event.
Answer: Rolling a # between 1 and 6, inclusive

The table below describes a standard deck of cards. Use the table to answer Questions 35-36. Note that I am counting aces as face cards.

                    Clubs (black)   Spades (black)   Hearts (red)   Diamonds (red)
   Face Cards       4               4                4              4
   Non-Face Cards   9               9                9              9

35. State whether each of the following pairs of events are mutually exclusive or not:
   a. Black cards and red cards Answer: Yes
   b. Black cards and diamonds Answer: Yes
   c. Black cards and spades Answer: No
   d. Diamonds and face cards Answer: No
   e. Face cards and non-face cards Answer: Yes
   f. Non-face cards and red cards Answer: No

36. Suppose I want to perform the random procedure of picking a card out of a standard deck of cards. If ONE card is drawn, what are the following probabilities (please answer using un-simplified fractions):
   a. P(club) Answer: 13/52
   b. P(not a heart) Answer: 39/52
   c. P(face card) Answer: 16/52
   d. P(red) Answer: 26/52
   e. P(not a non-face card) Answer: 16/52
   f. P(not black) Answer: 26/52
   g. P(black or red) Answer: 52/52

37. One tie (dotted, striped, or solid) is selected at random, and then a shirt (white or brown) is selected at random. What is the probability that a dotted tie AND white shirt are selected?
   a. 1/6
   b. 1/2
   c. 1/3
   d. 3
   e. None of these
   Answer: a (1/3 * 1/2 = 1/6)

38. What is the probability that you roll a die 4 times and get zero “6’s”? Answer: 0.482

39. What is the probability that you roll a die 4 times and get at least one “6”? Answer: 0.518

40. What is the probability that someone who has 3 children has exactly one girl (assume no twins, triplets, or hermaphrodites)? Answer: 3/8

41. What is the probability that you flip a coin twice and get 2 tails? Answer: 1/4

42. What is the probability that you flip a coin 10 times and the seventh flip is heads? Answer: 1/2

The table below describes a standard deck of cards. Use the table to answer Question 43. Note that I am counting aces as face cards.

                    Clubs (black)   Spades (black)   Hearts (red)   Diamonds (red)
   Face Cards       4               4                4              4
   Non-Face Cards   9               9                9              9

43. Suppose I want to perform the random procedure of picking a card out of a standard deck of cards.
If ONE card is drawn, what are the following probabilities (please answer using un-simplified fractions):
   a. P(face card and black) Answer: 8/52
   b. P(red or non-face card) Answer: 44/52
   c. P(face card or not black) Answer: 34/52
   d. P(black and red) Answer: 0
   e. P(club or face card) Answer: 25/52
   f. Given the card is black, what is P(club)? Answer: 13/26
   g. Given the card is black, what is P(face card)? Answer: 8/26
   h. Given the card is a non-face card, what is P(face card)? Answer: 0

Use the data below to answer Question 44. This synthetic sample data (i.e. I made it up) shows 1,000 people who either smoked or didn’t smoke, and who either died of lung cancer or some other cause of death. Suppose I randomly sample one person from this data.

                           Smoker   Non-Smoker
   Lung Cancer Death       50       80
   Non-Lung Cancer Death   150      720

44. a. What is P(Smoker)? Answer: 200/1000
   b. What is P(Lung Cancer Death)? Answer: 130/1000
   c. What is P(Smoker given Lung Cancer Death)? Answer: 50/130
   d. What is P(Non-Lung Cancer Death)? Answer: 870/1000
   e. What is P(Non-Smoker)? Answer: 800/1000
   f. What is P(Non-Lung Cancer Death given Non-Smoker)? Answer: 720/800
   g. Are smoking and lung cancer death independent? Answer: No

45. Given P(A)=0.25, P(B)=0.60, and P(A and B)=0.10, find:
   a. P(A or B) Answer: 0.75
   b. P(B|A) Answer: 0.40
   c. Are A and B independent (yes or no)? Answer: No
   d. Are A and B mutually exclusive (yes or no)? Answer: No

46. One tie (dotted, striped, or solid) is selected at random, and then a shirt (white or brown) is selected at random. What is the probability that a striped tie OR brown shirt is selected?
   a. 1/2
   b. 2/3
   c. 1/6
   d. 5/6
   e. None of these
   Answer: b (1/3 + 1/2 - 1/6 = 2/3)

47. Suppose TWO fair dice are rolled. Find the following probabilities:
   a. P(both dice are 1) Answer: 1/36
   b. P(the sum of the dice is 6) Answer: 5/36
   c. P(at least one of the dice is 4) Answer: 11/36
   d. P(only ONE of the dice is 4) Answer: 10/36

48. Suppose TWO fair dice are rolled. Let E be the event of getting a “triple” (i.e. one die is three times the other die) and let F be the event of getting a “sum of 6” (i.e. the two dice add up to 6).
Which one of the following statements is true: P(E)>P(F), P(E)=P(F), or P(E)<P(F)? Answer: P(E)<P(F), since P(E)=4/36 and P(F)=5/36

49. Suppose a dresser drawer contains 20 individual socks where each sock is either white or black (there is at least one of each color). Suppose you are blindfolded and you start taking out socks from the drawer one by one. What is the MINIMUM number of socks that you need to take out in order to GUARANTEE that you will have some matching socks (i.e. 2 black socks OR 2 white socks)? Answer: Three

50. For parts a-c below, state whether the pairs of events (events A and B) are dependent or independent:
   a. P(A)=0.60, P(B)=0.40, P(A and B)=0.24 Answer: Independent
   b. P(A)=0.90, P(B)=0.30, P(A and B)=0.18 Answer: Dependent
   c. P(A)=0.50, P(B)=0.70, P(A and B)=0.25 Answer: Dependent

51. Suppose a jar contains 40 red marbles, 40 blue marbles and 20 green marbles (100 marbles total). If TWO marbles are drawn WITHOUT REPLACEMENT from the jar (that is, one marble is drawn and NOT put back into the jar, and then another marble is drawn), what are the following probabilities?
   a. P(both are green) (i.e. the first marble is green AND the second marble is green) Answer: (20/100)*(19/99)
   b. P(neither are green) Answer: (80/100)*(79/99)
   c. P(first marble is red) Answer: 40/100
   d. P(first marble is red, second marble is blue) Answer: (40/100)*(40/99)
   e. P(both marbles are neither red nor green) Answer: (40/100)*(39/99)
   f. P(first marble is red, second marble is green) Answer: (40/100)*(20/99)
   g. Given the first marble is red, what is P(second marble is red)? Answer: 39/99
   h. Given the first marble is green, what is P(second marble is blue)? Answer: 40/99

52. If TWO cards are drawn WITH REPLACEMENT from a standard deck of cards (that is, the first card is put back into the deck (and the deck is shuffled) before the second card is drawn), what are the following probabilities?
   a. P(both are black) (i.e. the first card is black AND the second card is black) Answer: (26/52)*(26/52)
   b. P(first card drawn is red, second card drawn is black) Answer: (26/52)*(26/52)
   c. P(both cards are neither red nor face cards) Answer: (18/52)*(18/52)
   d. P(first card drawn is a red face card, second card drawn is red) Answer: (8/52)*(26/52)
   e. Given the first card drawn is a red card, what is P(second card is red)? Answer: 26/52
   f. Given the first card drawn is a club face card, what is P(second card is a diamond face card)? Answer: 4/52

53. Give the probabilities for Questions #52a-f assuming the two cards are drawn WITHOUT REPLACEMENT (that is, one card is drawn and NOT put back into the deck, and then another card is drawn).
   a. (26/52)*(25/51)
   b. (26/52)*(26/51)
   c. (18/52)*(17/51)
   d. (8/52)*(25/51)
   e. 25/51
   f. 4/51

54. In Question #52, does the probability of the second draw depend on the first draw? Answer: No

55. In Question #53, does the probability of the second draw depend on the first draw? Answer: Yes

56. A large department store has 500 employees. There are 350 females and 200 of them are under the age of 25. There are 75 males under 25. If one employee is randomly selected, what are the following probabilities:
   a. P(under 25 or female) Answer: 425/500
   b. P(over 25 or female) Answer: 425/500
   c. P(male or over 25) Answer: 300/500

57. There are 6 green hats, 4 blue hats and 3 red hats in a box. You randomly select one hat. What are the following probabilities:
   a. P(blue or red) Answer: 7/13
   b. P(not green) Answer: 7/13
   c. P(green or blue or red) Answer: 1

58. In a class of 50 students, 18 take chorus, 26 take band, and 2 take both. Answer the following questions:
   a. How many are only in chorus? Answer: 16
   b. How many are only in band? Answer: 24
   c. How many take neither? Answer: 8
   d. How many take either band or chorus (but NOT both)? Answer: 40

59. Does the table below represent a valid probability distribution?

   x      -3     -1.56   2      5.7    10,002
   P(x)   0.20   0.10    0.05   0.56   0.09

   Answer: Yes

60. Does the table below represent a valid probability distribution?

   X      4       6      8      9
   P(x)   -0.50   0.60   0.50   0.40

   Answer: No

61. Does the table below represent a valid probability distribution?

   X      0      1
   P(x)   0.45   0.65

   Answer: No

Suppose a random procedure yields the following outcomes and probabilities.
Use this table to answer Questions 62-63.

   X      80     100    150    200    250
   P(x)   0.24   0.22   0.31   0.18   0.05

62. Find the mean (expected value) and standard deviation of this distribution. Answer: Mean=136.2; Std. Dev.=49.9

63. What are the following probabilities:
   a. P(X<=150) Answer: 0.77
   b. P(X=200) Answer: 0.18
   c. P(X<70) Answer: 0
   d. P(X=100 or X=200) Answer: 0.40
   e. P(X=100 and X=250) Answer: 0
   f. P(X<200 or X>80) Answer: 1

64. Suppose I have a distribution where one third of the time the value equals -1, one third of the time the value equals 0, and one third of the time the value equals 2. Is this a valid probability distribution? Answer: Yes

65. Suppose I have a distribution where one half of the time the value equals 0.4, and two thirds of the time the value equals 0.6. Is this a valid probability distribution? Answer: No

Use the following table to answer Questions 66-67:

   X      0     1     2     3     10
   P(x)   0.0   0.3   0.3   0.3   0.1

66. Why is this probability distribution valid? Answer: The probabilities sum to 1, and each individual probability is between 0 and 1, inclusive.

67. Find the expected value and standard deviation of this distribution. Answer: Mean=2.8; Std. dev.=2.5

68. A carnival game offers a $100 cash prize for anyone who can break a balloon by throwing a dart at it. It costs $5 to play. You estimate that you have a 10% chance of hitting the balloon on any throw. Find your expected winnings. Answer: $5

69. (De Veaux et al. 2009) A commuter must pass through 5 traffic lights on her way to work and will have to stop at each one that is red. She estimates the probability model for the number of red lights she hits as shown below.

   X=# of red   0      1      2      3      4      5
   P(x)         0.05   0.25   0.35   0.15   0.15   0.05

   How many red lights should she expect to hit each day? Answer: 2.25 red lights

70. An insurance policy has the following pay offs. If you die, your survivor gets $10,000. If you become disabled, you get $5000. Otherwise, you receive nothing. The policy costs $50 a year. Based on past data, the probability a person dies is .01 and the probability the person becomes disabled is .02.
Find the expected value from your point of view. Answer: $150

71. (De Veaux et al. 2009) You roll a die. If it comes up 6, you win $100. If not, you get to roll again. If you get a 6 the second time, you win $50. If not, you lose. Create the probability model and find the expected amount you’ll win. Answer: $23.61

72. A game costs $5 to play. You draw a card from a deck of cards. If you draw the ace of hearts, you win $100. For any other ace, you get $10 and for any other heart you get $5. If you draw anything else, you lose. Find the average winnings or losses for this game. Answer: -$1.35

73. Suppose you visit Las Vegas and decide to play roulette. If you bet $5 that the outcome is a number between 1-12 (including 1 and 12), you have a 26/38 probability of losing your $5 bet, and you have a 12/38 probability of making a net gain of $10 (equaling the $15 prize minus your $5 bet). Only considering NET winnings/losses, what is your expected value of betting on a number between 1-12 (round to the nearest cent)? Answer: -$0.26

74. A man buys a racehorse for $20,000 and enters it in two races. He plans to sell the horse afterwards, hoping to make a profit. If the horse wins both races, it will sell for $100,000. If it wins only one race, it will be worth $50,000. If it loses both races, it will be worth $10,000. The man believes there is a 20% chance that the horse will win the first race and a 30% chance that it will win the second race. Assuming the two races are independent events, find the man’s expected profit. Answer: $10,600

75. Suppose the following binomial probability situation: A certain statistics class has 15 students, and the probability that a given student will pass the class is 0.8. Find the following probabilities:
   a. P(everybody passes) Answer: 0.035
   b. P(at least 10 students pass) Answer: 0.939
   c. P(4 students fail) Answer: 0.188
   d. P(11 or 12 students pass) Answer: 0.438
   e. P(at most 2 students fail) Answer: 0.398

76. Suppose the following binomial probability situation: You draw a card out of a shuffled deck of cards 10 times (replacing the card after each draw and re-shuffling) and count the number of red cards you draw (note there are 26 red cards and 52 cards total). Find the following probabilities (3 decimal places):
   a. P(6 red cards) Answer: 0.205
   b. P(3 black cards) Answer: 0.117
   c. P(at most 5 red cards) Answer: 0.623
   d. P(more than 7 black cards) Answer: 0.055

77. A moving target at a police academy target range can be hit 80% of the time by a particular individual. Suppose the person takes three shots at the target. What is the probability that:
   a. There are exactly two hits? Answer: 0.384
   b. There are hits on all three? Answer: 0.512
   c. There is only one hit? Answer: 0.096
   d. There are misses on all three? Answer: 0.008
   e. There is at least one hit? Answer: 0.992

78. A quality control inspector has drawn a sample of 13 light bulbs from a recent production lot. If the number of defective bulbs is 2 or less, the lot passes inspection. Suppose 10% of the bulbs in the lot are defective. What is the probability that the lot will pass inspection? Answer: 0.866

79. Suppose the following binomial probability situation: Suppose Dr. Coldren was a single male. Further suppose that there was a week in the distant past (Sunday-Saturday) where he asked a different supermodel for a date (for that evening) each day of the week. Suppose the probability that any given supermodel said “yes” was 0.20. Assume a supermodel agreeing to a date was a “success”, and not agreeing to a date was a “failure” (meaning he stayed home alone for the evening). Find the following probabilities:
   a. P(Dr. Coldren stayed home alone all week) Answer: 0.210
   b. P(Dr. Coldren stayed home alone at least one evening) Answer: 0.9999872
   c. P(Dr. Coldren had a date with a supermodel every evening of the week) Answer: 0.0000128
   d. P(Dr. Coldren was home alone an odd number of evenings) Answer: 0.514

80. Public health statistics indicate that 26.4% of American adults smoke.
Describe the sampling distribution for a sample of 50 adults.
   Answer: Under certain assumptions, the sampling distribution of the sample proportions will be normally distributed with mean p=0.264 and std. dev. = sqrt(pq/n) = sqrt( (0.264*0.736) / 50 ) = 0.062.

81. Assume that 30% of the students at a certain community college wear contact lenses and we randomly pick 100 students to see what percentage of them wear contacts. Describe this sampling distribution. What is the probability that more than one third of them wear contacts?
   Answer: Under certain assumptions, the sampling distribution of the sample proportions will be normally distributed with mean p=0.30 and std. dev. = sqrt(pq/n) = sqrt( (0.30*0.70) / 100 ) = 0.0458. Probability = 0.233

82. It is believed that 4% of children have a gene that may be linked to juvenile diabetes. Researchers hoping to track 20 of these children for several years test 732 newborns for the presence of this gene. What’s the probability they find enough subjects for the study? Answer: 0.96

83. A restaurateur anticipates serving 180 people on a Friday evening and believes that about 20% of the patrons will order the steak special. How many of those specials should he plan on ordering in order to be 95% sure (i.e. only a 5% chance of running out of food) of having enough steaks on hand to meet customer demand? Answer: 45 Steaks

84. A college’s data about the incoming freshmen indicates that the mean of their high school GPAs is 3.4 with a standard deviation of 0.35. The distribution is normal. The students are randomly assigned to freshmen writing seminars in groups of 25.
   a. Find the probability a given student has a GPA greater than 3.5. Answer: 0.39
   b. Find the probability that one of the groups has an average GPA greater than 3.5. Answer: 0.08

85. Ithaca, New York gets an average of 35.4” of rain each year with a standard deviation of 4.2”. Assume the Normal model applies to their yearly rainfall.
   a. What percentage of years does Ithaca get more than 40” of rainfall?
   Answer: 0.14 (i.e. 14%)
   b. What rainfall amount separates the “driest” 20% of years from the “wettest” 80%? Answer: 31.9 Inches
   c. Suppose you live in Ithaca for four consecutive years. What is the probability that those four years average less than 30” of rain? Answer: 0.005

86. Suppose the weights of men are normally distributed with a population mean of 180 pounds and a population standard deviation of 20 pounds. Suppose a crew of 10 men are about to board a fishing boat. Further suppose the boat can safely carry 10-person crews weighing less than 1900 pounds total (i.e. safely carry 10-person crews where the average crew member weighs less than 190 pounds). Suppose the above-mentioned 10 male crew members were randomly sampled from the overall population of men. Use this information to answer the following:
   a. What is the probability that any one of the crew members weighs more than 190 pounds? Answer: 0.31
   b. What is the probability that the entire crew weighs more than 1,900 pounds, and hence a catastrophe is likely to occur? Hint: In other words, what is the probability that the AVERAGE weight of the crew members is more than 190 pounds? Answer: 0.06

87. A poll found that 50% of a random sample of 1012 American adults said that they believe in ghosts.
   a. Find the margin of error for this poll if we want 90% confidence in our estimate of American adults who believe in ghosts. Answer: E=0.02585
   b. Explain what a “90% confidence interval” means and find the interval. Answer: We can be 90% confident that the true population proportion (i.e. the percentage of American adults who believe in ghosts) is contained in the following CI: (0.474, 0.526)
   c. If we want to be 99% confident, will the margin of error be larger or smaller? Answer: Larger
   d. Find that margin of error. Answer: E=0.04049
   e. In general, will smaller margins of error involve greater or less confidence in the interval? Answer: Less

88. (De Veaux et al. 2009) Direct mail advertisers send solicitations to thousands of potential customers in the hope that some will buy the company’s product.
The response rate usually is quite low. Suppose a company wants to test the response to a new flyer and sends it to 1000 people randomly selected from their mailing list of over 200,000 people. They get 123 orders from the recipients. a. Create a 90% confidence interval for the percentage of people the company contacts who may buy something. (10.6%, 14.0%) b. Explain what the interval means. We can be 90% confident that the true proportion of the company’s 200,000 customers who will actually purchase an item is in the above interval. c. The company must decide whether to now do a mass mailing. The mailing won’t be cost effective unless it produces at least a 5% return. What does your confidence interval suggest? They should do the mass mailing. 89. A national health organization warns that 30% of the middle school students nationwide have been drunk. Concerned, a local health agency randomly and anonymously surveys 110 of the 1212 middle school students in its city. Only 21 of them reported having been drunk. a. What proportion of the sample reported having been drunk? 21/110 = 0.191 b. Does this mean that this city’s youth are not drinking as much as the national data would indicate? Not necessarily – we need to build a confidence interval. c. Create a 95% confidence interval for the proportion of the city’s middle school students who have been drunk. (11.7%, 26.4%) d. Is there any reason to believe that the national level of 30% is not true of the middle school students in this city? Yes – even the upper bound of the above confidence interval is below 30%. 90. In preparing a report on the economy, we need to estimate the percentage of businesses that plan to hire additional employees in the next 60 days. a. How many randomly selected employers must we contact in order to create an estimate in which we are 98% confident with a margin of error of 5%? 542 b. Suppose we want to reduce the margin of error to 3%. What sample size will suffice? 1504 c. 
Why might it not be worth the effort to try to get an interval with a margin of error of only 1%? Because it would take a sample size of 13,530. That is a mighty big (i.e. expensive) number. 91. Write the null and the alternative hypotheses for the following: a. In the 1950’s only about 40% of high school graduates went on to college. Has the percentage changed? H0: p=0.4 HA: p≠0.4 b. 20% of the cars of a certain model have needed costly transmission work after being driven between 50,000 and 100,000 miles. The manufacturer hopes that the redesign of the transmission has solved this problem. H0: p=0.20 HA: p<0.20 c. We field test a new flavor of soft drink, planning to market it only if we are sure that at least 60% of the people like the flavor. H0: p=0.60 HA: p>0.60 d. The drug Lipitor is meant to lower cholesterol. Is there evidence to support the claim that over 1.9% of the users experience flu like symptoms as a side effect? H0: p=0.019 HA: p>0.019 e. According to the US department of Health, 16.3% of Americans did not have health insurance coverage in 1998. A politician claims that this percentage has decreased since 1998. H0: p=0.163 HA: p<0.163 f. During the past forty years, the monthly rate of return for a particular item has been 4.2 percent. A store analyst claims that it is different. H0: p=0.042 HA: p≠0.042 92. In the 1980’s it was generally believed that autism affected about 6% of the nation’s children. Some people believe that the increase in the number of chemicals in the environment has led to an increase in the incidence of autism. A recent study examined 384 children and found that 46 of them showed signs of some form of autism. Is there strong evidence that the level of autism has increased (Let alpha=0.05)? Write the hypotheses, check the assumptions, draw the curve, find the pertinent statistics and critical values, find the p value, state your conclusion, etc. H0: p=0.06 HA: p>0.06 Test statistic = z = 4.93 P-value = 0.0000004 (i.e. 
a really small #) Conclusion: Since the P-value is less than alpha, we can reject the hypothesis that the true population proportion of kids with autism is 6%. The statistical evidence indicates that the true rate of autism has probably increased. 93. During the 2000 season, the home team won 138 of the 240 regular season games. Is this strong evidence of a home-field advantage? (Let alpha=0.05) H0: p=0.50 HA: p>0.50 Test statistic = z = 2.32 P-value = 0.01 Conclusion: Since the P-value is less than alpha, we can reject the hypothesis that there is no home-field advantage. Therefore, the statistical evidence suggests that there IS a home-field advantage. 94. A personal trainer wanted to know whether the proportion of males 30 to 44 years old who do not exercise has decreased from 24.9%, the proportion in 1998. He randomly selects 150 males in that age group and finds that 28 of them do not exercise. Is there significant evidence that the proportion of males in this age group that do not exercise has decreased (Let alpha=0.05)? H0: p=0.249 HA: p<0.249 Test statistic = z = -1.77 P-value = 0.039 Conclusion: Since the P-value is smaller than alpha, we reject the hypothesis that the true percentage of men who don’t exercise is 24.9%. Thus, the statistical evidence suggests that men of this age group are exercising more (i.e. fewer are NEVER exercising). 95. A survey of 430 randomly selected adults found that 21% of the 222 men and 18% of the 208 women had purchased books online. Is there evidence that men are more likely to make online purchases of books? Use an alpha level of 0.05. H0: p1=p2 HA: p1>p2 Note that in this problem p1 refers to the men. Test statistic = z = 0.88 (using the rounded counts of 47 men and 37 women) P-value = 0.19 Conclusion: Since the P-value is larger than alpha, we cannot reject the hypothesis that the true percentages of men and women who purchase books online are equal. The statistical evidence does not show that men are more likely to make online book purchases. 96.
Would being part of a support group that meets regularly help people who are wearing the nicotine patch actually quit smoking? A county health department tries an experiment using several hundred volunteers who are planning to use the patch. The subjects were randomly divided into two groups. People in Group 1 were given the patch and attended a weekly discussion meeting with counselors and others trying to quit. People in Group 2 also used the patch but did not participate in the counseling groups. After six months 46 of the 143 smokers in Group 1 and 30 of the 151 smokers in Group 2 had successfully stopped smoking. Do these results suggest that such support groups could be an effective way to help people stop smoking? Use an alpha level of 0.05. H0: p1=p2 HA: p1>p2 Note that in this problem p1 refers to Group 1 (i.e. the group that received counseling). Test statistic = z = 2.41 P-value = 0.008 Conclusion: Since the P-value is smaller than alpha, we reject the hypothesis that the true percentage of people who quit smoking is equal in the two groups. The statistical evidence suggests that the counseling is beneficial to quitting smoking. 97. When games were sampled from throughout a season, it was found that the home team won 127 of 198 professional basketball games, and the home team won 57 of 99 professional football games. Based on these results, does there appear to be a significant difference between the proportions of home wins for the two sports? What can we conclude about home-field advantage for these two sports? Perform a hypothesis test with alpha=0.05 (Triola 2008). H0: p1=p2 HA: p1≠p2 In this problem p1 will refer to basketball. Test statistic = z = 1.10 P-value = 0.27 Conclusion: Since the P-value is larger than alpha, we cannot reject the hypothesis that the true percentage of home games won in basketball vs. football is the same. The statistical evidence does not show MORE of a home-field advantage for one sport over the other.
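The pooled two-proportion z-tests used in Problems 96 and 97 can be checked with a short script. This is only a sketch using the Python standard library; the function names `two_proportion_z` and `upper_tail` are made up for illustration and are not part of the course materials.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (x = successes, n = sample size)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

def upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Problem 96: one-sided test (HA: p1 > p2), Group 1 = counseling
z96 = two_proportion_z(46, 143, 30, 151)
p96 = upper_tail(z96)

# Problem 97: two-sided test (HA: p1 != p2), p1 = basketball
z97 = two_proportion_z(127, 198, 57, 99)
p97 = 2 * upper_tail(abs(z97))

print(f"Problem 96: z = {z96:.2f}, P-value = {p96:.3f}")
print(f"Problem 97: z = {z97:.2f}, P-value = {p97:.2f}")
```

The only difference between the two calls is how the tail area is taken: a single upper tail for the one-sided alternative, and twice the tail beyond |z| for the two-sided alternative.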
NOTE THAT WE DID NOT TEST WHETHER EITHER OF THE SPORTS HAS A HOME-FIELD ADVANTAGE. We just tested whether one sport has MORE of a home-field advantage. 98. A gender selection methodology called “XSORT” yielded the following results for parents who WANTED a girl: 295 out of 325 babies born using the method were girls. For those parents who wanted a boy (the “YSORT” method was used for these parents), 39 out of 51 babies were boys. Perform a hypothesis test with alpha=0.05 for the difference between the proportions of boys and girls being born using these gender selection methodologies (Triola 2008). H0: p1=p2 HA: p1≠p2 In this problem p1 will refer to girls. Test statistic = z = 3.01 P-value = 0.003 Conclusion: Since the P-value is smaller than alpha, we reject the hypothesis that the two population proportions are the same. The statistical evidence suggests that the XSORT (girl) methodology is superior to the YSORT (boy) methodology. NOTE THAT WE DIDN’T TEST WHETHER EITHER OF THEM IS EFFECTIVE . . . WE JUST COMPARED THE TWO METHODS. 99. During an angiogram, heart problems can be examined via a small tube (a catheter) threaded into the heart from a vein in the patient’s leg. It’s important that the company that manufactures the catheter maintain a diameter of 2.00 mm. Each day, quality control makes several measurements to test the 2.00 mm standard. What would Type I and II errors be? H0: µ=2 HA (2 sided): µ≠2 HA (1 sided): µ<2 or µ>2 Type I: Everything is really okay, but production is stopped (this costs $$$) Type II: The catheters are faulty, but production continues and patients die. 100. Suppose the elapsed time of airline itineraries between Washington, D.C. and Boston is normally distributed with an unknown population mean and an unknown population standard deviation.
Further suppose that a sample of size 25 (therefore, n=25 and degrees of freedom=24) was taken and the following statistics were obtained from the sample: sample mean (ybar) = 135 and sample standard deviation (s) = 40. Construct confidence intervals around the sample mean corresponding to the following confidence levels (express the lower and upper bounds of the intervals as INTEGERS): a. 80%: (124, 146) b. 90%: (121, 149) c. 95%: (118, 152) d. 98%: (115, 155) e. 99%: (113, 157) Hint: Your intervals should get wider and wider and should all be centered around 135. 101. Suppose the elapsed time of airline itineraries between Washington, D.C. and Boston is normally distributed with an unknown population mean and unknown population standard deviation. Suppose we randomly sample 25 itineraries and the sample average is calculated to be 135 minutes and the sample standard deviation is calculated to be 40. Further suppose that we want to test the hypothesis that the true population mean (µ) equals 150 minutes. Conduct 2-sided hypothesis tests with the following alpha levels: a. 0.20: Reject b. 0.10: Reject c. 0.05: Don’t Reject d. 0.02: Don’t Reject e. 0.01: Don’t Reject H0: µ=150 HA: µ≠150 Test statistic (t) = -1.875 P-value = 0.0730 Conclusions: See the decisions listed for each alpha level (reject whenever the P-value is below alpha). 102. (De Veaux et al. 2009) Hoping to lure more shoppers downtown, a city builds a new public parking garage. The city plans to pay for the structure through parking fees. During a two-month period (44 weekdays) daily fees collected averaged $126 with a standard deviation of $15. If a consultant claimed that the average daily income would be $130, should we reject her claim using alpha=0.10 (perform a 2-sided test)? H0: µ=130 HA: µ≠130 Test statistic (t): -1.77 P-value: 0.08 Conclusion: Reject the null. It is likely that the consultant’s claim is false. 103. In 1998, the Nabisco Company announced a “1000 Chips Challenge” claiming that every 18-ounce bag of Chips Ahoy contained at least 1000 chocolate chips.
Below are the counts of chips in selected bags. 1219 1132 1214 1191 1087 1270 1200 1295 1419 1135 1121 1325 1345 1244 1258 1356 Perform a one-sided test (HA: µ>1000). What does this evidence say about Nabisco’s claim (let alpha=0.05)? H0: µ=1000 HA: µ>1000 Test statistic (t) = 10.1 P-Value = tiny # Conclusion: We can reject the null hypothesis. It appears that Nabisco’s claim is valid. 104. When consumers apply for credit, their credit is rated using FICO scores. A random sample of credit ratings is obtained, and the FICO scores are summarized with these statistics: n=25, ybar=680, s=22. Use an alpha of 0.01 and do a 1-sided hypothesis test to test the claim that the mean credit score (of the general population) is less than 700 (Triola 2008). H0: µ=700 HA: µ<700 Test statistic (t) = -4.54 P-Value = 0.00006 Conclusion: We reject the null hypothesis. There is enough statistical evidence to conclude that the mean is probably less than 700. 105. Different cereals are randomly selected, and the sugar content is obtained for each cereal, with the results given below for Cheerios, Harmony, Smart Start, Cocoa Puffs, Lucky Charms, Corn Flakes, Fruit Loops, Wheaties, Cap’n Crunch, Frosted Flakes, Apple Jacks, Bran Flakes, Special K, Rice Krispies, Corn Pops, and Trix. Use an alpha of 0.05 to test the claim of a cereal lobbyist that the mean of all cereals is LESS than 0.3 g (Triola 2008). 0.03 0.44 0.24 0.39 0.30 0.48 0.47 0.17 0.43 0.13 0.07 0.09 0.47 0.45 0.13 0.43 H0: µ=0.3 HA: µ<0.3 Test statistic (t): -0.12 P-value: 0.45 Conclusion: Fail to reject the null. The statistical analysis does not back up the lobbyist’s claim. 106. A study was conducted to assess the effects that occur when children are exposed to cocaine before birth. 190 children born to cocaine users had a mean score of 7.3 (with a standard deviation of 3.0) on a certain aptitude test. 186 children not exposed to cocaine had a mean score of 8.2 with a standard deviation of 3.0. 
Use an alpha of 0.05 to test the claim that cocaine use is harmful to children’s aptitude (Triola 2008). H0: µ1=µ2 HA: µ1<µ2 (µ1 refers to the children exposed to cocaine) Test statistic (t) = -2.91 P-value = 0.002 Conclusion: Reject the null. We can conclude with reasonable certainty that prenatal cocaine exposure is associated with lower aptitude scores. Moral: Don’t use cocaine when you are pregnant. 107. Use the following data (representing hospital admissions from motor vehicle crashes) and an alpha of 0.05 to test the claim that Friday the 13ths are unlucky (Triola 2008): Friday the 6th (immediately preceding the 13th): 9 6 11 11 3 5 Friday the 13th: 13 12 14 10 4 12 H0: µd=0 HA: µd>0 (d = Friday the 13th minus Friday the 6th) Test Statistic (t) = 2.71 P-Value = 0.02 Conclusion: Since the P-Value is less than alpha, we can reject the null. The statistical evidence appears to show that Friday the 13th is indeed unlucky (more data is probably needed to conclusively show that Friday the 13ths are unlucky). 108. A study was conducted to investigate the effectiveness of hypnotism in reducing pain. Results for randomly selected subjects are given below. The measurements represent a pain scale (where higher #’s indicate more pain). Use an alpha of 0.05 to test the claim that hypnosis lowers pain. Before Hypnosis: 6.6 6.5 9.0 10.3 11.3 8.1 6.3 11.6 After Hypnosis: 6.8 2.4 7.4 8.5 8.1 6.1 3.4 2.0 H0: µd=0 HA: µd>0 (d = before minus after) Test Statistic (t) = 3.04 P-Value = 0.009 Conclusion: Since the P-Value is less than alpha, we can reject the null. The statistical evidence appears to show that the hypnosis treatment is effective in reducing pain. 109. To test the effectiveness of a drug to relieve asthma, a group of subjects was randomly given a drug and placebo on two different occasions. After 1 hour an asthmatic relief index was obtained for each subject, with the results below. Use 0.05 for alpha. Is the drug effective (Hint: low numbers are good!)?
Subject  Drug  Placebo
1        28    32
2        31    33
3        17    19
4        22    26
5        12    17
6        32    30
7        24    26
8        18    19
9        25    25
H0: µd=0 HA: µd>0 (d = placebo minus drug) Test Statistic (t) = 2.75 P-Value = 0.01 Conclusion: Since the P-Value is less than alpha, we can reject the null. The statistical evidence appears to confirm that the drug is effective. 110. Here is a table showing who survived the sinking of the Titanic based on whether they were crew members or passengers booked in first, second or third-class staterooms:
         Alive  Dead  Total
Crew     212    673   885
First    203    122   325
Second   118    167   285
Third    178    528   706
Total    711    1490  2201
Determine if surviving was independent of cabin status (use alpha=0.01). H0: Cabin class and survivorship are independent. HA: Cabin class and survivorship are dependent. Test Statistic (X2) = 190.4 P-Value = Tiny, tiny # Conclusion: Since the P-value is smaller than alpha, we reject the null hypothesis. Therefore, we can conclude that there is an association (dependence) between cabin class and survivorship. 111. Use the following data to do a test of independence to see if left-handedness is independent of gender (use alpha=0.05):
              Male  Female
Left-Handed   17    16
Right-Handed  83    184
H0: “Handedness” and gender are independent. HA: Handedness and gender are dependent. Test Statistic (X2) = 5.52 P-Value = 0.019 Conclusion: Since the P-value is smaller than alpha, we reject the null hypothesis. Therefore, we can conclude that there is an association (dependence) between handedness and gender. 112. Use the following data to do a test of independence to see if height is independent of gender (use alpha=0.05):
       Male  Female
Short  3     17
Tall   25    2
H0: Height and gender are independent. HA: Height and gender are dependent. Test Statistic (X2) = 28.72 P-Value = Small # Conclusion: Since the P-value is smaller than alpha, we reject the null hypothesis. Therefore, we can conclude that there is an association (dependence) between height and gender. 113.
A die is filled with a lead weight and then rolled 200 times with the following results: 1: 27, 2: 31, 3: 42, 4: 40, 5: 28, 6: 32. Use an alpha of 0.05 to test the claim that the outcomes are not equally likely (Triola 2008). H0: The die rolls are evenly distributed over 1-6. HA: The die rolls are not evenly distributed over 1-6 (i.e. the die is not “fair”). Test Statistic (X2) = 5.86 P-Value = 0.32 Conclusion: Since the P-value is greater than alpha, we cannot reject the null hypothesis. Therefore, there is not enough statistical evidence to support the claim that the die is not fair. 114. The following data lists automobile fatalities by day of week: Sun: 132, Mon: 98, Tue: 95, Wed: 98, Thu: 105, Fri: 133, Sat: 158. Use an alpha of 0.05 to test the claim that the outcomes are not uniformly spread across the days of the week (Triola 2008). H0: Automobile fatalities are evenly distributed over all seven days of the week. HA: Automobile fatalities are not evenly distributed over all seven days of the week. Test Statistic (X2) = 30.02 P-Value = Small # Conclusion: Since the P-value is less than alpha, we can reject the null hypothesis. Therefore, the statistical evidence suggests that driving fatalities are not evenly spread throughout the days of the week. 115. The following data lists the birth months of Oscar-winning actors: Jan: 9, Feb: 5, Mar: 7, Apr: 14, May: 8, Jun: 1, Jul: 7, Aug: 6, Sep: 4, Oct: 5, Nov: 1, Dec: 9. Use an alpha of 0.05 to test the claim that the outcomes are not uniformly spread across the months (Triola 2008). H0: Actor birth months are uniformly spread out over all 12 months. HA: Actor birth months are not uniformly spread out over all 12 months. Test Statistic (X2) = 22.54 P-Value = 0.02 Conclusion: Since the P-value is less than alpha, we can reject the null hypothesis. Therefore, the statistical evidence suggests that actor birth months are NOT evenly spread out throughout the year. 116.
You are planning to open an old-time soda fountain and your partner claims that the public will not prefer any flavor over another. The flavors you serve are cherry, strawberry, orange, lime and grape. After several customers, you stop and take a look at how sales are going. The following numbers of people ordered each flavor: Cherry 35, Strawberry 32, Orange 29, Lime 26 and Grape 25. Test to see if there was a preference at the 0.05 significance level. H0: The customers have no preference for any flavor. HA: The customers have flavor preferences. Test Statistic (X2) = 2.35 P-Value = 0.67 Conclusion: Since the P-value is larger than alpha, we cannot reject the null hypothesis. Therefore, there is not enough statistical evidence that customers prefer some flavors more than others.
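The goodness-of-fit statistics (written X2 above) in Problems 113, 114 and 116 all test observed counts against equal expected counts, so they can be reproduced with one small helper. The following is a minimal sketch in plain Python; the function name `chi_square_uniform` is hypothetical, and the critical value 9.488 (df = 4, alpha = 0.05) is taken from a standard chi-square table rather than computed.

```python
def chi_square_uniform(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    total = sum(observed)
    expected = total / len(observed)  # same expected count in every category
    return sum((o - expected) ** 2 / expected for o in observed)

# Problem 113: weighted die, 200 rolls
die = chi_square_uniform([27, 31, 42, 40, 28, 32])

# Problem 114: automobile fatalities, Sunday through Saturday
fatalities = chi_square_uniform([132, 98, 95, 98, 105, 133, 158])

# Problem 116: soda flavors (cherry, strawberry, orange, lime, grape)
flavors = chi_square_uniform([35, 32, 29, 26, 25])

# Table value (assumption: standard chi-square table, df = 5 - 1 = 4, alpha = 0.05)
CRITICAL_DF4_ALPHA_05 = 9.488
reject_flavors = flavors > CRITICAL_DF4_ALPHA_05

print(f"Die: X2 = {die:.2f}")
print(f"Fatalities: X2 = {fatalities:.2f}")
print(f"Flavors: X2 = {flavors:.2f}, reject H0: {reject_flavors}")
```

Comparing the statistic with a table critical value leads to the same decision as comparing the P-value with alpha; the P-value itself needs the chi-square distribution function, which the Python standard library does not provide.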