Math 3307 Lecture Notes
Perkowsky text, May ’13
Chapters 7 – 8

Homework Assignments
10 points each problem (140 points total)
Chapter 7: 2, 4, 6, 12, 14, 16, 20
Chapter 8: 2, 4, 6, 8, 10, 14, 16

Homework style sheet and rules: Work on one side only; convert it to PDF and upload it before the deadline on the calendar. Work that is poorly scanned or illegible will be given a zero. This includes sideways or upside-down scans! Do NOT crowd the work; leave at least 3” between problems. Label the answers carefully so the grader can grade efficiently.

Chapter 7 – Random Variables and Probability Distributions

7.1 What is a Random Variable?

Technically: a quantitative variable whose value is determined by the outcome of a chance experiment.

The book’s example of a free throw is excellent! It starts on page 183…

Let’s discuss the Classroom Exploration on page 185 together.
Why 6 blue/4 red…why not 5/5?
And let’s check out the probability table! Would you have thought of a grid?
Which type of learner is likely to appreciate which presentation?
Can we turn it into a tree diagram?

NOTE: TI simulation, page 185. This is VERY useful for making up worksheets, quizzes, and tests.

Now let’s turn the information into the most abstract representation of all:

Discrete Random Variable: a finite or countable number of outcomes.
Continuous Random Variable: infinitely many possible values, situated on a number line with no gaps or interruptions.

Which of the following are discrete? Continuous?
The number of eggs received by the shipping department at the local Krogers on a given day.
The number of people marching in the Fourth of July parade in downtown Houston.
The measure of voltage for a smoke detector in your kitchen.
The temperature in Houston.
The exact playing time for a given baseball game.
The number of actors in a randomly selected movie.
The weight of a randomly selected human.

Probability Distribution Table, page 186
Probability Density Table?
Review the rules, page 130!
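The two standard rules for a discrete probability distribution (every probability lies between 0 and 1, and the probabilities add to 1) can be checked mechanically. The sketch below is only an illustration: the function name `is_probability_distribution` and the tolerance are assumptions, the first table uses the free-throw numbers from the page 186 grid discussed in these notes, and the second table is a made-up counterexample.

```python
def is_probability_distribution(table, tol=1e-9):
    """Return True if `table` (outcome -> probability) obeys both rules."""
    probs = list(table.values())
    # Rule 1: each probability is between 0 and 1 inclusive.
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Rule 2: the probabilities add up to 1 (allowing for rounding error).
    sums_to_one = abs(sum(probs) - 1.0) <= tol
    return in_range and sums_to_one


# Free-throw grid: 40 zeros, 24 ones, 36 twos out of 100 outcomes.
free_throws = {0: 0.40, 1: 0.24, 2: 0.36}

# Hypothetical table that fails rule 2 (it sums to 1.2).
bad_table = {0: 0.5, 1: 0.7}

print(is_probability_distribution(free_throws))
print(is_probability_distribution(bad_table))
```

The same check can be applied by hand to any table in this chapter before computing a mean or a variance from it.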
Problem 1
In a drug study, there is a control group and a group of people taking the drug. The drug is to help you have girls for children. These are VERY large groups. Here is a table for the control group. Is it a Probability Distribution Table?

X    P(X)
0    .125
1    .375
2    .375
3    .125

What is X? Are all possibilities covered? How did they get those numbers? Do they add up right?

Check out the example on your own (top of page 187).
Let’s do the Focus on Understanding together.
Read page 188 and we’ll discuss the 3 questions in the box on page 189.

Fair or Unfair? Page 188
Let’s do this problem and save the notes. We’ll come back to it in a bit.
Get in pairs and play the game using your TI as your simulator. Get two random numbers per turn and follow the rules.
Now let’s make a probability distribution table for the game!

7.2 The Mean of a Random Variable

When we discuss the measure of center for a random variable, we’ll call it the expected value, E(X). It’s like a mean but in a different context. It is a long-term average value.

Let’s review the grid on page 186. 100 outcomes:

Outcome    Count
zeros      40
ones       24
twos       36

If you’re Nicky’s coach you’ll have an expectation of how she’ll do in the free throw situation…this expectation is the “mean”.

Of course there’s a formula! (See the box, page 190.)
Multiply each outcome value by the probability that it may happen. Add.

0(0.40) + 1(0.24) + 2(0.36) = 0.96

This is the expected value, in points, for a trip to the line.

Now let’s review: What if Nicky were an 80% shooter?

Focus on Understanding, page 191

EV Problem 1
Years ago, members of organized crime groups ran numbers games. Now such games are legalized. New Jersey’s Pick 3 game works this way: Bet 50 cents and select a three-digit number between 000 and 999. If your 3 digits match the numbers drawn, you win $275.

Prob (winning):        Amount won:
Prob (losing):         Amount lost:

Is this a fair game?

EV Problem 2
The CAN Insurance Company charges Mike $250 for a one-year $100,000 life insurance policy.
Because Mike is a 21-year-old male, there is a 0.9985 probability that he’ll live for that year. What are the outcomes and their probabilities? What are the financial outcomes for each probability? What is the expected value?

The Prime Number Multiplication Game, page 191
Teams!

Hard choices! Page 192
Teams! Try setting it up as a 6 x 6 table or grid!
HINT:

7.3 Variance and Standard Deviation

Measures of spread and variability – we have them in this context, too!

Let’s look at this from a vocabulary standpoint:
Deviation (from the mean): $(x - \mu)$
Squared deviation: $(x - \mu)^2$

Let’s sketch this:

Note that the further from the mean a point is, the bigger the squared deviation!

Now let’s look again at Tasa’s possible earnings for mowing the grass. See page 193.

See page 195 for the formula for variance (remember, standard deviation is the square root of variance!).
Let’s decode it! Calculate each squared deviation, multiply it by its probability…add them up.
See page 194, bottom, for Tasa’s variance.
See page 195 for an alternate version of the formula.

For Option 2B, let’s check out the top of page 196 and do some comparisons.

Now let’s move into Standard Deviation.
What is standard deviation? Let’s go around the room and discuss what it is!
What is the z-score? Again, around the room…

Let’s check out using the TI for finding these numbers: pages 198 – 199.
We’ll do this with the lawn mowing data! And for the sum of two cubes data from the 7.1 experiment.

7.4 Binomial Random Variables

Often we have a situation with repeated identical trials: shooting a free throw (it goes in or it doesn’t), tossing a coin (heads/tails), landing a plane (ok/crash), having a baby (boy/girl), taking a T/F test.

These trials need to be independent of one another! If they are, then we may multiply the individual probabilities for the outcomes.
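To see the multiplication rule for independent trials in action, here is a small sketch that lists every boy/girl birth order for a family and multiplies the per-child probabilities. The function name `girls_distribution` and the assumption that each child is a girl with probability 1/2 are illustrative choices, not taken from the textbook.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def girls_distribution(n_children, p_girl=Fraction(1, 2)):
    """Distribution of the number of girls among n independent births."""
    dist = Counter()
    # Each family is one birth order, e.g. ('B', 'G', 'G').
    for family in product("BG", repeat=n_children):
        prob = Fraction(1)
        for child in family:
            # Independence: multiply the individual probabilities.
            prob *= p_girl if child == "G" else 1 - p_girl
        dist[family.count("G")] += prob
    return dict(dist)


for girls, prob in sorted(girls_distribution(3).items()):
    print(girls, prob, float(prob))
```

For three children the eight birth orders collapse to 1/8, 3/8, 3/8, 1/8, which matches the .125, .375, .375, .125 table from the drug study problem.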
Let’s analyze the standard 2 child family:

The 3 child family:

Note that we will be using COMBINATIONS when we count outcomes. Let’s look at the 3 child family again:

BBB    GGG    2B1G    1B2G

The combination of 3 kids taken 2 at a time:

$_3C_2 = \binom{3}{2} = \frac{3!}{2!\,1!} = 3$

P(2 girls) = 3/8

Summary of Binomial Experiments/Probability: pages 202 and 203. Let’s review it carefully!

The complement rule:
Application of it: page 204, middle, gray box

Are there mean, variance, and standard deviation? Bet your grade on it.
Summary: page 205, bottom, box

TI – let’s learn how to do this efficiently: page 206

Now we know it’s binomial: check those possibilities again with the formula.

In a drug study, there is a control group and a group of people taking the drug. The drug is to help you have girls for children. These are VERY large groups. Here is a table for the control group. Is it a Probability Distribution Table?

X    P(X)
0    .125
1    .375
2    .375
3    .125

What is the Expected Value? Mean? Standard Deviation? What does the histogram look like?

Which of the following are binomial experiments?
Surveying 1000 people and asking them to rate the president on a scale of 1 – 5
Rolling a fair die 50 times
Having kids
Determining whether 12,000 pacemakers are defective or not, one by one
Guessing on a T/F test
Guessing on a test with 5 answer choices per question

Compute the following binomial probabilities:
A. n = 2, x = 0, p = .01
B. n = 10, x = 4, p = .95
C. n = 7, x = 2, p = .35
D. n = 6, x = 4, p = .16

BP Problem 1
Bob is a self-proclaimed mentalist who claims he can read minds. To test this, he is given 14 T/F questions.
A. He gets 8 of them right. What is the expected value and is this unusual?
B. He gets 11 of them right. What is the expected value and is this unusual?
C. He gets 2 of them right. EV is? Is this unusual?

BP Problem 2
There is a 0.723 probability that an airplane will land on time at Hobby. For each part, discuss whether the result would be considered unusual or normal.
A. Find the probability that at least 5 out of 6 airplanes arrive on time in a given period of time.
B. Find the probability that at most 2 airplanes arrive on time in a given period of time.
C. Find the probability that exactly 3 airplanes land on time in a given period of time.

BP Problem 3
Internal surveys show that directory assistance providers give the wrong number 15% of the time. Assume you are testing a provider by making 10 requests. Assume further that this is a very average company and gives wrong answers 15% of the time.
Find the probability of getting one wrong answer. Is this unusual?
Find the probability of getting at most one wrong answer. Is this unusual?
Is the probability really 15% for this company?

BP Problem 4
A study was conducted to determine whether there were significant differences between medical students admitted through special programs and medical students admitted through the regular admissions criteria. It is claimed that the graduation rate for the students admitted through the special programs is 94%.
If 10 students from the special programs are randomly selected, find the probability that at least 9 of them graduated.
Would it be unusual to randomly select 10 and find that 7 graduated? Why or why not?

7.5 The Normal Curve

The standard normal curve: the bell curve; symmetric, mound shaped, continuous.

Let’s discuss continuous versus discrete.

For the standard normal curve the mean is zero and the standard deviation is 1. It is symmetric about z = 0…not x? Why not?

Probabilities correspond to area under the curve.

Let’s review the Empirical Rule (p. 71) right now with a picture:

Now let’s look at the standard normal probability table.
Given a z-score of 1.28, what is the probability that a measurement is at or below this value? Page 210

Now for using the chart with “greater than or equal to”…a version of the complement rule!
Or between two measurements!
Using the table in reverse: from a probability to a z-score. Page 212

In reality, MOST normal curves are NOT standard! How do we rescale to make use of our standard normal chart? With z-scores! All normal curves are proportional and we use the z-score calculation to make them “fit” the table. Page 213

Focus on Understanding: page 215

Using the TI to do this, chart-free! Pages 216 – 218

From another source – TI-83 instructions for areas between two bounds:
2nd VARS [2: normalcdf(left z-score, right z-score)]

Normal Distributions:
The Precision Scientific Instrument Company manufactures thermometers. To check the accuracy, they test the thermometers in freezing water and make sure each registers 0 degrees C. Of course some are high and some are low. Assume there is a standard deviation of 1 degree C. Find the area and show it on a standard normal curve!

What is the probability that the reading is less than 1.58°? You should get 94.29%.
What is the probability that the reading is above −1.23°? You should get .8907.
What is the probability that the reading is between −2° and 1.5°? You should get 91.04%.

Working backwards in the chart:
Find the temperature associated with the 95th percentile. z = 1.645. How does this work?
Find the temperatures separating the bottom 2.5% and the top 2.5%. These are called tolerances (−1.96 and 1.96 for z’s). How does this work?

Fill in the blanks:
About _________% of the area is within 1 standard deviation of the mean
About _________% of the area is within 2 standard deviations of the mean
About _________% of the area is within 3 standard deviations of the mean

Find the probabilities:
P(z 1.645)    P(z 2.575)    P(1.96 z 2.33)

Find the following percentiles:
P95    P75    P50    P35

Enrichment:
[Figure: graph of a probability distribution with height c; horizontal axis marked at 5 and 10]
Here is a probability distribution. Find the value of c.
Find the probability that x is between 0 and 3.
Find the probability that x is between 2 and 9.
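The thermometer readings above (mean 0, standard deviation 1) can also be checked without the chart or the TI. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the printed values should agree with the chart answers up to rounding.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# "Less than": area to the left of 1.58.
p_below = z.cdf(1.58)

# "Above": complement rule, 1 minus the area to the left of -1.23.
p_above = 1 - z.cdf(-1.23)

# "Between": area to the left of 1.5 minus area to the left of -2.
p_between = z.cdf(1.5) - z.cdf(-2)

# Working backwards: the z-score with 95% of the area to its left.
z_95 = z.inv_cdf(0.95)

# Tolerances: bottom 2.5% and top 2.5%.
z_low, z_high = z.inv_cdf(0.025), z.inv_cdf(0.975)

print(round(p_below, 4), round(p_above, 4), round(p_between, 4))
print(round(z_95, 3), round(z_low, 2), round(z_high, 2))
```

For a non-standard normal curve, passing the actual mean and standard deviation to `NormalDist` does the z-score rescaling internally.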
ND Problem 1
Air Force ejection seats are designed for people weighing between 140 lb and 211 lb. Women’s weights are normally distributed with a mean of 143 lb and a standard deviation of 29 lb. What percentage of women have weights in those limits?

ND Problem 2
The airline industry wants the passenger seats to fit 98% of all males flying. Men have hip widths that are normally distributed with a mean of 14.4 inches and a standard deviation of 1 inch. Find P98 and the associated seat width.
What is the formula for standard deviation?
$x = \mu + z\sigma$

ND Problem 3
The lengths of pregnancies are normally distributed with a mean of 268 days and a standard deviation of 15 days. A woman wrote to Dear Abby claiming that she gave birth 308 days after a brief visit with her husband, who was in the Navy and otherwise away at sea. Is this credible?
Premature is being born in the 4th percentile of length…what length of time is this?
Can you figure out how we could use this fact to help hospital administrators?

7.6 Normal Approximations

Sometimes, when everything is right, you may approximate a binomial distribution as a normal distribution and use the far easier calculations for the normal distribution.

What is “everything is right”?
1. The binomial distribution is “smooth”, not “chunky”.
2. The binomial distribution is symmetric, not skewed.
3. The number of data points times the minimum of (p, q) is greater than 5: n · min(p, q) > 5.

If you’ve got these three things you are good to approximate.

Now, we can look at the TI way to do this (page 221, bottom).
Let’s compare with another way on page 222.

We’ll use the “continuity correction” (page 223) with abandon! It’s a sort of “split the difference” way to manage the discrete nature of real binomial data!
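As a rough sketch of how the continuity correction works, the code below compares an exact binomial tail probability with its normal approximation. The numbers n = 200, p = 0.5, and "at least 120" are the same setup as the airliner example later in these notes; the function names are illustrative assumptions, and this is not the textbook's own procedure.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_at_least(n, p, k):
    """Exact binomial probability of k or more successes in n trials."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def normal_approx_at_least(n, p, k):
    """Normal approximation to P(X >= k) with the continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # "At least k" means the area to the right of k - 0.5.
    return 1 - NormalDist(mu, sigma).cdf(k - 0.5)


n, p, k = 200, 0.5, 120

# Rule-of-thumb check from the list above: n * min(p, q) > 5.
assert n * min(p, 1 - p) > 5

exact = binom_at_least(n, p, k)
approx = normal_approx_at_least(n, p, k)
print(f"exact = {exact:.5f}, normal approximation = {approx:.5f}")
```

Dropping the 0.5 shift (using `cdf(k)` instead) gives a noticeably worse match to the exact value, which is the point of "splitting the difference".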
Let’s go through the calculations on pages 224 – 226.

Now let’s look at “tossing tacks”, page 227.

Approximating normal
When an airliner is loaded with passengers, baggage, and cargo plus fuel, the pilot must verify that the gross weight is below a maximum and that the weight is properly distributed for safety. An airline has established a procedure in which extra cargo must be eliminated whenever a 200 person plane has at least 120 men. Assume that the population is 50/50 men and women.

Check to make sure we can approximate:
Get the mean and standard deviation:
Mean:
Sigma: did you get 7.0710678?
Continuity Correction: 119.5 to 120.5

We want “at least 120 men”…120 and to the RIGHT…sketch this!
Now find the area that is shaded. z = 2.76
What is the probability? Do we need to worry much about this?

Using continuity corrections:

Wording             Area
At least 120        to the right of 119.5
More than 120       to the right of 120.5
At most 120         to the left of 120.5
Fewer than 120      to the left of 119.5
Exactly 120         between 119.5 and 120.5

AN Problem 1
In a study of 420,000 cell phone users in Denmark, it was found that 135 developed brain cancer. Assuming cell phones have no effect, there is a 0.000340 probability of a person developing brain cancer. We would, then, expect 143 cases among 420,000 randomly chosen people. Estimate the probability of 135 or fewer cases of such cancer in the randomly chosen population. What do these results suggest about media reports that cell phones cause brain cancer?

AN Problem 2
After being rejected for employment, Ms. Kim learns that this company has hired only 21 women applicants among its 62 new employees. She also learns that the pool of applicants is very large with equal numbers of qualified men and women. The company claims no unfair discrimination in hiring. Kim feels differently. Run the numbers and decide how you feel.

AN Problem 3
45% of humans have Type O blood.
A hospital is running low on Type O blood and runs a blood drive…it needs 177 units of this type of blood. Assume 1 unit per donor. If 400 volunteers show up, what is the probability that at least 177 of them will have Type O blood? Are the 400 volunteers enough?

Chapter 8 – Distributions from Random Samples

8.1 Random Sampling

Let’s go with the book’s comment about defining “random” by what it’s NOT: systematic, logical, having a clear pattern or order.

In statistics, random has to do with the process of picking a sample – each element in the population has an equal chance of being chosen.

Let’s look at Classroom Exploration 8.1, page 235.
Let’s read it – will there be repetitions in the scenario?

Plan A: 24 cards, one name per card
Plan B: roll a die – the number on top is the row number

Plan A: how many possible samples? Equally likely?
Plan B: how many possible samples? Equally likely?

Question 3 and Question 4
Picking Amy?

Let’s now read page 237 at the top: an excerpt…

Note that in this part of the class we are doing inferential statistics – we want to infer some conclusion about the population from our work…and we want to quantify how reliable this conclusion is.

Now let’s read the Focus on Understanding project that starts on page 237…and check out the results from doing it on page 240.
What do you notice about the dot plots?
What can you conclude about small samples vs bigger samples?
Note that we look at a range of values for the mean – why do we do this? What are we trying to ensure by doing this?
See the middle of page 241 for a discussion of these ideas.

8.2 The Distribution of Sample Means

The mean of a random sample is an estimator of the true population mean. It can be a good estimate or a poor estimate. We want to ensure that it’s a good one! How can we do this?

A. We want the mean to be unbiased.
We can check this by finding the expected mean of the SAMPLE means.
If the expected mean is the true mean, then the sample mean is unbiased. Operationally, the more perfectly random your samples, the more unbiased your sample means are.

B. We want a large sample size, not a small one.
Operationally, n = 30 is the usual minimum sample size, but more is better if you can afford it!

When we have these, then the distribution of sample means is normally distributed about the true mean, $\mu$.

This is so important! And took so long to discover!

Page 249, The Central Limit Theorem:
Regardless of the distribution of the population being sampled, the distribution of sample means taken from random samples of size n is approximately normally distributed when n is large.

See the caution on page 240 at the bottom of the last paragraph.

The mean of the sample means is the true population mean, and the standard deviation of the sample means is the population standard deviation divided by the square root of n:

$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Let’s discuss that standard deviation:
Suppose n is small
Suppose n is large

Now compare the two dot diagrams on page 240 again.

So now, suppose we have 50 samples (random!) and we calculate the mean of each. We then have a list of sample means as our data. We find the mean of these sample means and the standard deviation of these sample means.

What do we know about the original population? We know the means are the same and we can multiply our standard deviation to get the original population standard deviation. Do you see how?

What DON’T we know? The shape of the original distribution!

Let’s look at the example on page 250: back to Nicky’s free throws! Recall her distribution (page 250 – mean is .96). Now we’ll look at a simulation of size 50.

Let’s go through the calculations to find the mean and standard deviation for the distribution of the sample means. How do you find the mean and the standard deviation? What are the formulas? WHERE are the formulas in the textbook?
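As a hedged sketch of those calculations, the code below starts from the free-throw distribution used earlier in these notes (40% zeros, 24% ones, 36% twos), computes its mean and standard deviation, and then applies the sample-mean formulas for n = 50. The error bound of 0.1 and all variable names are illustrative assumptions, and the normal curve for the sample means is the Central Limit Theorem approximation, not an exact result.

```python
from math import sqrt
from statistics import NormalDist

# Free-throw distribution: points per trip to the line -> probability.
dist = {0: 0.40, 1: 0.24, 2: 0.36}

# Population mean: multiply each outcome by its probability and add.
mu = sum(x * p for x, p in dist.items())

# Population variance: squared deviations weighted by probability.
var = sum((x - mu) ** 2 * p for x, p in dist.items())
sigma = sqrt(var)

# Distribution of sample means for samples of size n.
n = 50
mu_xbar = mu                 # same mean as the population
sigma_xbar = sigma / sqrt(n)  # population sd divided by sqrt(n)

# Approximate probability that a sample mean lands within 0.1 of mu.
sample_means = NormalDist(mu_xbar, sigma_xbar)
error_bound = 0.1
p_within = sample_means.cdf(mu + error_bound) - sample_means.cdf(mu - error_bound)

print(f"mu = {mu:.2f}, sigma = {sigma:.4f}")
print(f"sd of sample means = {sigma_xbar:.4f}, P(within 0.1) = {p_within:.3f}")
```

Rerunning with a larger `n` shrinks `sigma_xbar` and raises `p_within`, which is the sense in which bigger samples give more reliable estimates.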
Now let’s walk through Lauren’s simulation of doing 50 free throws and calculate the probability that Lauren’s sample mean will be within .1 of the actual mean. See page 251.

Suppose we do this 4 times and take the AVERAGE mean from those 4 attempts…will this be more accurate than doing it just once? Why or why not?

What we are doing here with that “0.1” is finding an error bound or margin of error. The probability that our estimate is within the given error bound is what we calculated in this example. The probability is called the “confidence level” of our estimate. The confidence level of an estimate goes up as n increases.

Let’s review our procedure from a Big Picture viewpoint.
We got our sample and calculated the mean.
We then went to z-scores* to find the probability “between”.
We used Table 1 or our calculators to get the probability.
We described our confidence level in our estimate.

*and we used the distribution of the SAMPLE MEANS, not the original distribution, in our calculations!

SD Problem 1
A company that specializes in data analysis tests all its applicants for employment by having them solve three short problems that are indicative of the type of work they will be required to perform. An applicant is given a score from 0 to 10 for each problem. From the performances of previous applicants, the sampling distribution of mean scores has been found to be as shown in the table below. Sketch this distribution on the right:

Mean    Prob
0       .001
1       .005
2       .010
3       .045
4       .060
5       .100
6       .150
7       .350
8       .200
9       .070
10      .009

Use your calculator to find the mean (6.570) and standard deviation (1.63).
Page numbers for formulas:
Check the Empirical Rule on your distribution.
What is the z-score for 8?

SD Problem 2
The number of patients admitted per day to a medium-sized regional hospital averages 35 with a standard deviation of 10.
If, on a given day, there are 60 beds available for new patients, do you think the hospital will have to divert emergency vehicles to another hospital?

SD Problem 3
The sampling distribution of X, the number of people who arrive at a cashier’s counter in a bank per minute, is given below:

X    P(X)
0    .36
1    .38
2    .18
3    .06
4    .02

Verify the Empirical Rule.

8.3 The Distribution of Sample Proportions

Proportions have a place in statistics. We often use a sample proportion from a random sample to estimate the true proportion of a population that has a specific property.

Let’s look at Classroom Exploration 8.3 on page 253…
Let’s look at “drawing more blocks”, page 254…
And look at the proportion on the bottom of page 254 to see how this differs a bit from a sample mean.

Class discussion: What are the differences?

“Hat” or caret notation is discussed on page 255 at the top…we have special notation to use when we are talking about a sample proportion: $\hat{p}$

The expected value of “p-hat”: page 256
The standard deviation of “p-hat”: page 258
The distribution – no surprises here!

SP Problem 1
Suppose a warship takes 6 shots at a target, and it takes at least 4 hits to sink the target. If the warship has a record of hitting with 20% of its shots in the long run, what is the probability of sinking the target?
Is this binomial?
Sketch the distribution…make a table first.
Answer the question.

SP Problem 2
Let’s consider the 107th Congress: there are 100 senators (2 per state). At that time, there were 87 males and 13 females. What is the population proportion of each type of senator?

Suppose we take random samples of size 10.

S1    MFMMFMMMMM
S2    MFMMMMMMMM
S3    MMMMMMFMMM
S4    MMMMMMMMMM
S5    MMMMMMMMFM

Calculate the sample proportions.
Now suppose we go on and do 95 more samples, resulting in the following table. Sketch the frequency table:

Prop F    Freq
0.0       26
0.1       41
0.2       24
0.3       7
0.4       1
0.5       1

Check that the mean is 0.119 and the standard deviation is 0.100.

What would the frequency table look like if we did this 10,000 times?

SD Problem 3
Here is the population of all 5 US Presidents who had professions in the military, along with their ages at inauguration:

President     Age
Eisenhower    62
Grant         46
Harrison      68
Taylor        64
Washington    57

Assume that samples of size 2 are randomly selected WITH REPLACEMENT.
How many samples are possible?
What is the mean of each sample?
Make a frequency table for these means…is this a sampling distribution?
What is the distribution for these means?
What is the mean of the table?
How does this compare with the actual mean of the presidents?
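One way to check the hand work on the presidents problem is to list every ordered sample of size 2 drawn with replacement and average each one. The sketch below does this with exact fractions; treating ordered pairs such as (Grant, Taylor) and (Taylor, Grant) as different samples is an assumption of this sketch, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Ages at inauguration, taken from the table in the problem.
ages = {"Eisenhower": 62, "Grant": 46, "Harrison": 68, "Taylor": 64, "Washington": 57}

# Every ordered sample of size 2, drawn WITH replacement.
samples = list(product(ages.values(), repeat=2))

# The mean of each sample, kept as an exact fraction.
sample_means = [Fraction(a + b, 2) for a, b in samples]

# Frequency table of the sample means.
freq = Counter(sample_means)

# Mean of the sample means versus the population mean.
mean_of_means = sum(sample_means) / len(sample_means)
population_mean = Fraction(sum(ages.values()), len(ages))

print("number of samples:", len(samples))
for value, count in sorted(freq.items()):
    print(float(value), count)
print("mean of sample means:", float(mean_of_means))
print("population mean:", float(population_mean))
```

The two means printed at the end agree, which illustrates the unbiasedness idea from 8.2 on a population small enough to enumerate completely.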