Statistical patterns
Business Statistics 41000, Fall 2015

Topics
1. Probability rules
2. Random variables and distributions
3. Expected value
4. Using statistics to make decisions

Statistical pattern
A statistical pattern is a pattern which holds only approximately. For example, most professional basketball players are tall. Being tall is neither necessary nor sufficient, but as a generality, NBA players are tall. This generality may not hold for any given player, but as a statement about the aggregate population of NBA players it is valid, statistically speaking.

Topic: probability
Probability is a language for talking and thinking about such statistical patterns. The key idea is to assign a number between 0 and 1 to each event, reflecting how likely that event is to occur. The language has three rules:
1. If an event A is certain to occur, it has probability 1, denoted P(A) = 1.
2. If two events A and B are mutually exclusive (both cannot occur simultaneously), then P(A or B) = P(A) + P(B).
3. P(not-A) = 1 - P(A).
See OpenIntro sections 2.1 and 2.6.1.

Key example: random draws from a database
A critical example moving forward will be the idea of randomly selecting items from a database of, say:
- Customers at a grocery store with purchase histories.
- Stock price histories for publicly traded firms.
- Patients at a doctor's office along with symptoms and diagnoses.
- Anything of interest that one might measure and record.
This framework allows us to talk about the proportion of items in the database satisfying a given property.

Probability is basically just fractions
The following statements are equivalent:
- "25% of Dr. Smith's patients got a flu shot."
- "1 in 4 of Dr. Smith's patients received a flu shot."
- "The probability that a randomly selected patient from Dr. Smith's patient database got a flu shot is 1/4."
- "The probability you had a flu shot if you were a patient of Dr. Smith's is 0.25."

Classic example: fair six-sided die
The possible outcomes are {1, 2, 3, 4, 5, 6} or {one, two, three, four, five, six} according to our need. This is called our sample space. Each of these outcomes has probability 1/6.

Event   Probability
one     1/6
two     1/6
three   1/6
four    1/6
five    1/6
six     1/6

Example: fair six-sided die (cont'd)
We can calculate the probabilities of compound events, for example:

Event                      Satisfying outcomes           Probability
Starts with a consonant.   two, three, four, five, six   5/6
Divides evenly by 3.       3, 6                          2/6 = 1/3
Has three letters.         one, two, six                 3/6 = 1/2
Greater than 2.            3, 4, 5, 6                    4/6 = 2/3

Famous example: Monty Hall problem
You're on a game show and are given a choice of three doors: behind one door is a new car; behind the others, goats. You pick a door. What is your probability of winning?
Let A = "door you picked has the car behind it". Because there are three doors, only one of which has the car, we say P(A) = 1/3. This gives equal weight to the hypothetical worlds where the car is behind door one, door two, or door three.

Example: Monty Hall problem (cont'd)
Now the twist. The host, who knows what's behind the doors, opens one of the other two doors to reveal a goat. He asks, "Would you like to switch doors?" Which strategy do you choose?

Example: Monty Hall problem (cont'd)
Under the "always switch" policy, all of the instances where you would have won, you lose, and vice versa. So if we switch, we win whenever not-A occurs. We conclude that:
P(win if stay) = P(A) = 1/3, so P(win if switch) = P(not-A) = 1 - P(A) = 2/3.

Visualizing probability
It may be helpful to think about probability visually, as an area: the larger the area, the higher the probability. [Figure: two regions labeled A and B.] You can think about the random process in terms of throwing darts at a painted target.

Visualizing probability (cont'd)
Imagine picking a vacation destination by throwing darts at a map. We'd likely end up somewhere rural, west of the Mississippi river.
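The 2/3 answer is easy to check empirically. Here is a minimal simulation sketch (not part of the original slides) that plays the game many times under both strategies:

```python
import random

def play(switch: bool) -> bool:
    """Play one round of Monty Hall; return True if the player wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining unopened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

random.seed(41000)
n = 100_000
stay = sum(play(switch=False) for _ in range(n)) / n
swap = sum(play(switch=True) for _ in range(n)) / n
print(f"P(win if stay)   ~ {stay:.3f}")   # close to 1/3
print(f"P(win if switch) ~ {swap:.3f}")   # close to 2/3
```

The simulated frequencies settle near 1/3 and 2/3, matching the argument above.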
Definition: the overlap equation
P(A or B) = P(A) + P(B) - P(A & B).
[Figure: two overlapping regions A and B.] This formula makes sure we don't double count the overlap region. What does it mean if P(A & B) = 0?

Idea: new events from old
We can describe the yellow region in terms of the events A and B as Y = not-(A or B). [Figure: the region outside both A and B, shaded yellow.] From this we can determine that
P(Y) = 1 - {P(A) + P(B) - P(A & B)}.

Idea: new events from old (cont'd)
This yellow region can be expressed as Y = A & not-B. [Figure: the part of A outside B, shaded yellow.] And we can find that
P(Y) = P(A) - P(A & B).

Remark: nothing special about rectangles
All the same rules apply. [Figure: two irregularly shaped regions A and B.] Similarly, what might "A" and "B" stand for? Why might knowing these rules be useful?

Example: the "Linda" problem
Linda is 31 years old, single, outspoken, and smart. She was a philosophy major. As a student, she was an ardent supporter of Native American rights, and she picketed a department store that had no facilities for nursing mothers.
Rank the following statements in order of probability from 1 (most probable) to 6 (least probable).
a. Linda is an active feminist.
b. Linda is a bank teller.
c. Linda works in a small bookstore.
d. Linda is a bank teller and an active feminist.
e. Linda is a bank teller and an active feminist who takes yoga classes.
f. Linda works in a small bookstore and is an active feminist who takes yoga classes.

Example: the "Linda" problem (cont'd)
Most respondents don't realize that the probability of a conjunction of two events is less than or equal to the probability of each of the individual events:
P(A & B) ≤ P(A) and P(A & B) ≤ P(B).
For example, we can determine that P(e) ≤ P(a) and P(f) ≤ P(c), irrespective of the background information we have on Linda.
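The overlap equation can be verified with the dart-throwing picture directly. The sketch below (an illustration, not from the slides) throws random "darts" at the unit square, with A and B as two overlapping rectangles of area 0.25 each:

```python
import random

random.seed(0)
n = 200_000
in_A = in_B = in_both = in_either = 0
for _ in range(n):
    x, y = random.random(), random.random()
    a = (0.1 <= x <= 0.6) and (0.1 <= y <= 0.6)  # region A, area 0.25
    b = (0.4 <= x <= 0.9) and (0.4 <= y <= 0.9)  # region B, area 0.25; overlap area 0.04
    in_A += a
    in_B += b
    in_both += a and b
    in_either += a or b

lhs = in_either / n                           # estimate of P(A or B)
rhs = in_A / n + in_B / n - in_both / n       # P(A) + P(B) - P(A & B)
print(lhs, rhs)  # both near 0.46 = 0.25 + 0.25 - 0.04
```

The two sides agree exactly dart by dart (counting either region once), and both estimate the true union area 0.46.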
Idea: decomposing events into disjoint (mutually exclusive) pieces
Why must P(d) ≤ P(a)?
a. Linda is an active feminist.
b. Linda is a bank teller.
d. Linda is a bank teller and an active feminist.
We can think of a as being the union of two disjoint groups: feminists who are bank tellers and feminists who are not bank tellers. We find that
P(a) = P(a & b) + P(a & not-b) ≥ P(a & b).

Idea: decomposing events into disjoint (mutually exclusive) pieces (cont'd)
Let a = "feminists" and b = "bank tellers". [Figure: region a overlapping region b, with the intersection a & b cross-hatched.] The blue region is non-bank-teller feminists and the cross-hatched region is bank-teller feminists. They are both sub-regions of the event a = "feminists" and so necessarily have a smaller area.

Definition: the Law of Total Probability
P(A) = P(A & B) + P(A & not-B).
If the event A can happen in several mutually exclusive ways, to find the overall probability of A we add up the ways.

Definition: the Law of Total Probability (cont'd)
We can extend this idea to more than two disjoint events. If A can happen in n mutually exclusive ways we can write
P(A) = P(A & B_1) + ... + P(A & B_n) = \sum_{j=1}^{n} P(A & B_j).
This gives a slogan for the LoTP: overall probability is "a sum of separate ways".

Example: different colored coupes
For example, let A = "is a two-door vehicle" and let the B_j denote different colors: B_1 = "red", B_2 = "blue", etc. The overall probability of two-door cars can be expressed as:
P(A) = P(A & B_1) + ... + P(A & B_n) = \sum_{j=1}^{n} P(A & B_j).
How many total different colors of car does this equation imply?

Jargon: odds vs. probability
The odds of an event are related to, but distinct from, the probability of the event. The "odds in favor of event A" are defined as
P(A) / P(not-A) = P(A) / (1 - P(A)).
The "odds against A" are defined as
P(not-A) / P(A) = (1 - P(A)) / P(A).

Jargon: odds vs. probability (cont'd)
In a gambling setting, odds are given as odds against.
So if A = "Polson-Pony wins the derby" with P(A) = 0.20, the corresponding odds would be
P(not-A) / P(A) = (1 - P(A)) / P(A) = 0.8 / 0.2,
or 4:1. For bookies to make money, the stated odds are typically not the actual probabilities (or even their best guess of them).

Example: dutch books
Your buddy wants to bet on the Bulls-Pacers game. He sets his odds by judging that each team's probability of winning is equal to its current winning percentage: 0.50 (6-6) for the Bulls and 0.417 (5-7) for the Pacers. So he gives odds of 1:1 and 7:5, respectively.
You don't even follow the NBA, but you jump at the chance, placing two bets with him: $50 on the Pacers and $60 on the Bulls.

Event        Bulls bet   Pacers bet   Total profit
Bulls win    +$60        -$50         +$10
Pacers win   -$60        +$70         +$10

No matter what happens, you are guaranteed to take his money! What rule did he violate?

Topic: random variables
A random variable refers to situations where the "event" in question is a numerical measurement; e.g. the number of annual office visits a patient makes, the dollar amount a customer spends, the height of a professional athlete. More formally, a random variable assigns a number to each outcome in a sample space.
The simplest example is the venerable coin toss. The outcomes are HEAD or TAIL. We might "code" this as HEAD = 1 and TAIL = 0. See OpenIntro sections 2.4 and 2.6.4.

Jargon: dummy variable
A random variable that assigns a 1 when some event occurs and a 0 otherwise is called a dummy variable. It is called this because the number 1 "stands in" for the event.

Event   Value   Probability
HEAD    1       1/2
TAIL    0       1/2

As before, we assign each outcome a probability; together these assignments constitute the distribution of the random variable. It describes how the total probability mass is distributed across the various outcomes.

Example: Bernoulli random variable
The distribution of a dummy variable can be described with a single number. (Why only one and not two?) We refer to any such variable as a Bernoulli random variable with parameter p.

Event         X   Probability
Obama wins    1   p
Romney wins   0   1 - p

(It is standard to use the letter p in this context, but it is just a place-holder and any other letter would do as well: q or a, or maybe η or ξ if you prefer Greek.)

Definition: Bernoulli RVs

Event   x   P(X = x)
A       1   p
not-A   0   1 - p

For any parameter p between 0 and 1 this describes a valid probability distribution. It may seem natural to call p a "variable", but we avoid the urge because it collides with the "random variable" terminology; we call it a "parameter" instead.

Example: uniform multiple outcomes
More generally there can be any number of outcomes. Recall our fair die example.

x   p(x) = P(X = x)
1   1/6
2   1/6
3   1/6
4   1/6
5   1/6
6   1/6

Random variables are commonly denoted by capital letters as shorthand for the whole list of possible outcomes. Individual outcomes are referred to by the same letter, but in lower case.

Example: general discrete distribution
Of course, the probabilities need not be equal.

x     P(X = x)
-10   0.02
√2    0.30
20    0.07
40    0.61

The outcomes can be positive or negative, integers or real numbers, etc.

Remark: why RVs?
The probability framework does not require the outcomes to be numbers. But working with numerical outcomes is useful:
- Many outcomes we care about are already numerical: prices, temperatures, distances, etc.
- Even if the outcomes are qualitative, our eventual analysis often assigns numerical costs to these outcomes.
- Defining compound events is natural when the outcomes are orderable, e.g. P(X ≤ 4) or P(2 ≤ X ≤ 4) or P(10 ≤ X).
- Because numerical outcomes can be ordered, plotting distributions is possible.

Plotting distributions
The rationale behind the term "distribution" makes good sense pictorially. [Figure: bar plot of probability versus value for a four-outcome distribution.]

Plotting distributions (cont'd)
Plotting is especially helpful when the variable can take many different values, making tables cumbersome.
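To connect a distribution table back to the "random draws" idea, we can sample from a discrete distribution and watch the empirical frequencies track the stated probabilities. A sketch (the four-outcome example above, not from the original deck):

```python
import random
from collections import Counter

# The general discrete distribution from the table above.
outcomes = [-10, 2 ** 0.5, 20, 40]   # 2 ** 0.5 is the sqrt(2) outcome
probs    = [0.02, 0.30, 0.07, 0.61]

random.seed(1)
draws = random.choices(outcomes, weights=probs, k=100_000)
freq = Counter(draws)

for x, p in zip(outcomes, probs):
    # Empirical frequency should be close to the stated probability.
    print(f"x = {x:>6.3f}  stated p = {p:.2f}  observed = {freq[x] / len(draws):.3f}")
```

With 100,000 draws each observed frequency lands within a few thousandths of the stated probability.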
[Figure: bar plot of a distribution over the values 1 through 30.]

Example: at-bat outcomes
We can associate a 0, 1, 2, 3, or 4 with the outcomes of a baseball at-bat.

Event      x   P(X = x)
Out        0   0.820
Base hit   1   0.115
Double     2   0.033
Triple     3   0.008
Home run   4   0.024

Example: at-bat outcomes (cont'd)
We can plot this distribution. [Figure: "At-bat results", probability versus bases attained, 0 through 4.]

Example: medical expenditures
We can "bin" household medical expenditures and think of the distribution over medical expenses.

Event                           x       P(X = x) × 10,000
Between $0 and $100             $50     2,600
Between $100 and $1000          $550    3,300
Between $1000 and $5000         $3K     2,500
Between $5000 and $10,000       $7.5K   800
Between $10,000 and $20,000     $15K    500
Between $20,000 and $30,000     $25K    200
Between $30,000 and $40,000     $35K    60
Between $40,000 and $50,000     $45K    30
Between $50,000 and $100,000    $75K    7
Between $100,000 and $600,000   $350K   3

Example: daily high temps for Chicago
The Midway weather station has records going back to 1929. [Figure: "Chicago Daily High Temps", probability versus temperature in degrees Fahrenheit.]

Example: height of NBA players
NBA players tend to be very tall. Can we say more than that? [Figure: "NBA heights", probability versus height in inches, 60 through 100.]

Jargon: measures of central tendency
Although random variables can take many different values, it is helpful to think of them as having a "tendency", or a general location. So, not all basketball players are tall, but most are. Can we be more precise than this?
A not-so-common example would be the mid-range: the point halfway between the maximum and minimum values. Why might the mid-range not be very informative?
We will focus on three common measures of central tendency, in turn: mean, median, and mode.

Definition: mean
By far the most common measure of central tendency is the mean.
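Before summarizing a distribution, it is worth checking that a table like the at-bat one actually is a distribution. A small sketch (illustrative, not from the slides): probabilities must lie in [0, 1] and sum to 1.

```python
# The at-bat distribution from the table above, as a value -> probability map.
at_bat = {0: 0.820, 1: 0.115, 2: 0.033, 3: 0.008, 4: 0.024}

def is_valid_distribution(p: dict, tol: float = 1e-9) -> bool:
    """Check the two requirements for a discrete distribution:
    every probability is in [0, 1] and the total mass is 1."""
    return (all(0 <= v <= 1 for v in p.values())
            and abs(sum(p.values()) - 1) < tol)

print(is_valid_distribution(at_bat))            # True
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # False: mass sums to 1.1
```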
Mean: the mean of a random variable X is defined as
E(X) = \sum_{j=1}^{J} x_j P(X = x_j).
The mean is also called the expectation, expected value, arithmetic average, or first moment.

Example: mean of a Bernoulli RV
We can use the definition to compute the mean of a Bernoulli random variable in terms of its probability parameter p:
E(X) = 0(1 - p) + 1(p) = p.
This is the advantage of coding dummy variables as 0 and 1 instead of other arbitrary numbers.

Example: at-bats
For our "bases" random variable we can calculate the mean as:
E(X) = \sum_{j=1}^{J} x_j P(X = x_j) = 0(0.820) + 1(0.115) + 2(0.033) + 3(0.008) + 4(0.024) = 0.301.

Example: medical expenditures
For our medical costs random variable we can calculate the mean as:
E(X) = 50(0.26) + 550(0.33) + 3000(0.25) + 7500(0.08) + 15000(0.05) + 25000(0.02) + 35000(0.006) + 45000(0.003) + 75000(0.0007) + 350000(0.0003) = 3297.

Example: NBA heights
The calculation is too long to show (but easy with a computer), so we show it pictorially. [Figure: NBA heights distribution with the mean marked.] The mean is approximately 79 inches (6 foot 7 inches)!

Example: temps
Similarly, the mean daily high temperature in Chicago is about 59 degrees. [Figure: Chicago daily high temps with the mean marked.]

Mental image: balancing point
You can think of means as "balancing points": the distribution balances at its mean, as if the probabilities were weights along a beam. [Figure: weights of 0.5, 0.35, and 0.15 at the values 0, 72, and 900 balance at the mean, 160.2.] In fact, this is where the term "moment" comes from, by analogy with the physics terminology.

Definition: median
Informally, the median of a random variable is a value where it is just as likely to see a value below it as above it.
Median: a random variable X has median m if
P(X ≤ m) ≥ 1/2 and P(X ≥ m) ≥ 1/2.
The non-strict inequalities ("less than OR equal to") are important here. There can be more than one median!

Example: bases
To find a median, we can sum up probabilities of outcomes, from smallest to largest, stopping once we get over 1/2.

Event      x   P(X = x)
Out        0   820/1000
Base hit   1   115/1000
Double     2   33/1000
Triple     3   8/1000
Home run   4   24/1000

In this case 0 is the median: P(X ≤ 0) = 0.82 and P(X ≥ 0) = 1.

Example: Bernoulli RV
The median of a Bernoulli random variable can be written in terms of p:
m(p) = 0 if p ≤ 1/2, and m(p) = 1 if p > 1/2.

Examples: weather, NBA height, and medical costs
For our other three examples we find:
- P(high temp ≤ 60) = 0.5068 and P(high temp ≥ 60) = 0.5057.
- P(height ≤ 79) = 0.535 and P(height ≥ 79) = 0.547.
- P(med. costs ≤ $550) = 0.59 and P(med. costs ≥ $550) = 0.74.

Definition: mode
The mode of a distribution is its most likely value.
Mode: for random variable X, m is a mode of its distribution if P(X = m) ≥ P(X = m') for all m' ≠ m.
As with the median, there can be multiple such values.

Remark: local versus global modes
The mode refers to the globally most likely value. But for distributions with many possible outcomes, we sometimes refer to "local" modes: isolated peaks of the distribution plot. [Figure: Chicago daily high temps.] Here the global mode is 81 degrees, but there is a second, local mode at 37 degrees. We say that this distribution is multimodal.

Remark: local versus global modes (cont'd)
A distribution with a single mode, like the NBA heights, is said to be unimodal. [Figure: NBA heights.] The modal height is 80 inches.

mean ≠ median ≠ mode
As we have already observed, these measures of central tendency differ from one another.

variable     mean    median   mode
bases        0.301   0        0
med. costs   3,297   550      550
NBA height   78.89   79       80
High temp.   58.8    60       81

Which is the better summary depends on its intended use. Note that the mean does not have to be one of the attainable values.

Definition: skewness
Skewness: the distribution of a random variable X is said to be right skewed if E(X) > m, where m is the median. It is said to be left skewed if E(X) < m. It is not skewed if E(X) ≈ m.
Some sources define skewness quantitatively, but we will use this notion qualitatively.

Example: skewed medical expenditures
Medical expenditures are strongly right skewed (see the binned expenditure table above). The median is $550 but the mean is about $3,300. The relatively low probability of having a very large expenditure drives up the mean.

Skewness examples (cont'd)
We can check which of our previous examples exhibited skewness.

variable     mean    median   mode   skewness
bases        0.301   0        0      right
med. costs   3,297   550      550    right
NBA height   78.89   79       80     none
High temp.   58.8    60       81     none

Idea: dispersion
Measures of central tendency are not the whole story; the variability about the trend can also be important. Informally, we call this dispersion.
To see that neither mean, nor median, nor mode, nor skewness tells the whole story, consider a symmetric, unimodal distribution, one for which the mean, median, and mode are all the same. Two such distributions can still differ greatly in how spread out they are.
Measuring how variable random outcomes are can be very important practically. Can you think of any examples?

Definition: variance
A common measure of the spread of a random variable is the variance.
Variance: the variance of a random variable X with distribution p(x) is defined as
V(X) = \sum_{j=1}^{J} (x_j - E(X))^2 p(x_j).
See OpenIntro page 107, equation 2.72.

Definition: standard deviation
The standard deviation is the square root of the variance.
Standard deviation: the standard deviation of a random variable X with variance V(X) is given by \sqrt{V(X)}.
The standard deviation has the advantage that it has the same units as the random variable itself.

Example: Bernoulli RVs
We find the variance of a Bernoulli RV with parameter p is
V(X) = (0 - p)^2 (1 - p) + (1 - p)^2 p = p^2 (1 - p) + (1 - p)^2 p = (p^2 + (1 - p)p)(1 - p) = p(1 - p).
The standard deviation is therefore \sqrt{p(1 - p)}.

Example: bases earned
For our "bases" random variable we can calculate the variance as:
V(X) = \sum_{j=1}^{J} (x_j - E(X))^2 p(x_j)
     = (0 - 0.301)^2 (0.820) + (1 - 0.301)^2 (0.115) + (2 - 0.301)^2 (0.033) + (3 - 0.301)^2 (0.008) + (4 - 0.301)^2 (0.024)
     = 0.6124.
The standard deviation is then \sqrt{0.6124} = 0.782.

Topic: statistical prediction
Suppose we are tasked with predicting some random event. Assuming that we know the distribution of the random variable in question, how should we make our prediction?
The answer depends on how we judge the goodness of our predictions. We do this by defining a utility function. Then we figure out which prediction (action) gives us the best expected utility, i.e. the best long-run average utility.

Definition: expected value
An expectation is a probability-weighted sum taken over the possible values of a random variable.
Expectation: the expectation or expected value of a function g(X) is defined as:
E(g(X)) = \sum_{j=1}^{J} g(x_j) p(x_j).
The mean is the expected value of the identity function g(x) = x. See OpenIntro 2.4.1.

Aside: computational shortcut for variance
A convenient way to compute the variance is via the identity
V(X) = \sum_{j=1}^{J} (x_j - E(X))^2 p(x_j) = E(X^2) - E(X)^2.
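The "easy with a computer" calculations above are a few lines of code. The sketch below (illustrative, not from the slides) computes the mean of the "bases" random variable and its variance both from the definition and via the shortcut V(X) = E(X^2) - E(X)^2:

```python
# The "bases" (at-bat) random variable from the slides.
xs = [0, 1, 2, 3, 4]
ps = [0.820, 0.115, 0.033, 0.008, 0.024]

mean = sum(x * p for x, p in zip(xs, ps))
# Variance from the definition: probability-weighted squared deviations.
var_def = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))
# Variance via the shortcut: E(X^2) minus the square of E(X).
var_shortcut = sum(x * x * p for x, p in zip(xs, ps)) - mean ** 2

print(round(mean, 3))          # 0.301, as on the slide
print(round(var_def, 4))       # 0.6124
print(round(var_shortcut, 4))  # 0.6124, same answer either way
```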
In words: the variance is the "expected value of the square, minus the square of the expected value."

Properties of expectation
Here are some rules that make calculating expectations easier.
Properties of the expectation operator: for any random variables X and Y and any constant number c, the following properties hold.
- E(X + c) = E(X) + c.
- E(X + Y) = E(X) + E(Y).
- E(cX) = cE(X).
These facts are not hard to show directly from the definition. (We will make sense of an expression like X + Y in lecture 3.)

Expected utility
Denote your utility of action a by u(a, x) for a given value x (of random variable X).
Maximum expected utility principle: among all possible actions a, choose the action a* that maximizes E(u(a, X)).
You will not necessarily win the most often, but in aggregate, in the long run, you will get the most utility.

Example: chuck-a-luck
It costs $1 to play the following gambling game, where the payoff depends on the outcome of a single roll of a six-sided die. If you roll a 1, 2, 3, or 4, you win the dollar amount of the number rolled. So if you roll a 1, you get your $1 back; if you roll a 2, you make a dollar, etc. If you roll a 5 or a 6, you win nothing (and so lose a dollar overall). What is the expected value of the game? Is it worth playing?

Example: chuck-a-luck (version two)
Now the rules are reversed. If you roll a 1, 2, 3, or 4, you win nothing. If you roll a 5 or a 6, you win the corresponding dollar amount. How much would you be willing to pay to play this game?

Example: predicting milk demand
Suppose you own a cafe. On any given day you run through between one and ten gallons of milk, with the probabilities shown. [Figure: bar plot "Gallons of Milk Needed", 1 through 10.] You have to order your daily milk delivery in advance: how many gallons should you order?

Example: predicting milk demand (cont'd)
Buying too much milk and not having enough are not necessarily equally bad.
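The two chuck-a-luck questions can be answered by computing the expected net profit of each variant. A short sketch (not part of the original slides), using exact fractions since each face has probability 1/6 and the entry fee is $1:

```python
from fractions import Fraction

faces = range(1, 7)
sixth = Fraction(1, 6)

# Version one: faces 1-4 pay their face value in dollars, 5 and 6 pay nothing.
ev1 = sum(sixth * ((x if x <= 4 else 0) - 1) for x in faces)

# Version two: faces 5 and 6 pay their face value, 1-4 pay nothing.
ev2 = sum(sixth * ((x if x >= 5 else 0) - 1) for x in faces)

print(ev1)  # 2/3: on average you profit about $0.67 per play
print(ev2)  # 5/6: the reversed game is worth even more
```

So both versions are favorable at a $1 entry fee, and version two is the better of the two.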
If you buy too much, you overpaid for unneeded milk; let's say this costs us $5 a gallon. On the other hand, for every gallon of milk you end up needing but don't have, you lose (say) five customers who wanted lattes. Not only do you forfeit the profit from the latte, but you forfeit the chips and John Mayer CD they might have bought too, if they didn't end up going to the cafe across the street. Let's put this loss at $35 a gallon.
This reasoning suggests it is generally better to have milk and not need it than to need milk and not have it. Can we quantify this?

Example: predicting milk demand (cont'd)
Our milk demand random variable is:

x          1    2     3     4    5    6    7    8     9    10
P(X = x)   4%   15%   35%   5%   5%   5%   5%   20%   3%   3%

Our utility function is
u(a, x) = -$5(a - x) if a > x, and u(a, x) = -$35(x - a) if x > a (and 0 if a = x),
where the action a is the number of gallons we order and the "state" x is the amount of milk required. We must now compute E(u(a, X)) for each possible value of a = 1, ..., 10 and order the number of gallons for which this is largest.

Example: predicting milk demand (cont'd)
Let's work through an example. For a = 2 we have
E(u(2, X)) = \sum_{x=1}^{10} u(2, x) P(X = x)
           = -5(0.04) + 0(0.15) - 35(0.35) - 2(35)(0.05) - 3(35)(0.05) - 4(35)(0.05) - 5(35)(0.05) - 6(35)(0.20) - 7(35)(0.03) - 8(35)(0.03)
           = -$94.7.

Example: predicting milk demand (cont'd)
Computing the rest is easy with a computer. The results are:

a            1        2       3       4       5       6       7       8       9       10
E(u(a, X))   -128.1   -94.7   -67.3   -53.9   -42.5   -33.1   -25.7   -20.3   -22.9   -26.7

We order 8 gallons.

Food for thought: doctors versus patients
Consider a scenario where a patient is offered two treatments for a deadly disease. The first treatment works 60% of the time, and the second treatment works only 50% of the time.
What is the doctor's rationale for recommending the first treatment? Can the patient appeal to a similar rationale?
One of the most important things you can learn in this class is to distinguish between one-shot and repeated decision scenarios. That is, are you the doctor or the patient? If you are the doctor, you can use the laws of probability to calculate your optimal statistical decision.
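The milk-ordering table is straightforward to reproduce. Here is a minimal sketch (illustrative, not from the slides) that evaluates E(u(a, X)) for every order quantity and picks the maximizer:

```python
# Milk demand distribution from the slides: gallons -> probability.
probs = {1: 0.04, 2: 0.15, 3: 0.35, 4: 0.05, 5: 0.05,
         6: 0.05, 7: 0.05, 8: 0.20, 9: 0.03, 10: 0.03}

def utility(a: int, x: int) -> float:
    """Utility of ordering a gallons when x gallons are demanded:
    -$5 per leftover gallon, -$35 per gallon short, 0 if exactly right."""
    return -5 * (a - x) if a > x else -35 * (x - a)

def expected_utility(a: int) -> float:
    # Probability-weighted sum of utilities over the demand distribution.
    return sum(utility(a, x) * p for x, p in probs.items())

for a in range(1, 11):
    print(a, round(expected_utility(a), 1))

best = max(range(1, 11), key=expected_utility)
print("order", best, "gallons")  # 8, matching the slide's table
```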