Chapter 3 Probability(概率)

The Concept of Probability
Sample Spaces and Events
Some Elementary Probability Rules
Conditional Probability and Independence

Section 3.1 The Concept of Probability

An experiment is any process of observation with an uncertain outcome. On any single trial of the experiment, one and only one of the possible outcomes will occur. The possible outcomes of an experiment are called the experimental outcomes.

Probability is a measure of the chance that an experimental outcome will occur when an experiment is carried out.

Example 3.1
Roll a die. The experimental outcomes are 1, 2, 3, 4, 5, and 6.

An outcome is the particular result of an experiment. An event is a collection of one or more outcomes of an experiment.
• Possible outcomes: the numbers 1, 2, 3, 4, 5, 6.
• One possible event: the occurrence of an even number. That is, we collect the outcomes 2, 4, and 6.

Regardless of the method used, probabilities must be assigned to the experimental outcomes so that two conditions are met:
1. $0 \le P(E) \le 1$ for every experimental outcome E. If E can never occur, then $P(E) = 0$; if E is certain to occur, then $P(E) = 1$.
2. The probabilities of all the experimental outcomes must sum to 1.

Section 3.2 Sample Spaces and Events(事件)

Sample space (S)(样本空间): the set of all possible outcomes of an experiment, e.g. all 6 faces of a die, or all 52 cards of a bridge deck.

Example 3.2 Genders of Two Children
Let B be the outcome that a child is a boy and G the outcome that a child is a girl. The sample space is S = {BB, BG, GB, GG}. If B and G are equally likely, then P(B) = P(G) = ½ and P(BB) = P(BG) = P(GB) = P(GG) = ¼.

An event is a set of sample space outcomes. For the two children of Example 3.2:
• P(one boy and one girl) = P(BG) + P(GB) = ¼ + ¼ = ½.
• P(at least one girl) = P(BG) + P(GB) + P(GG) = ¼ + ¼ + ¼ = ¾.
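The event probabilities in Example 3.2 can be checked by listing the sample space and counting. The following is a minimal sketch, assuming boys and girls are equally likely as the example states; the names `sample_space` and `probability` are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for the genders of two children: BB, BG, GB, GG.
sample_space = ["".join(pair) for pair in product("BG", repeat=2)]


def probability(event, space=sample_space):
    """P(event) when every outcome in `space` is equally likely.

    `event` is a function that returns True for the outcomes in the event.
    """
    favourable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favourable), len(space))


# Event: exactly one boy and one girl (outcomes BG and GB).
p_one_each = probability(lambda outcome: outcome.count("B") == 1)

# Event: at least one girl (outcomes BG, GB, and GG).
p_at_least_one_girl = probability(lambda outcome: "G" in outcome)

print(sample_space)
print(p_one_each, p_at_least_one_girl)
```

Using `Fraction` keeps the results as exact ratios (½ and ¾) instead of rounded decimals.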
Note: The experimental outcomes are BB, BG, GB, GG. All outcomes are equally likely: P(BB) = … = P(GG) = ¼.

Example 3.3 Answering Three True-False Questions
A student takes a quiz that consists of three true-false questions. Let C and I denote answering a question correctly and incorrectly, respectively. The graph on the next slide shows the sample space outcomes for the experiment. The sample space consists of 8 outcomes:

CCC CCI CIC CII ICC ICI IIC III

Suppose the student is totally unprepared for the quiz and has to guess the answers blindly. That is, the student has a 50-50 chance of correctly answering each question, so each of the 8 outcomes is equally likely to occur: P(CCC) = P(CCI) = ... = P(III) = 1/8.

Probabilities: Equally Likely Outcomes
If the sample space outcomes (or experimental outcomes) are all equally likely, then the probability that an event will occur is equal to the ratio

$$P(\text{event}) = \frac{\text{number of outcomes that correspond to the event}}{\text{total number of outcomes}}$$

Basic Computation of Probabilities
The probability of an event is also equal to the sum of the probabilities of the sample space outcomes that correspond to the event.

Example 3.4
The probability that the student will get exactly two questions correct is
P(CCI) + P(CIC) + P(ICC) = 1/8 + 1/8 + 1/8 = 3/8.
The probability that the student will get at least two questions correct is
P(CCC) + P(CCI) + P(CIC) + P(ICC) = 1/8 + 1/8 + 1/8 + 1/8 = 1/2.

Relative Frequency Method(概率的频率解释)
Let E be an outcome of an experiment. If the experiment is performed many times, P(E) is the relative frequency of E, that is, the percentage of times E occurs in many repetitions of the experiment. Use sampled or historical data to calculate probabilities.

Example 3.5
Suppose that of 1000 randomly selected consumers, 140 preferred brand X. The probability of randomly picking a person who prefers brand X is 140/1000 = 0.14 or 14%.

Long-Run Relative Frequency Method: Examples
1. An accounts receivable manager knows from past data that about 70 of 1000 accounts became uncollectible. The manager would estimate the probability of bad debts as 70/1000 = 0.07 or 7%.
2. Tossing a fair coin 3000 times, we can see that although the proportion of heads was far from 0.5 in the first 100 tosses, it seemed to stabilize and approach 0.5 as the number of tosses increased.

Long-Run Relative Frequency Method: Application
Often we determine the probability from a random sample (long-run relative frequency method) and apply it to the population.
• Of 5528 Zhuhai residents randomly sampled, 445 prefer to watch CCTV-1.
• Estimated share: P(CCTV-1) = 445 / 5528 = 0.0805.
• So the probability that any Zhuhai resident chosen at random prefers CCTV-1 is 0.0805.
• Assuming the total population of Zhuhai is 1,000,000: size of audience in the city = population × share, so 1,000,000 × 0.0805 = 80,500.

Subjective Probability
• Uses experience, intuitive judgment, or personal expertise to assess or derive a probability.
• May or may not have a relative frequency interpretation (some events cannot be repeated many times).
• Contains a high degree of personal bias.
• Example: what is the probability that your favorite basketball or football team wins the next game? (e.g. sports betting)

Subjective Probability and Betting
• The odds in betting reflect the subjective probability guessed by the public.
• How much are you willing to pay for a ticket that is worth $10 if there is life on Mars and nothing if there is not?
• Subjective probability usually reflects opinion more than reality. Sometimes it is used to gauge public opinion.

Section 3.3 Some Elementary Probability Rules

The complement $\bar{A}$ of an event A is the set of all sample space outcomes not in A. Further, $P(\bar{A}) = 1 - P(A)$. The figures that illustrate these sets are called Venn diagrams.

Union of A and B, $A \cup B$ (A和B的并集): the event consisting of the outcomes that belong to either A or B (or both).
Intersection of A and B, $A \cap B$ (A和B的交集): the event consisting of the outcomes that belong to both A and B.

The Addition Rule(加法准则)
The probability that A or B (the union of A and B) will occur is

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

where $P(A \cap B)$ is the "joint" probability of A and B, i.e., of both occurring.

A and B are mutually exclusive(相互排斥) if they have no sample space outcomes in common, or equivalently, if $P(A \cap B) = 0$. If A and B are mutually exclusive, then $P(A \cup B) = P(A) + P(B)$.

Example 3.7 Newspaper Subscribers #1
Define events:
• A = event that a randomly selected household subscribes to the Atlantic Journal.
• B = event that a randomly selected household subscribes to the Beacon News.
Given:
• total number of households in the city, N = 1,000,000
• number subscribing to A, N(A) = 650,000
• number subscribing to B, N(B) = 500,000
• number subscribing to both, N(A∩B) = 250,000

Newspaper Subscribers #2
Use the relative frequency method to assign probabilities:

$$P(A) = \frac{650{,}000}{1{,}000{,}000} = 0.65, \qquad P(B) = \frac{500{,}000}{1{,}000{,}000} = 0.50, \qquad P(A \cap B) = \frac{250{,}000}{1{,}000{,}000} = 0.25$$

Table 3.1 A Contingency Table(列联表): Subscription Data for the Atlantic Journal and the Beacon News

| Events | Subscribes to Beacon News, $B$ | Does Not Subscribe to Beacon News, $\bar{B}$ | Total |
|---|---|---|---|
| Subscribes to Atlantic Journal, $A$ | 250,000 | 400,000 | 650,000 |
| Does Not Subscribe to Atlantic Journal, $\bar{A}$ | 250,000 | 100,000 | 350,000 |
| Total | 500,000 | 500,000 | 1,000,000 |

Newspaper Subscribers #3
Refer to the contingency table in Table 3.1 for all probabilities. For example, to find the chance that a household does not subscribe to either newspaper, calculate $P(\bar{A} \cap \bar{B})$ from the second row and second column of Table 3.1:

$$P(\bar{A} \cap \bar{B}) = \frac{100{,}000}{1{,}000{,}000} = 0.10$$

Newspaper Subscribers #4
The chance that a household subscribes to either newspaper is

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.65 + 0.50 - 0.25 = 0.90$$

Note that if the joint probability were not subtracted, we would have gotten 1.15, which is greater than 1 and therefore absurd.
Note: The subtraction avoids double counting the joint probability.
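The newspaper calculation can be reproduced directly from the household counts. This is a small sketch of the addition rule and the complement rule under the counts given in Example 3.7; the variable names are illustrative.

```python
from fractions import Fraction

# Household counts from Example 3.7.
N = 1_000_000        # households in the city
n_a = 650_000        # subscribe to the Atlantic Journal (event A)
n_b = 500_000        # subscribe to the Beacon News (event B)
n_a_and_b = 250_000  # subscribe to both

# Relative frequency method: probability = count / total.
p_a = Fraction(n_a, N)
p_b = Fraction(n_b, N)
p_a_and_b = Fraction(n_a_and_b, N)

# Addition rule: subtract the joint probability so it is not counted twice.
p_a_or_b = p_a + p_b - p_a_and_b

# Complement rule: "neither paper" is the complement of "A or B".
p_neither = 1 - p_a_or_b

print(float(p_a_or_b), float(p_neither))
```

The complement of the union gives the same 0.10 that Table 3.1 shows for households subscribing to neither newspaper.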
A Mutually Exclusive Case
Recall the radio station example. The percentages of LA residents who favor each of the top 10 stations are listed in the Figure. Let the name of each station, for example KPWR, represent the event that this station is the favorite radio station of a randomly selected resident. Since the survey asked each resident to name the single station that he or she listens to most, the 10 events are mutually exclusive. Therefore, the probability that a randomly selected LA resident would favor one of the top 10 stations is

P(KPWR ∪ KLAX ∪ … ∪ KCBS-FM) = P(KPWR) + P(KLAX) + … + P(KCBS-FM) = 0.08 + 0.064 + … + 0.036 = 0.508.

Section 3.4 Conditional Probability and Independence

The probability of an event A, given that the event B has occurred, is called the "conditional probability of A given B"(条件概率) and is denoted $P(A \mid B)$. Further,

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

assuming that P(B) is greater than 0.

Interpretation: restrict the sample space to just event B. The conditional probability $P(A \mid B)$ is the chance of event A occurring in this new sample space.

Similarly, if A occurred, what is the chance of B occurring? To answer this question, we need the probability of event B given that the event A has occurred, i.e., the conditional probability of B given A, denoted $P(B \mid A)$:

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$

assuming that P(A) is greater than 0.

Newspaper Subscribers
Of the households that subscribe to the Atlantic Journal, what is the chance that they also subscribe to the Beacon News? Calculate $P(B \mid A)$:

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{0.25}{0.65} = 0.3846$$

Independence(独立) of Events
Two events A and B are said to be independent if and only if $P(A \mid B) = P(A)$ or, equivalently, $P(B \mid A) = P(B)$. That is, A and B are independent if the chance of event A occurring is not influenced by whether event B occurs, and vice versa; in other words, if the occurrences of A and B have nothing to do with each other.
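The conditional probability formula and the definition of independence can be tried out on the newspaper probabilities. The sketch below assumes the values P(A) = 0.65, P(B) = 0.50, and P(A ∩ B) = 0.25 from the newspaper example; the helper `conditional` is a hypothetical name introduced only for illustration.

```python
from fractions import Fraction


def conditional(p_joint, p_given):
    """Return P(X | Y) = P(X and Y) / P(Y); P(Y) must be greater than 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given


# Probabilities from the newspaper subscribers example.
p_a = Fraction(65, 100)        # P(A): subscribes to the Atlantic Journal
p_b = Fraction(50, 100)        # P(B): subscribes to the Beacon News
p_a_and_b = Fraction(25, 100)  # P(A and B): subscribes to both

p_b_given_a = conditional(p_a_and_b, p_a)  # restrict the sample space to A
p_a_given_b = conditional(p_a_and_b, p_b)  # restrict the sample space to B

# Independence would require P(B | A) == P(B).
independent = p_b_given_a == p_b

print(round(float(p_b_given_a), 4), float(p_a_given_b), independent)
```

Here P(B | A) is 5/13, about 0.3846, which differs from P(B) = 0.50, so the check reports that the two events are not independent.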
In fact, if one of the above two equations holds, so does the other. Why? (Hint: when P(A) and P(B) are both positive, each equation is equivalent to $P(A \cap B) = P(A)\,P(B)$.)

Newspaper Subscribers
Of the households that subscribe to the Atlantic Journal, what is the chance that they also subscribe to the Beacon News?
• If A and B are independent, then $P(B \mid A) = P(B)$.
• Is $P(B \mid A) = P(B)$? We know that P(B) = 0.50 and have just calculated that $P(B \mid A) = 0.3846$.
• 0.50 ≠ 0.3846, so $P(B \mid A) \ne P(B)$.
• B is not independent of A; A and B are said to be dependent.

The Multiplication Rule
The joint probability that A and B (the intersection of A and B) will occur is

$$P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$$

If A and B are independent, then the probability that A and B will both occur is

$$P(A \cap B) = P(A)\,P(B) = P(B)\,P(A)$$

A Question
Suppose that in the following contingency table, where the numbers represent probabilities, some data are lost.
1. Can you recover the missing data?
2. Are events R and C independent?

| | $R$ | $\bar{R}$ | Total |
|---|---|---|---|
| $C$ | .4 | | |
| $\bar{C}$ | | .3 | .5 |
| Total | .6 | | 1.00 |

Contingency Tables
Each row and each column must add up to its total, which recovers the missing entries:

| | $R$ | $\bar{R}$ | Total |
|---|---|---|---|
| $C$ | .4 | .1 | .5 |
| $\bar{C}$ | .2 | .3 | .5 |
| Total | .6 | .4 | 1.00 |

The interior cells are joint probabilities such as $P(R \cap C)$, and the margins are the probabilities of single events such as $P(R)$ and $P(C)$. As

$$P(R \cap C) = 0.4 \ne P(R)\,P(C) = 0.6 \times 0.5 = 0.3,$$

the events R and C are dependent.

Chapter 4 Discrete Random Variables(离散随机变量)

Two Types of Random Variables
Discrete Probability Distributions
The Binomial Distribution
The Poisson Distribution

Random Variables(随机变量)
A random variable is a variable that assumes numerical values that are determined by the outcome of an experiment.

Example 4.1
Consider a random experiment in which a coin is tossed three times. Let X be the number of heads. Let H represent the outcome of a head and T the outcome of a tail. The possible outcomes of the experiment are

TTT, TTH, THT, THH, HTT, HTH, HHT, HHH

Thus the possible values of X (the number of heads) are 0, 1, 2, 3. Because its value is determined by the outcome of the experiment, X is a random variable.
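Example 4.1 can be made concrete by listing the eight outcomes and mapping each one to its number of heads. The sketch below also tabulates the probability of each value of X under the added assumption that the coin is fair, which the example itself does not state; the names `outcomes`, `number_of_heads`, and `pmf` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of three tosses: TTT, TTH, THT, THH, HTT, HTH, HHT, HHH.
outcomes = ["".join(tosses) for tosses in product("TH", repeat=3)]

# The random variable X assigns a number (the count of heads) to each outcome.
number_of_heads = {outcome: outcome.count("H") for outcome in outcomes}

# Assumption: a fair coin, so all 8 outcomes are equally likely.
counts = Counter(number_of_heads.values())
pmf = {x: Fraction(count, len(outcomes)) for x, count in sorted(counts.items())}

print(sorted(set(number_of_heads.values())))
for x, prob in pmf.items():
    print(x, prob)
```

The mapping from outcomes to numbers is what makes X a random variable; the probabilities attached to its values follow only once a model for the coin is chosen.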
Section 4.1 Two Types of Random Variables

Discrete random variable(离散型随机变量): the possible values can be counted or listed. For example, the number of TV sets sold at a store in one day; here x could be 0, 1, 2, 3, 4 and so forth.

Continuous random variable(连续型随机变量): may assume any numerical value in one or more intervals. For example, the waiting time for a credit card authorization, or the interest rate charged on a business loan.

Example: Two Types of Random Variables

| Question | Random variable x | Type |
|---|---|---|
| Family size | x = number of people in family reported on tax return | Discrete(离散) |
| Distance from home to store | x = distance in miles from home to a store | Continuous(连续) |
| Own dog or cat | x = 1 if own no pet; = 2 if own dog(s) only; = 3 if own cat(s) only; = 4 if own dog(s) and cat(s) | Discrete |

Section 4.2 Discrete Probability Distributions(离散概率分布)

The probability distribution of a discrete random variable is a table, graph, or formula that gives the probability associated with each possible value that the variable can assume.

Notation: denote the values of the random variable by x and the probability associated with each value by p(x).

Properties
1. For any value x of the random variable, $p(x) \ge 0$.
2. The probabilities of all the values of the random variable must sum to 1, that is, $\sum_{\text{all } x} p(x) = 1$.

Example 4.2 Number of Radios (Sold at Sound City in a Week)
Let x be the random variable of the number of radios sold per week; x has values x = 0, 1, 2, 3, 4, 5. We are given the sales history over the past 100 weeks. Let f(x) be the number of weeks (of the past 100) during which x radios were sold. Records tell us that
• f(0) = 3: no radios were sold in 3 of the weeks
• f(1) = 20: one radio was sold in 20 of the weeks
• f(2) = 50: two radios were sold in 50 of the weeks
• f(3) = 20: three radios were sold in 20 of the weeks
• f(4) = 5: four radios were sold in 5 of the weeks
• f(5) = 2: five radios were sold in 2 of the weeks
No more than five radios were sold in any of the past 100 weeks.

Frequency distribution of sales history over the past 100 weeks

| # Radios, x | Frequency | Relative frequency | Probability, p(x) |
|---|---|---|---|
| 0 | f(0) = 3 | 3/100 = 0.03 | p(0) = 0.03 |
| 1 | f(1) = 20 | 20/100 = 0.20 | p(1) = 0.20 |
| 2 | f(2) = 50 | 0.50 | p(2) = 0.50 |
| 3 | f(3) = 20 | 0.20 | p(3) = 0.20 |
| 4 | f(4) = 5 | 0.05 | p(4) = 0.05 |
| 5 | f(5) = 2 | 0.02 | p(5) = 0.02 |
| Total | 100 | 1.00 | 1.00 |

Interpret the relative frequencies as probabilities, so that for any value x, f(x)/n = p(x), assuming that sales remain stable over time.

What is the chance that two radios will be sold in a week?
p(x = 2) = 0.50

What is the chance that fewer than 2 radios will be sold in a week? Using the addition rule for the mutually exclusive values of the random variable:
p(x < 2) = p(x = 0 or x = 1) = p(x = 0) + p(x = 1) = 0.03 + 0.20 = 0.23

What is the chance that three or more radios will be sold in a week?
p(x ≥ 3) = p(x = 3, 4, or 5) = p(x = 3) + p(x = 4) + p(x = 5) = 0.20 + 0.05 + 0.02 = 0.27

Expected Value of a Discrete Random Variable
The mean(均值) or expected value of a discrete random variable X is

$$\mu_X = \sum_{\text{all } x} x\,p(x)$$

$\mu_X$ is the value expected to occur in the long run and on average.

Example 4.3 Number of Radios
How many radios should be expected to be sold in a week?
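Questions like these can be answered numerically by building p(x) from the sales record and applying the addition rule and the expected value formula. The following is a minimal sketch using the weekly frequencies of Example 4.2, with f(4) = 5 so that the frequencies total 100 weeks; the variable names are illustrative.

```python
from fractions import Fraction

# Number of weeks (out of the past 100) in which x radios were sold.
weeks_by_sales = {0: 3, 1: 20, 2: 50, 3: 20, 4: 5, 5: 2}
n_weeks = sum(weeks_by_sales.values())

# Relative frequencies interpreted as probabilities: p(x) = f(x) / n.
p = {x: Fraction(f, n_weeks) for x, f in weeks_by_sales.items()}

# Addition rule over mutually exclusive values of x.
p_fewer_than_2 = p[0] + p[1]
p_three_or_more = p[3] + p[4] + p[5]

# Expected value: sum of x * p(x) over all x.
expected_sales = sum(x * px for x, px in p.items())

print(float(p_fewer_than_2), float(p_three_or_more), float(expected_sales))
```

The expected value need not be a value the variable can actually take: no week has 2.1 radios sold, but 2.1 is the long-run weekly average.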
Calculate the expected value of the number of radios sold, $\mu_X$:

| Radios, x | Probability, p(x) | x p(x) |
|---|---|---|
| 0 | p(0) = 0.03 | 0 × 0.03 = 0.00 |
| 1 | p(1) = 0.20 | 1 × 0.20 = 0.20 |
| 2 | p(2) = 0.50 | 2 × 0.50 = 1.00 |
| 3 | p(3) = 0.20 | 3 × 0.20 = 0.60 |
| 4 | p(4) = 0.05 | 4 × 0.05 = 0.20 |
| 5 | p(5) = 0.02 | 5 × 0.02 = 0.10 |
| Total | 1.00 | 2.10 |

• On average, expect to sell 2.1 radios per week.

Variance and Standard Deviation
The variance of a discrete random variable is

$$\sigma_X^2 = \sum_{\text{all } x} (x - \mu_X)^2\,p(x)$$

• The variance is the average of the squared deviations of the different values of the random variable from the expected value.

The standard deviation is the square root of the variance:

$$\sigma_X = \sqrt{\sigma_X^2}$$

• The variance and standard deviation measure the spread of the values of the random variable around their expected value.

Example 4.4 Number of Radios
Calculate the variance and standard deviation of the number of radios sold at Sound City in a week.

| Radios, x | Probability, p(x) | $(x - \mu_X)^2\,p(x)$ |
|---|---|---|
| 0 | p(0) = 0.03 | (0 - 2.1)² (0.03) = 0.1323 |
| 1 | p(1) = 0.20 | (1 - 2.1)² (0.20) = 0.2420 |
| 2 | p(2) = 0.50 | (2 - 2.1)² (0.50) = 0.0050 |
| 3 | p(3) = 0.20 | (3 - 2.1)² (0.20) = 0.1620 |
| 4 | p(4) = 0.05 | (4 - 2.1)² (0.05) = 0.1805 |
| 5 | p(5) = 0.02 | (5 - 2.1)² (0.02) = 0.1682 |
| Total | 1.00 | 0.8900 |

Variance: $\sigma_X^2 = 0.89$
Standard deviation: $\sigma_X = \sqrt{0.89} = 0.9434$

The Binomial Distribution(二项分布)
The Binomial Experiment:
1. The experiment consists of n identical trials.
2. Each trial results in either "success" or "failure".
3. The probability of success, p, is constant from trial to trial.
4. The trials are independent.
Note: The probability of failure, q, is 1 - p and is constant from trial to trial.

If x is the total number of successes in n trials of a binomial experiment, then x is a binomial random variable.

The Binomial Distribution #2
For a binomial random variable x, the probability of x successes in n trials is given by the binomial distribution:

$$p(x) = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$$

• Note: n! is read as "n factorial" and n! = n × (n-1) × (n-2) × ... × 1
 – For example, 5! = 5 × 4 × 3 × 2 × 1 = 120
• Also, 0! = 1
• Factorials are not defined for negative numbers or fractions

The Binomial Distribution #3
• What does the equation mean?
 – The equation for the binomial distribution is the product of two factors:
 – $\dfrac{n!}{x!\,(n-x)!}$ is the number of ways to get x successes and (n-x) failures in n trials.
 – $p^x q^{n-x}$ is the chance of getting x successes and (n-x) failures in a particular arrangement.

Example 4.5 Incidence of Nausea
The company claims that, at most, 10 percent of all patients treated with Phe-Mycin would experience nausea as a side effect of taking the drug. Let x = the number of patients who will experience nausea following treatment with Phe-Mycin out of the 4 patients tested. Find the probability that 2 of the 4 patients treated will experience nausea.

Given: n = 4, p = 0.1, with x = 2. Then q = 1 - p = 1 - 0.1 = 0.9, and

$$p(x = 2) = \frac{4!}{2!\,(4-2)!}\,(0.1)^2 (0.9)^{4-2} = 6\,(0.1)^2 (0.9)^2 = 0.0486$$

Figure: Binomial Distribution (n = 4, p = 0.1)

Binomial Probability Table (Appendix Table A.1, p. 817)
Table 4.7(a) for n = 4, with x = 2 and p = 0.1. Each row gives p(x) for one value of p (.05 to .50). The last column is 1 - p; for those values of p the same row is read with x in reverse order (4, 3, 2, 1, 0).

| p | x = 0 | x = 1 | x = 2 | x = 3 | x = 4 | 1 - p |
|---|---|---|---|---|---|---|
| 0.05 | 0.8145 | 0.1715 | 0.0135 | 0.0005 | 0.0000 | 0.95 |
| 0.10 | 0.6561 | 0.2916 | 0.0486 | 0.0036 | 0.0001 | 0.90 |
| 0.15 | 0.5220 | 0.3685 | 0.0975 | 0.0115 | 0.0005 | 0.85 |
| … | … | … | … | … | … | … |
| 0.50 | 0.0625 | 0.2500 | 0.3750 | 0.2500 | 0.0625 | 0.50 |

From the row p = 0.1: P(x = 2) = 0.0486.

Example 4.5 Incidence of Nausea (after Treatment)
Let x = the number of patients who will experience nausea following treatment with Phe-Mycin out of the 4 patients tested. Find the probability that at least 3 of the 4 patients treated will experience nausea.

Set n = 4, p = 0.1, so q = 1 - p = 1 - 0.1 = 0.9. Then, using the addition rule for the mutually exclusive values of the binomial random variable:

$$p(x \ge 3) = p(x = 3 \text{ or } 4) = p(x = 3) + p(x = 4) = 0.0036 + 0.0001 = 0.0037$$

Rare Events
Suppose at least three of four sampled patients actually did experience nausea following treatment. If p = 0.1 is believed, then there is a chance of only 37 in 10,000 of observing this result. So this is very unlikely!
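The 37-in-10,000 figure can be checked by evaluating the binomial formula directly. This is a small sketch assuming n = 4 and p = 0.1 as in Example 4.5; `binomial_pmf` is an illustrative helper name, and `math.comb` supplies the factor n!/(x!(n-x)!).

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure on a single trial
    return comb(n, x) * p**x * q**(n - x)


n, p = 4, 0.1

# Exactly 2 of the 4 patients experience nausea.
p_two = binomial_pmf(2, n, p)

# At least 3 of 4: add the mutually exclusive cases x = 3 and x = 4.
p_at_least_three = binomial_pmf(3, n, p) + binomial_pmf(4, n, p)

print(round(p_two, 4), round(p_at_least_three, 4))
```

The same function reproduces any entry of the n = 4 binomial table by changing `x` and `p`.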
But it actually occurred. So this is very strong evidence that p does not equal 0.1; in fact, there is very strong evidence that p is actually greater than 0.1.

Figure: Several Binomial Distributions

Mean and Variance of a Binomial Random Variable
If x is a binomial random variable with parameters n and p (so q = 1 - p), then
• mean: $\mu_X = np$
• variance: $\sigma_X^2 = npq$
• standard deviation: $\sigma_X = \sqrt{npq}$

Back to Example 4.5
Of 4 randomly selected patients, how many should be expected to experience nausea after treatment?
Given n = 4 and p = 0.1, $\mu_X = np = 4 \times 0.1 = 0.4$. So expect 0.4 of the 4 patients to experience nausea. If at least three of four patients experienced nausea, this would be many more than the 0.4 that are expected.

Binomial Distribution Example
Pat Statsdud is registered in a statistics course and intends to rely on luck to pass the next quiz. The quiz consists of 10 multiple choice questions with 5 possible choices for each question, only one of which is the correct answer. Pat will guess the answer to each question.
Find the following probabilities:
• Pat gets no answers correct.
• Pat gets two answers correct.
• Pat fails the quiz.
If all the students in Pat's class intend to guess the answers to the quiz, what are the mean and the standard deviation of the quiz mark?

Solution
Checking the conditions:
• An answer can be either correct or incorrect.
• There is a fixed finite number of trials (n = 10).
• Each answer is independent of the others.
• The probability p of a correct answer (.20) does not change from question to question.

Determining the binomial probabilities: let X = the number of correct answers.

$$P(X = 0) = \frac{10!}{0!\,(10-0)!}\,(.20)^0 (.80)^{10-0} = .1074$$

$$P(X = 2) = \frac{10!}{2!\,(10-2)!}\,(.20)^2 (.80)^{10-2} = .3020$$

Pat fails the test if the number of correct answers is less than 5, which means less than or equal to 4:

P(X ≤ 4) = p(0) + p(1) + p(2) + p(3) + p(4) = .1074 + .2684 + .3020 + .2013 + .0881 = .9672

The mean and the standard deviation of the quiz mark:
$\mu = np = 10(.2) = 2$.
$\sigma = [np(1-p)]^{1/2} = [10(.2)(.8)]^{1/2} = 1.26$

The Poisson Distribution(泊松分布)
Consider the number of times an event occurs over an interval of time or space, and assume that
1. The probability of occurrence is the same for any intervals of equal length.
2. The occurrence in any interval is independent of an occurrence in any non-overlapping interval.
If x = the number of occurrences in a specified interval, then x is a Poisson random variable.

The Poisson Distribution Continued
Suppose $\mu$ is the mean or expected number of occurrences during a specified interval. The probability of x occurrences in the interval when $\mu$ are expected is described by the Poisson distribution:

$$p(x) = \frac{e^{-\mu}\,\mu^x}{x!}$$

where x can take any of the values x = 0, 1, 2, 3, … and e = 2.71828… (e is the base of the natural logarithms).

Example 4.6 ATC Center Errors
Suppose that an air traffic control (ATC) center has been averaging 20.8 errors per year and lately the center experienced 3 errors in a week. Let x be the number of errors made by the ATC center during one week.
Given: $\mu$ = 20.8 errors per year. Then: $\mu$ = 0.4 errors per week.
• Because there are 52 weeks per year, $\mu$ for a week is $\mu$ = (20.8 errors/year) / (52 weeks/year) = 0.4 errors/week.

ATC Center Errors Continued
Find the probability that 3 errors (x = 3) will occur in a week.
 – Want p(x = 3) when $\mu$ = 0.4:

$$p(x = 3) = \frac{e^{-0.4}\,(0.4)^3}{3!} = 0.0072$$

Find the probability that no errors (x = 0) will occur in a week.
 – Want p(x = 0) when $\mu$ = 0.4:

$$p(x = 0) = \frac{e^{-0.4}\,(0.4)^0}{0!} = 0.6703$$

Poisson Probability Table (Appendix Table A.2, p. 821)

| $\mu$, mean number of occurrences | x = 0 | x = 1 | x = 2 | x = 3 | x = 4 | x = 5 |
|---|---|---|---|---|---|---|
| 0.1 | 0.9048 | 0.0905 | 0.0045 | 0.0002 | 0.0000 | 0.0000 |
| 0.2 | 0.8187 | 0.1637 | 0.0164 | 0.0011 | 0.0001 | 0.0000 |
| … | … | … | … | … | … | … |
| 0.4 | 0.6703 | 0.2681 | 0.0536 | 0.0072 | 0.0007 | 0.0001 |
| … | … | … | … | … | … | … |
| 1.00 | 0.3679 | 0.3679 | 0.1839 | 0.0613 | 0.0153 | 0.0031 |

From the row $\mu$ = 0.4: p(x = 3) = 0.0072, in agreement with the formula.

Figure: Poisson Distribution ($\mu$ = 0.4)

Mean and Variance of a Poisson Random Variable
If x is a Poisson random variable with parameter $\mu$, then
• mean: $\mu_X = \mu$
• variance: $\sigma_X^2 = \mu$
• standard deviation: $\sigma_X = \sqrt{\mu}$

Figure: Several Poisson Distributions

Back to Example 4.6
In the ATC center situation, 20.8 errors occurred on average per year. Assume that the number x of errors during any span of time follows a Poisson distribution for that time span. Per week, the parameters of the Poisson distribution are:
• mean $\mu$ = 0.4 errors/week
• standard deviation $\sigma$ = 0.6325 errors/week, because $\sigma = \sqrt{\mu} = \sqrt{0.4} = 0.6325$

Poisson Distribution Example
Customers arrive at a rate of 72 per hour. What is the probability of 4 customers arriving in 3 minutes?
Solution: 72 per hour = 1.2 per minute = 3.6 per 3 minutes, so $\mu$ = 3.6 and

$$p(x = 4) = \frac{e^{-\mu}\,\mu^x}{x!} = \frac{e^{-3.6}\,(3.6)^4}{4!} = 0.1912$$
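Both Poisson examples follow the same two steps: rescale the mean to the interval of interest, then evaluate the formula. The sketch below assumes the 20.8 errors per year of Example 4.6 and the 72 customers per hour of the arrival example; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when mu are expected."""
    return exp(-mu) * mu**x / factorial(x)


# ATC center: 20.8 errors per year, rescaled to one week.
mu_week = 20.8 / 52
p_three_errors = poisson_pmf(3, mu_week)
p_no_errors = poisson_pmf(0, mu_week)
sd_week = sqrt(mu_week)  # for a Poisson variable the variance equals the mean

# Customers: 72 per hour, rescaled to a 3-minute interval.
mu_3min = 72 / 60 * 3
p_four_customers = poisson_pmf(4, mu_3min)

print(round(p_three_errors, 4), round(p_no_errors, 4), round(sd_week, 4))
print(round(p_four_customers, 4))
```

Rescaling the mean is valid here because the Poisson assumptions make the expected count proportional to the length of the interval.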