
Statistics 510: Notes 2

Reading: Sections 2.3, 2.4, 2.7.

I. Wrap-up of Section 2.2

Example 6 from last class: A fashionable country club has 100 members, 30 of whom are lawyers. Rumor has it that 25 of the club members are liars and that 55 are neither lawyers nor liars. What proportion of the lawyers are liars?

Let $A$ = set of lawyers, $B$ = set of liars and $S$ = set of all members of the country club. Let the number of members in any set $Q$ be denoted by $N(Q)$. The proportion of the lawyers that are liars is equal to $N(A \cap B)/N(A)$. We are given that

$$N(S) = 100, \quad N(A) = 30, \quad N(B) = 25, \quad N((A \cup B)^C) = 55.$$

The last statement implies that $N(A \cup B) = 100 - 55 = 45$. To use this information to calculate $N(A \cap B)$, we verify using a Venn diagram that $N(A \cap B) = N(A) + N(B) - N(A \cup B)$. Thus, $N(A \cap B) = 30 + 25 - 45 = 10$ and the proportion of lawyers that are liars is $N(A \cap B)/N(A) = 10/30 = 1/3$.

DeMorgan's Laws: Let $A$ and $B$ denote any two events. Use Venn diagrams to show that
(a) the complement of their intersection is the union of their complements: $(A \cap B)^C = A^C \cup B^C$
(b) the complement of their union is the intersection of their complements: $(A \cup B)^C = A^C \cap B^C$

Review of Section 2.2: We have defined the key concepts of an experiment, the sample space for an experiment and events in the sample space. We have discussed relations between events and introduced the Venn diagram as a tool for examining them. These relations will be useful for manipulating probabilities. We now introduce the concept of the probability of an event.

II. Frequency interpretation of probability (Section 2.3)

The relative frequency of an event is a proportion measuring how often, or how frequently, the event occurs in a sequence of experiments.

Example 1: Experiment: Toss a coin. The sample space is $S = \{\text{heads}, \text{tails}\}$.
If the experiment is repeated many times, the relative frequency of heads will usually be close to 1/2:
- The French naturalist Count Buffon (1707-1788) tossed a coin 4040 times. Result: 2048 heads, or relative frequency 2048/4040 = 0.5069 for heads.
- Around 1900, the English statistician Karl Pearson heroically tossed a coin 24,000 times. Result: 12,012 heads, a relative frequency of 0.5005.
- While imprisoned by the Germans during World War II, the Australian mathematician John Kerrich tossed a coin 10,000 times. Result: 5067 heads, a relative frequency of 0.5067.

In the frequency interpretation of probability, the probability of an event $A$ is the expected relative frequency of $A$ in a large number of trials. In symbols, the proportion of times $A$ occurs in $n$ trials, call it $P_n(A)$, is expected to be roughly equal to the theoretical probability $P(A)$ if $n$ is large:

$$P_n(A) \approx P(A) \text{ for large } n.$$

Example 2: Experiment: Observation of the sex of a child. The sample space is $S = \{\text{girl}, \text{boy}\}$. The following table shows the proportion of boys among live births to residents of the U.S.A. over the past 20 years (Source: Information Please Almanac).

| Year | Number of births | Proportion of boys |
|------|------------------|--------------------|
| 1983 | 3,638,933 | 0.5126648 |
| 1984 | 3,669,141 | 0.5122425 |
| 1985 | 3,760,561 | 0.5126849 |
| 1986 | 3,756,547 | 0.5124035 |
| 1987 | 3,809,394 | 0.5121951 |
| 1988 | 3,909,510 | 0.5121931 |
| 1989 | 4,040,958 | 0.5121286 |
| 1990 | 4,158,212 | 0.5121179 |
| 1991 | 4,110,907 | 0.5112054 |
| 1992 | 4,065,014 | 0.5121992 |
| 1993 | 4,000,240 | 0.5121845 |
| 1994 | 3,952,767 | 0.5116894 |
| 1995 | 3,926,589 | 0.5084196 |
| 1996 | 3,891,494 | 0.5114951 |
| 1997 | 3,880,894 | 0.5116337 |
| 1998 | 3,941,553 | 0.5115255 |
| 1999 | 3,959,417 | 0.5119072 |
| 2000 | 4,058,814 | 0.5117182 |
| 2001 | 4,025,933 | 0.5111665 |
| 2002 | 4,021,726 | 0.5117154 |

The relative frequency of boys among newborn children in the U.S.A. appears to be stable at around 0.512. This suggests that a reasonable model for the outcome of a single birth is $P(\text{boy}) = 0.512$ and $P(\text{girl}) = 0.488$.
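As a rough illustration of the frequency interpretation under this model, the following Python sketch simulates independent births with P(boy) = 0.512 and reports the relative frequency of boys for increasing numbers of trials. The function name `relative_frequency`, the fixed seed and the sample sizes are assumptions chosen for illustration; they are not part of the notes.

```python
import random


def relative_frequency(p, n, seed=0):
    """Proportion of successes in n independent trials.

    Each trial succeeds with probability p. The seed is fixed only so
    that repeated runs give the same numbers (an illustrative choice).
    """
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n


if __name__ == "__main__":
    # The relative frequency P_n(boy) should settle near 0.512 as n grows.
    for n in (100, 10_000, 100_000):
        print(n, relative_frequency(0.512, n))
```

For small n the simulated proportion can be noticeably off from 0.512; the agreement is only expected to become close for large n, which matches the wording "for large n" above.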
This model for births is equivalent to the sex of a child being determined by drawing at random with replacement from a box of 1000 tickets, containing 512 tickets marked "boy" and 488 tickets marked "girl".

III. Axioms of Probability (Section 2.3)

The frequency interpretation of probability is the way that many scientists think about what probability represents, but it is hard to make it into a rigorous mathematical definition of probability. Kolmogorov (1933) developed an axiomatic definition of probability which he then showed can be interpreted, in a certain sense, as the limit of the relative frequency in a large number of experiments.

A probability function (measure) on the events in a sample space is a function $P(E)$ on the events that satisfies the following three axioms:

Axiom 1: $0 \le P(E) \le 1$ for all events $E$.
Axiom 2: $P(S) = 1$, where $S$ is the sample space.
Axiom 3: For any sequence of mutually exclusive events $E_1, E_2, \ldots$ (that is, events for which $E_i \cap E_j = \emptyset$ when $i \ne j$),

$$P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i).$$

We refer to $P(E)$ as the probability of an event $E$.

Using these axioms, we shall be able to prove that if an experiment is repeated over and over again, then with probability 1, the proportion of times that a specific event $E$ occurs converges to $P(E)$, which is essentially the frequency interpretation of probability. This is called the strong law of large numbers and we shall prove it in Chapter 8.

Consequences of axioms:

1. $P(\emptyset) = 0$.
Proof: Consider the sequence of events $E_1, E_2, \ldots$, where $E_1 = S$ and $E_i = \emptyset$ for $i > 1$. Then, as the events are mutually exclusive and as $S = \bigcup_{i=1}^{\infty} E_i$, we have from Axiom 3 that

$$P(S) = \sum_{i=1}^{\infty} P(E_i) = P(S) + \sum_{i=2}^{\infty} P(\emptyset),$$

implying that $P(\emptyset) = 0$.

2. For any finite sequence of mutually exclusive events $E_1, \ldots, E_n$,

$$P\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i).$$

Proof: Let $E_i = \emptyset$ for $i > n$. The result follows from Axiom 3 combined with the fact established above that $P(\emptyset) = 0$.

IV. Examples of probability functions

Example 3: If a die is rolled and we suppose that all six sides are equally likely to appear, then we would have

$$P(\{1\}) = P(\{2\}) = P(\{3\}) = P(\{4\}) = P(\{5\}) = P(\{6\}) = \frac{1}{6}.$$

The probability of rolling an even number would equal, from Axiom 3,

$$P(\{2, 4, 6\}) = P(\{2\}) + P(\{4\}) + P(\{6\}) = \frac{1}{2}.$$

Example 4: A die is loaded in such a way that the probability of any particular face's showing is directly proportional to the number on that face. What is the probability that an even number appears?

To solve this requires that we make use of Axiom 2, that $P(S) = 1$. The experiment (tossing a die) generates a sample space containing six outcomes. But the six are not equally likely: by assumption,

$$P(\text{"}i\text{" face appears}) = P(i) = ki, \quad i = 1, \ldots, 6,$$

where $k$ is a constant. From Axiom 2,

$$\sum_{i=1}^{6} P(\text{"}i\text{" face appears}) = \sum_{i=1}^{6} ki = k \, \frac{6(6+1)}{2} = 21k = 1,$$

which implies that $k = 1/21$ and $P(\text{"}i\text{" face appears}) = i/21$. It follows then from Axiom 3 that the probability that an even number appears is

$$P(\text{even number}) = P(2) + P(4) + P(6) = \frac{2}{21} + \frac{4}{21} + \frac{6}{21} = \frac{12}{21} = \frac{4}{7}.$$

V. Probability as a Measure of Belief (Section 2.7)

Another interpretation of probability, besides the frequency interpretation, is that probability measures an individual's belief in the statement that he or she is making. This is called subjective or personal probability.

Consider the question, "What is the probability that the Philadelphia Eagles will win the Super Bowl this year?" It is hard to interpret such a probability using the frequency interpretation because the football season can only be played once. The subjective interpretation of a statement that the Eagles have a probability of 0.1 of winning the Super Bowl is that:
- If the person making the statement were offered a chance to play a game in which the person was required to pay less than 10 cents to buy into the game and would win $1 if the Eagles win the Super Bowl, then the person would buy into the game.
- By contrast, if the person making the statement were offered a chance to play a game in which the person was required to pay more than 10 cents to buy into the game and would win $1 if the Eagles win the Super Bowl, then the person would not buy into the game.

More generally, if $E$ is an event, a person's subjective probability $P(E)$ has the following interpretation: for a game in which the person will be paid $1 if $E$ occurs, $P(E)$ is the amount of money the person would be willing to pay to buy into the game. Thus, if the person is willing to pay 50 cents to buy in, $P(E) = 0.5$. Note that this concept of probability is personal: $P(E)$ may vary from person to person depending on their opinions.

A rational person has a "coherent" system of personal probabilities: a system is said to be "incoherent" if there exists some structure of bets such that the bettor will lose no matter what happens. It can be shown that a coherent system of personal probabilities requires that the personal probabilities satisfy Axioms 1, 2 and 3 (for details on this, see Hogg, McKean and Craig, Introduction to Mathematical Statistics, Chapter 11.1). Thus, whether the probability function is interpreted as a measure of belief or as a long-run relative frequency, its mathematical properties remain unchanged.

I personally think of probability in terms of the frequentist interpretation, but it is equally valid to view probability as a measure of belief; all results in the course are equally applicable to both interpretations.

VI. Propositions about Probability Function Based on Axioms (Section 2.4)

Proposition 4.1: $P(E^C) = 1 - P(E)$.

Proof: Because $E \cup E^C = S$, by Axiom 2 we have $P(E \cup E^C) = P(S) = 1$. Because $E$ and $E^C$ are mutually exclusive, it follows from Axiom 3 that $P(E \cup E^C) = P(E) + P(E^C)$. Thus, $P(E^C) = P(E \cup E^C) - P(E) = 1 - P(E)$.

Example 5: In a certain population, 10% of the people are rich, 5% are famous and 3% are rich and famous.
For a person picked at random from this population (meaning that each person has an equal probability of being picked), what is the chance that the person is not rich?

Proposition 4.2: If $E \subset F$ (meaning that every outcome in $E$ is contained in $F$), then $P(E) \le P(F)$.

Proof: Note that the event $F$ may be written in the form $F = E \cup (F \cap E^C)$, where $E$ and $F \cap E^C$ are mutually exclusive. Therefore, by Axiom 3, $P(F) = P(E) + P(F \cap E^C)$. By Axiom 1, $P(F \cap E^C) \ge 0$, so that $P(F) \ge P(E)$.

Furthermore, from the proof of Proposition 4.2, we have the difference rule that if $E \subset F$,

$$P(F \text{ and not } E) = P(F \cap E^C) = P(F) - P(E).$$

Example 5 continued: For a person picked at random from the population, what is the chance that the person is rich but not famous?

Proposition 4.3: $P(E \cup F) = P(E) + P(F) - P(E \cap F)$.

Proof: The Venn diagram suggests the statement of the proposition is true. More formally, we have from Axiom 3 that

$$P(E) = P(E \cap F^C) + P(E \cap F)$$
$$P(F) = P(E \cap F) + P(E^C \cap F)$$
$$P(E \cup F) = P(E \cap F^C) + P(E \cap F) + P(E^C \cap F)$$

From the first two equations, we have that

$$P(E \cap F^C) = P(E) - P(E \cap F), \quad P(E^C \cap F) = P(F) - P(E \cap F).$$

Substituting these expressions in the expression for $P(E \cup F)$, we conclude that $P(E \cup F) = P(E) + P(F) - P(E \cap F)$.

Note: Proposition 4.3 can be extended to provide an expression for $P(E_1 \cup E_2 \cup \cdots \cup E_n)$ (see Proposition 4.4, the inclusion-exclusion identity).

Example 5 continued: What is the chance that the randomly selected person is either rich or famous?

Example 6: Winthrop, a premed student, has been summarily rejected by all 126 U.S. medical schools. Desperate, he sends his transcripts and MCATs to the two least selective campuses he can think of, the two branch campuses ($X$ and $Y$) of Swampwater Tech. Based on the success his friends have had there, he estimates that his probability of being accepted at $X$ is 0.7, and at $Y$, 0.4.
He also suspects that there is a 75% chance that at least one of his applications will be rejected. What is the probability that at least one of the schools will accept him?
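The questions posed in Examples 5 and 6 can be worked through with Proposition 4.1, the difference rule and Proposition 4.3. The Python sketch below is one possible route, not necessarily the solution intended in class; the helper names (`complement`, `difference`, `union`) are hypothetical. For Example 6 it assumes that "at least one application is rejected" is the complement of "accepted at both X and Y", which is DeMorgan's law (a).

```python
from fractions import Fraction


def complement(p_e):
    """Proposition 4.1: P(E^C) = 1 - P(E)."""
    return 1 - p_e


def difference(p_f, p_e):
    """Difference rule: P(F and not E) = P(F) - P(E), valid when E is a subset of F."""
    return p_f - p_e


def union(p_e, p_f, p_both):
    """Proposition 4.3: P(E or F) = P(E) + P(F) - P(E and F)."""
    return p_e + p_f - p_both


# Example 5: 10% rich, 5% famous, 3% rich and famous.
p_rich = Fraction(10, 100)
p_famous = Fraction(5, 100)
p_rich_and_famous = Fraction(3, 100)

p_not_rich = complement(p_rich)
# "Rich and famous" is a subset of "rich", so the difference rule applies.
p_rich_not_famous = difference(p_rich, p_rich_and_famous)
p_rich_or_famous = union(p_rich, p_famous, p_rich_and_famous)

# Example 6: P(X) = 0.7, P(Y) = 0.4, P(at least one rejection) = 0.75.
p_x = Fraction(7, 10)
p_y = Fraction(4, 10)
p_some_rejection = Fraction(75, 100)

# Assumption: (X and Y)^C = X^C or Y^C, so P(X and Y) = 1 - P(some rejection).
p_both_accept = complement(p_some_rejection)
p_some_accept = union(p_x, p_y, p_both_accept)

if __name__ == "__main__":
    print("Example 5:", float(p_not_rich), float(p_rich_not_famous), float(p_rich_or_famous))
    print("Example 6:", float(p_some_accept))
```

Exact fractions are used instead of floats so that the arithmetic matches a hand calculation: 0.90, 0.07 and 0.12 for the three parts of Example 5, and 0.7 + 0.4 - 0.25 = 0.85 for Example 6.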
