3. Probability

3.1 Definitions

The probability of an event is a numerical measure of its likelihood. It is a number between 0 and 1, and the larger the number, the greater the likelihood. A probability of 0 means that the event certainly will not occur, and a probability of 1 means that it certainly will.

Before we get to the concepts of probability, we need to review some concepts from set theory. We start with the concept of an element. If we are concerned with all the employees of a company, then each employee, and nothing else, is an element. If we are concerned with the number of defectives among 100 parts that are being inspected, then each integer 0, 1, 2, ..., 100, and nothing else, is an element.

Definition 3.1.1: A set is a collection of elements. We denote sets by uppercase letters A, B, etc. and elements by lowercase letters x, y, etc. If x belongs to A, we write $x \in A$. If x does not belong to A, we write $x \notin A$. Two sets are equal if they contain the same collection of elements.

Definition 3.1.2: The universal set is one that contains all the elements. We denote the universal set by S.

Definition 3.1.3: The empty set is one that contains no elements. We denote the empty set by $\emptyset$.

Definition 3.1.4: If every element that belongs to A also belongs to B, then A is a subset of B. If A is a subset of B, then we write $A \subseteq B$.

Definition 3.1.5: The intersection of two sets A and B is the set of all elements that belong to both A and B. We denote the intersection of A and B as $A \cap B$.

Definition 3.1.6: The union of two sets A and B is the set of all elements that belong to A or B or both. We denote the union of A and B as $A \cup B$. Note that $A \cap B$ denotes elements in A and B, and $A \cup B$ denotes elements in A or B.

Definition 3.1.7: The complement of a set A is the set of all elements that do not belong to A. We denote the complement of A by $\bar{A}$. The following identities hold:

$A \cap \emptyset = \emptyset$; $A \cup \emptyset = A$; $A \cap S = A$; $A \cup S = S$; $\bar{S} = \emptyset$; $\bar{\emptyset} = S$; $A \cap \bar{A} = \emptyset$; $A \cup \bar{A} = S$.

Definition 3.1.8: Two sets A and B are said to be disjoint if they have no element in common.
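The set definitions above can be tried out with Python's built-in `set` type. This is only a minimal sketch: the universal set is a scaled-down version of the defectives example (5 parts instead of 100), and the sets `A` and `B` and the helper `complement` are hypothetical names introduced for illustration.

```python
# Universal set S: possible numbers of defectives among 5 inspected parts
# (a smaller, hypothetical version of the 100-part example in the text).
S = set(range(0, 6))
A = {0, 1, 2}   # hypothetical set: at most 2 defectives
B = {2, 3}      # hypothetical set: exactly 2 or 3 defectives
empty = set()   # the empty set


def complement(X, universe=S):
    """Return the elements of the universe that do not belong to X."""
    return universe - X


print(2 in A, 5 in A)        # membership: x in A, x not in A
print({0, 1} <= A)           # subset test (Definition 3.1.4)
print(A & B)                 # intersection (Definition 3.1.5)
print(A | B)                 # union (Definition 3.1.6)
print(complement(A))         # complement (Definition 3.1.7)
print(A.isdisjoint({4, 5}))  # disjoint test (Definition 3.1.8)

# The identities listed after Definition 3.1.7, checked for this A.
assert A & empty == empty and A | empty == A
assert A & S == A and A | S == S
assert complement(S) == empty and complement(empty) == S
assert A & complement(A) == empty and A | complement(A) == S
```

The complement is written as a function of the universal set because a Python set has no built-in notion of "everything else"; the universe has to be stated explicitly.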
We shall now relate these set theory concepts to probability theory. Suppose an experiment may result in any one of several outcomes. The outcomes then become the elements. In probability theory, elements are referred to as sample points, the universal set is referred to as the sample space, and a set of sample points is referred to as an event. We say that an event has occurred if any one of the sample points that belong to it has occurred.

Definition 3.1.9: If every sample point in a finite sample space is equally likely, then the probability of an event is the number of sample points that belong to the event divided by the number of sample points in the whole sample space.

Let P(A) denote the probability of event A, n(A) the number of sample points in A, and n(S) the number of sample points in S. We can then write the formula for P(A) as

$P(A) = n(A)/n(S)$   (3.1.1)

Suppose we toss a die. There are six sample points 1, 2, ..., 6, each of which is equally likely. Let A denote the event where an odd number shows up. Event A then contains three sample points: 1, 3, and 5. The probability of A is therefore 3/6 or 1/2.

Suppose we pick a card at random from a standard deck of playing cards. There are 52 sample points in the sample space, each one equally likely. Let A denote the event that we pick an ace. Then event A contains 4 sample points, and therefore the probability of A is 4/52 = 1/13.

Often, we need to calculate the probability of the intersection or union of events. Formula 3.1.1 can be extended for this purpose. We have

$P(A \cap B) = n(A \cap B)/n(S)$   (3.1.2)

$P(A \cap B)$ is also known as the joint probability of events A and B. Similarly,

$P(A \cup B) = n(A \cup B)/n(S)$   (3.1.3)

Noting that $n(\bar{A}) = n(S) - n(A)$, we can write $P(\bar{A}) = [n(S) - n(A)]/n(S)$, which gives

$P(\bar{A}) = 1 - P(A)$   (3.1.4)

Consider $n(A \cup B)$ in Equation 3.1.3. Can we replace it with [n(A) + n(B)]? No. If A and B have some common elements, then they will be counted twice in [n(A) + n(B)].
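The counting definition in Formula 3.1.1, and the double counting just described, can be checked with a short Python sketch. The die and card events follow the text; the second die event `high` (a roll of 4 or more) and the helper `prob` are assumptions introduced only for illustration.

```python
from fractions import Fraction


def prob(event, sample_space):
    """P(event) = n(event)/n(S); valid only for equally likely sample points."""
    return Fraction(len(event & sample_space), len(sample_space))


# Tossing a die: A is the event that an odd number shows up.
die = set(range(1, 7))
odd = {1, 3, 5}
p_odd = prob(odd, die)           # 3/6 = 1/2

# Picking a card: A is the event that we pick an ace.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = {(rank, suit) for rank in ranks for suit in suits}
aces = {card for card in deck if card[0] == "A"}
p_ace = prob(aces, deck)         # 4/52 = 1/13

# Complement rule (3.1.4): P(not odd) = 1 - P(odd).
assert prob(die - odd, die) == 1 - p_odd

# Double counting: the sample point 5 belongs to both events.
high = {4, 5, 6}                 # hypothetical second event
naive_count = len(odd) + len(high)   # counts the shared point 5 twice
actual_count = len(odd | high)       # n(odd or high)

print(p_odd, p_ace)
print(naive_count, actual_count)
```

Using `Fraction` rather than floating-point division keeps the results in the same form as the text (1/2, 1/13) and makes exact equality checks safe.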
To avoid this double counting, we can subtract $n(A \cap B)$ from [n(A) + n(B)] and get $n(A \cup B) = n(A) + n(B) - n(A \cap B)$. Dividing through by n(S), Equation 3.1.3 can then be rewritten as

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$   (3.1.5)

Equation 3.1.5 is very useful because the right-hand-side probabilities are often known and we need to calculate $P(A \cup B)$. This equation is also known as the Addition Rule for probabilities.

[Figure 3.1.1. A Venn diagram: the sample space S containing two overlapping events A and B, with each sample point drawn as a *.]

Let us apply the above formulas to the case depicted in the Venn diagram of Figure 3.1.1. We see the sample space S with two events A and B. Let each of the *'s represent a sample point, and assume that they are all equally likely. Now, n(A) = 8 and n(S) = 15. Hence, P(A) = 8/15. Similarly, P(B) = 6/15 or 2/5; $P(A \cap B)$ = 2/15; $P(A \cup B)$ = 12/15. Note that Equations 3.1.4 and 3.1.5 are satisfied; for instance, 8/15 + 6/15 - 2/15 = 12/15.

Next we shall see an important concept called conditional probability. Refer to the Venn diagram in Figure 3.1.1. When we are calculating P(A), let us say we are told that event B has occurred. In other words, the actual outcome is one of the *'s in B. Can we still say P(A) = 8/15? No. The sample points we can consider now are only those inside B. In effect, B has become our new sample space. Within this new sample space, which contains 6 sample points, 2 belong to A. Hence, P(A) given that event B has occurred is 2/6 or 1/3. We call this the conditional probability of A given B and denote it by $P(A \mid B)$. The event written after the vertical line | is the condition. We shall write this as our next formula.

$P(A \mid B) = n(A \cap B)/n(B)$   (3.1.6)

Combining Formulas 3.1.1, 3.1.2 and 3.1.6 (that is, dividing both $n(A \cap B)$ and n(B) by n(S)), we can write

$P(A \mid B) = P(A \cap B)/P(B)$   (3.1.7)

Formula 3.1.7 is the definition of conditional probability and the starting point for the famous Bayes' Rule, named after Thomas Bayes, an 18th-century English clergyman.
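The numbers quoted for Figure 3.1.1 can be reproduced with a small Python sketch. The figure gives only the counts, so the integer labels assigned to the sample points below are an assumption, chosen so that n(S) = 15, n(A) = 8, n(B) = 6 and the two events share 2 points; the helper `prob` is likewise a hypothetical name.

```python
from fractions import Fraction

# Hypothetical labelling of the 15 sample points in Figure 3.1.1.
S = set(range(15))
A = set(range(0, 8))     # 8 sample points
B = set(range(6, 12))    # 6 sample points; points 6 and 7 are shared with A


def prob(event, space=S):
    """Equally likely sample points: P(event) = n(event)/n(space)."""
    return Fraction(len(event), len(space))


p_a = prob(A)            # 8/15
p_b = prob(B)            # 6/15 = 2/5
p_joint = prob(A & B)    # 2/15
p_union = prob(A | B)    # 12/15 = 4/5

# Equation 3.1.4 (complement) and Equation 3.1.5 (Addition Rule).
assert prob(S - A) == 1 - p_a
assert p_union == p_a + p_b - p_joint

# Conditional probability two ways: by counting inside B (3.1.6)
# and by dividing probabilities (3.1.7).
p_a_given_b_counts = Fraction(len(A & B), len(B))
p_a_given_b_probs = p_joint / p_b
assert p_a_given_b_counts == p_a_given_b_probs

print(p_a, p_b, p_joint, p_union, p_a_given_b_probs)
```

Both routes to the conditional probability give 1/3, which is the point of Formula 3.1.7: the factor n(S) cancels, so the calculation can be done with probabilities alone when the raw counts are not available.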
Formula 3.1.7 can be rewritten with $P(A \cap B)$ on the left-hand side:

$P(A \cap B) = P(B) \cdot P(A \mid B)$   (3.1.8a)

Or, we may interchange A and B and write the above formula as:

$P(A \cap B) = P(A) \cdot P(B \mid A)$   (3.1.8b)

Formula 3.1.8 (a or b) is known as the Multiplication Rule of probabilities.

At times P(A) and $P(A \mid B)$ may be the same, which would imply that the occurrence of event B did not in any way influence the probability of event A. We then say A and B are independent events.

Definition 3.1.10: Two events A and B are said to be independent if $P(A) = P(A \mid B)$.

As an example, consider the experiment of selecting one adult at random from a community. The sample space S then contains all the adults in the community. Let A denote those with lung disease and B denote those who smoke. If P(A) is found to be equal to $P(A \mid B)$, then we have reason to believe that smoking does not influence the chances of one getting a lung disease. We would then declare that smoking and getting a lung disease are independent. If the two probabilities differ, we would declare them dependent. Thus, independence among events is an important practical concept.

It can be shown that if A and B are independent events, then the pairs $A$ & $\bar{B}$, $\bar{A}$ & $B$, and $\bar{A}$ & $\bar{B}$ are all independent pairs.

In Figure 3.1.1, we see that P(A) = 8/15 and $P(A \mid B)$ = 1/3. Hence A and B are not independent events.

An important consequence of independence is that the joint probability of two independent events is equal to the product of their individual probabilities. Substituting $P(A \mid B) = P(A)$ from the definition of independence into Formula 3.1.8a, we can write: when A and B are independent,

$P(A \cap B) = P(A) \cdot P(B)$   (3.1.9)

Formula 3.1.9 extends to any number of independent events:

$P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1) \cdot P(A_2) \cdots P(A_n)$   (3.1.10)

3.2 Cross Tabulation Template

Figure 3.2.1.
Cross Tabulation, Joint, Marginal & Conditional Probabilities [Workbook: Chap003.xls; Sheet: Crosstabs]

The template for calculating joint, marginal and conditional probabilities starting from a cross tabulation is shown in Figure 3.2.1. For the data shown in the figure, the marginal probability of Row 1 is 0.2599 and that of Col 1 is 0.3061. The conditional probability P(Row 1 | Col 1) = 0.2543 and the conditional probability P(Col 1 | Row 1) = 0.2995. You may enter short names in place of Row 1, Col 1, etc.

If the data you have is not a cross tabulation but a joint probability table, then enter the joint probabilities, as they are, in the cross tabulation data area. The template will still calculate all the probabilities correctly.

3.3 Exercises

1. Do exercises 3-36 to 3-46 in the textbook. [Hint: Construct and use a Joint Probability Table.]
2. Do exercises 3-49, 3-50 and 3-51 in the textbook.
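As a supplement to Section 3.2, the kind of calculation the cross tabulation template performs can be sketched in Python. This is not the workbook's implementation: the function names are hypothetical, and the 2 x 2 table of counts is made up for illustration; it is not the data behind Figure 3.2.1.

```python
from fractions import Fraction


def crosstab_probabilities(table):
    """Return (joint, row_marginals, col_marginals) for a table of counts.

    Each cell is divided by the grand total, so a table that already holds
    joint probabilities (grand total 1) is returned unchanged.
    """
    total = sum(sum(row) for row in table)
    joint = [[Fraction(cell) / total for cell in row] for row in table]
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    return joint, row_marginals, col_marginals


# Hypothetical cross tabulation: rows are Row 1, Row 2; columns are Col 1, Col 2.
counts = [[20, 30],
          [40, 10]]
joint, row_marg, col_marg = crosstab_probabilities(counts)

# Conditional probabilities from Formula 3.1.7.
p_row1_given_col1 = joint[0][0] / col_marg[0]   # P(Row 1 | Col 1)
p_col1_given_row1 = joint[0][0] / row_marg[0]   # P(Col 1 | Row 1)

# Independence check (Formula 3.1.9) for Row 1 and Col 1.
row1_col1_independent = joint[0][0] == row_marg[0] * col_marg[0]

# Entering the joint probability table instead of the counts gives the same result.
joint_again, _, _ = crosstab_probabilities(joint)
assert joint_again == joint

print(row_marg, col_marg)
print(p_row1_given_col1, p_col1_given_row1, row1_col1_independent)
```

Dividing every cell by the grand total is what makes a joint probability table an acceptable input, which matches the remark in Section 3.2 that the joint probabilities can be entered as they are.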