Chapter 1. Random events and probability

Definition: A probability is a number between 0 and 1 representing how likely it is that an event will occur. Probabilities can be:
1. Frequentist: the probability is based on observed frequencies of the event.
2. Subjective: the probability represents a person's degree of belief that an event will occur, e.g. "I think there is an 80% chance it will rain today", written as P(rain) = 0.80.

Definition: A random experiment is an experiment whose outcome is not known until it is observed.

Definition: A sample space, Ω, is the set of outcomes of a random experiment.

Definition: A sample point is an element of the sample space.

Experiment: Toss a coin twice and observe the result.
Sample space: Ω = {HH, HT, TH, TT}. An example of a sample point is HT.

Experiment: Toss a coin twice and count the number of heads.
Sample space: Ω = {0, 1, 2}.

Experiment: Toss a coin twice and observe whether the two tosses are the same (e.g. HH or TT).
Sample space: Ω = {same, different}.

Definition: An event is a subset of the sample space. Events will be denoted by capital letters A, B, C, ... .

Note: We say that event A occurs if the outcome of the experiment is one of the elements of A.

Example: Toss a coin twice. Sample space: Ω = {HH, HT, TH, TT}. Let A be the event that there is exactly one head. We write A = "exactly one head". Then A = {HT, TH}.

Note: A is a subset of Ω, as in the definition. We write A ⊂ Ω.

Definition: Event A occurs if we observe an outcome that is a member of the set A.

Note: Ω is a subset of itself, so Ω is an event. The empty set, ∅ = {}, is also a subset of Ω. It is called the null event: the event with no outcomes.

Example: Throw two dice.
Sample space: Ω = {(1, 1), (1, 2), ..., (1, 6), (2, 1), (2, 2), ..., (2, 6), ..., (6, 6)}.
Event A = "sum of the two faces is 5" = {(1, 4), (2, 3), (3, 2), (4, 1)}.

Definition: Let A and B be events on the same sample space Ω: so A ⊂ Ω and B ⊂ Ω. The union A ∪ B is the set of outcomes that are in A or in B (or both); the intersection A ∩ B is the set of outcomes that are in both A and B.

Definition: The complement of event A is written Ā and is given by Ā = {ω ∈ Ω : ω ∉ A}, the set of all outcomes in Ω that are not in A.

Example: Pick a person in this class at random. Sample space: Ω = {all people in class}. Let event A = "person is male" and event B = "person travelled by bike today". Suppose I pick a male who did not travel by bike. Then event A did occur, event B did not occur, and the complement B̄ did occur.

Venn diagrams are generally useful for up to 3 events, although they are not used to provide formal proofs. For more than 3 events, the diagram might not be able to represent all possible overlaps of events. (This was probably the case for our transport Venn diagram.)

The following properties hold for any sets A, B, and C (the distributive laws):
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)  and  A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

Definition: Two events A and B are mutually exclusive, or disjoint, if A ∩ B = ∅. This means events A and B cannot happen together. If A happens, it excludes B from happening, and vice versa.

Note: Does this mean that A and B are independent? No: quite the opposite. A EXCLUDES B from happening, so B depends strongly on whether or not A happens.

Definition: Any number of events A1, A2, ..., An are mutually exclusive if every pair of the events is mutually exclusive: i.e. Ai ∩ Aj = ∅ for all i ≠ j.

Definition: A partition of the sample space Ω is a collection of mutually exclusive events whose union is Ω.

If B1, B2, ..., Bk form a partition of Ω, then the events A ∩ B1, A ∩ B2, ..., A ∩ Bk form a partition of A. We will see that this is very useful for finding the probability of event A. This is because it is often easier to find the probability of small "chunks" of A (the partitioned sections) than to find the whole probability of A at once. The partition idea shows us how to add the probabilities of these chunks together: see later.
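To make the set language concrete, here is a minimal Python sketch (not part of the original notes; all variable names are illustrative only). It builds the two-dice sample space from the example above, checks the event A = "sum of two faces is 5" as a subset, and verifies that the chunks A ∩ Bi obtained from a partition of Ω cover A.

```python
from itertools import product

# Sample space for throwing two dice: all ordered pairs (i, j).
omega = set(product(range(1, 7), repeat=2))      # 36 sample points

# Event A = "sum of two faces is 5" is a subset of the sample space.
A = {pt for pt in omega if sum(pt) == 5}
print(sorted(A))            # [(1, 4), (2, 3), (3, 2), (4, 1)]
print(A <= omega)           # True: A is an event, i.e. a subset of omega

# Complement of A: all outcomes in omega that are not in A.
A_complement = omega - A
print(len(A) + len(A_complement) == len(omega))  # True

# The events B_i = "first die shows i" (i = 1, ..., 6) partition omega;
# the chunks A & B_i then partition A.
B = {i: {pt for pt in omega if pt[0] == i} for i in range(1, 7)}
chunks = [A & B[i] for i in range(1, 7)]
print(set().union(*chunks) == A)                 # True: the chunks cover A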
1.4 Probability

The final ingredient in the model for a random experiment is the specification of the probability of the events. It tells us how likely it is that a particular event will occur.

Definition: A probability P is a rule (function) which assigns a number P(A) to each event A, and which satisfies the following axioms:
Axiom 1: P(A) ≥ 0 for every event A.
Axiom 2: P(Ω) = 1.
Axiom 3: If A1, A2, ... are mutually exclusive events, then P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ... .

As a direct consequence of the axioms we have the following properties for P.

Theorem: Let A and B be events. Then:
1. P(∅) = 0.
2. P(Ā) = 1 − P(A).
3. If A ⊂ B, then P(A) ≤ P(B).
4. P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
5. P(A ∩ B̄) = P(A) − P(A ∩ B).
6. (Inclusion-exclusion.) If A1, A2, ..., An are n arbitrary events in Ω, then
P(A1 ∪ ... ∪ An) = Σ_{i=1}^{n} P(Ai) − Σ_{1≤i<j≤n} P(Ai ∩ Aj) + Σ_{1≤i<j<k≤n} P(Ai ∩ Aj ∩ Ak) − ... + (−1)^{n−1} P(A1 ∩ ... ∩ An).
7. If A1, A2, ..., An is a finite sequence of mutually exclusive events in Ω (Ai ∩ Aj = ∅ for i ≠ j), then P(A1 ∪ ... ∪ An) = Σ_{i=1}^{n} P(Ai).

Examples of basic probability calculations

300 Australians were asked about their car preferences in 1998. Of the respondents, 33% had children. The respondents were asked what sort of car they would like if they could choose any car at all. 13% of respondents had children and chose a large car. 12% of respondents did not have children and chose a large car.

Find the probability that a randomly chosen respondent:
(a) would choose a large car;
(b) either has children or would choose a large car (or both).

First formulate events: let C = "respondent has children" and L = "respondent would choose a large car".
Next write down all the information given: P(C) = 0.33, P(L ∩ C) = 0.13, P(L ∩ C̄) = 0.12.
(a) Asked for P(L). Since C and C̄ partition Ω, P(L) = P(L ∩ C) + P(L ∩ C̄) = 0.13 + 0.12 = 0.25.
(b) Asked for P(L ∪ C). P(L ∪ C) = P(L) + P(C) − P(L ∩ C) = 0.25 + 0.33 − 0.13 = 0.45.

Respondents were also asked their opinions on car reliability and fuel consumption. 84% of respondents considered reliability to be of high importance, while 40% considered fuel consumption to be of high importance.

Formulate events: R = "considers reliability of high importance", F = "considers fuel consumption of high importance".
Information given: P(R) = 0.84, P(F) = 0.40.
(d) We cannot calculate P(R ∩ F) from the information given.
(e) Given the further information that 12% of respondents considered neither reliability nor fuel consumption to be of high importance, find P(R ∪ F) and P(R ∩ F).
P(R ∪ F) is the probability that a respondent considers either reliability or fuel consumption, or both, of high importance: P(R ∪ F) = 1 − P(neither) = 1 − 0.12 = 0.88.
P(R ∩ F) is the probability that a respondent considers BOTH reliability AND fuel consumption of high importance: P(R ∩ F) = P(R) + P(F) − P(R ∪ F) = 0.84 + 0.40 − 0.88 = 0.36.

1.5 Conditional Probability

Conditioning is another of the fundamental tools of probability: probably the most fundamental tool. It is especially helpful for calculating the probabilities of intersections, such as P(A ∩ B), which themselves are critical for the useful Partition Theorem. Additionally, the whole field of stochastic processes is based on the idea of conditional probability: what happens next in a process depends, or is conditional, on what has happened beforehand.

Dependent events

Suppose A and B are two events on the same sample space. There will often be dependence between A and B. This means that if we know that B has occurred, it changes our knowledge of the chance that A will occur.

Example: Toss a die once. Let A = "the number showing is a 6" and B = "the number showing is even"; then P(A) = 1/6. However, if we know that B has occurred, then there is an increased chance that A has occurred: the chance of A given that B has occurred is 1/3.

Conditioning as reducing the sample space

The car survey in "Examples of basic probability calculations" also asked respondents which they valued more highly in a car: ease of parking, or style/prestige. The responses were tabulated separately for males and females. Suppose we pick a respondent at random from all those in the table. Let event A = "respondent thinks that prestige is more important". However, this probability differs between males and females. Suppose we reduce our sample space from Ω = {all respondents in the table} to the set of males only: within this reduced sample space, the chance of A is the proportion of males who think prestige is more important. This is our definition of conditional probability.
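The arithmetic in the car-survey example can be checked with a few lines of Python. This is only a sketch of the calculations above; the variable names are mine, not from the notes.

```python
# Car survey: C = "has children", L = "would choose a large car".
p_C = 0.33
p_L_and_C = 0.13       # children AND large car
p_L_and_notC = 0.12    # no children AND large car

# (a) C and its complement partition the sample space, so
#     P(L) = P(L and C) + P(L and not C).
p_L = p_L_and_C + p_L_and_notC
print(round(p_L, 2))           # 0.25

# (b) Inclusion-exclusion: P(L or C) = P(L) + P(C) - P(L and C).
p_L_or_C = p_L + p_C - p_L_and_C
print(round(p_L_or_C, 2))      # 0.45

# Reliability (R) and fuel consumption (F), part (e).
p_R, p_F = 0.84, 0.40
p_neither = 0.12                   # P(neither R nor F)
p_R_or_F = 1 - p_neither           # complement rule: 0.88
p_R_and_F = p_R + p_F - p_R_or_F   # rearranged inclusion-exclusion: 0.36
print(round(p_R_or_F, 2), round(p_R_and_F, 2))
```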
Definition: Let A and B be two events. The conditional probability that event A occurs, given that event B has occurred, is written P(A | B), and is given by
P(A | B) = P(A ∩ B) / P(B),   provided P(B) > 0.

Note: Follow the reasoning above carefully. It is important to understand why the conditional probability is the probability of the intersection within the new sample space. Conditioning on event B means changing the sample space to B. Think of P(A | B) as the chance of getting an A, from the set of B's only.

The Multiplication Rule

For any events A and B,
P(A ∩ B) = P(A | B) P(B) = P(B | A) P(A).

New statement of the Partition Theorem

The Multiplication Rule gives us a new statement of the Partition Theorem: if B1, B2, ..., Bm form a partition of Ω, then for any event A,
P(A) = Σ_{i=1}^{m} P(A ∩ Bi) = Σ_{i=1}^{m} P(A | Bi) P(Bi).
Both formulations of the Partition Theorem are very widely used, but especially the conditional formulation Σ P(A | Bi) P(Bi).

Examples of conditional probability and partitions

Tom gets the bus to campus every day. The bus is on time with probability 0.6, and late with probability 0.4. The sample space can be written as Ω = {bus journeys}. We can formulate events as follows: T = "on time"; L = "late". From the information given, the events have probabilities: P(T) = 0.6; P(L) = 0.4.

(a) Do the events T and L form a partition of the sample space? Explain why or why not.
Yes: they cover all possible journeys (their probabilities sum to 1), and there is no overlap in the events by definition.

The buses are sometimes crowded and sometimes noisy, both of which are problems for Tom as he likes to use the bus journeys to do his Stats assignments. When the bus is on time, it is crowded with probability 0.5. When it is late, it is crowded with probability 0.7. The bus is noisy with probability 0.8 when it is crowded, and with probability 0.4 when it is not crowded.

(b) Formulate events C and N corresponding to the bus being crowded and noisy. Do the events C and N form a partition of the sample space? Explain why or why not.
Let C = "crowded" and N = "noisy". C and N do NOT form a partition of Ω. It is possible for the bus to be noisy when it is crowded, so there must be some overlap between C and N.

(c) Write down probability statements corresponding to the information given above. Your answer should involve two statements linking C with T and L, and two statements linking N with C.
P(C | T) = 0.5; P(C | L) = 0.7; P(N | C) = 0.8; P(N | C̄) = 0.4.

(d) Find the probability that the bus is crowded.
By the Partition Theorem, P(C) = P(C | T) P(T) + P(C | L) P(L) = 0.5 × 0.6 + 0.7 × 0.4 = 0.58.

(e) Find the probability that the bus is noisy.
By the Partition Theorem, P(N) = P(N | C) P(C) + P(N | C̄) P(C̄) = 0.8 × 0.58 + 0.4 × 0.42 = 0.632.

Bayes' Theorem: inverting conditional probabilities

For events A and B with P(A) > 0,
P(B | A) = P(A | B) P(B) / P(A).

This is the simplest form of Bayes' Theorem, named after Thomas Bayes (1702-61), English clergyman and founder of Bayesian Statistics. Bayes' Theorem allows us to "invert" the conditioning, i.e. to express P(B | A) in terms of P(A | B). This is very useful. For example, it might be easy to calculate P(later event | earlier event), but we might only observe the later event and wish to deduce the probability that the earlier event occurred, P(earlier event | later event).

Full statement of Bayes' Theorem: let B1, B2, ..., Bm form a partition of Ω, and let A be any event with P(A) > 0. Then, for each j,
P(Bj | A) = P(A | Bj) P(Bj) / Σ_{i=1}^{m} P(A | Bi) P(Bi).

Example: The case of the Perfidious Gardener. Mr Smith owns a hysterical rosebush. It will die with probability 1/2 if watered, and with probability 3/4 if not watered. Worse still, Smith employs a perfidious gardener who will fail to water the rosebush with probability 2/3. Smith returns from holiday to find the rosebush . . . DEAD!! What is the probability that the gardener did not water it?
Let W = "watered" and D = "dead". Then
P(W̄ | D) = P(D | W̄) P(W̄) / [P(D | W̄) P(W̄) + P(D | W) P(W)] = (3/4 × 2/3) / (3/4 × 2/3 + 1/2 × 1/3) = (1/2) / (2/3) = 3/4.
So the gardener failed to water the rosebush with probability 3/4.
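Here is a short Python check of the Partition Theorem and Bayes' Theorem calculations in the bus and gardener examples (a sketch only; the variable names are mine, not from the notes).

```python
from fractions import Fraction

# Tom's bus: T = "on time", L = "late" partition the sample space.
p_T, p_L = 0.6, 0.4
p_C_given_T, p_C_given_L = 0.5, 0.7       # C = "crowded"
p_N_given_C, p_N_given_notC = 0.8, 0.4    # N = "noisy"

# (d) Partition Theorem over {T, L}: P(C) = P(C|T)P(T) + P(C|L)P(L).
p_C = p_C_given_T * p_T + p_C_given_L * p_L
print(round(p_C, 3))                      # 0.58

# (e) Partition Theorem over {C, not C}: P(N) = P(N|C)P(C) + P(N|not C)P(not C).
p_N = p_N_given_C * p_C + p_N_given_notC * (1 - p_C)
print(round(p_N, 3))                      # 0.632

# The perfidious gardener: W = "watered", D = "dead".
p_notW = Fraction(2, 3)                   # gardener fails to water
p_D_given_W = Fraction(1, 2)
p_D_given_notW = Fraction(3, 4)

# Bayes' Theorem: P(not watered | dead).
p_D = p_D_given_notW * p_notW + p_D_given_W * (1 - p_notW)
print(p_D_given_notW * p_notW / p_D)      # 3/4
```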
Example: The case of the Defective Ketchup Bottle. Ketchup bottles are produced in 3 different factories, accounting for 50%, 30%, and 20% of the total output respectively. The percentage of defective bottles from the 3 factories is respectively 0.4%, 0.6%, and 1.2%. A statistics lecturer who eats only ketchup finds a defective bottle in her wig. What is the probability that it came from Factory 1?
Let Fi = "bottle comes from Factory i" and D = "bottle is defective". By Bayes' Theorem,
P(F1 | D) = P(D | F1) P(F1) / [P(D | F1) P(F1) + P(D | F2) P(F2) + P(D | F3) P(F3)]
= (0.004 × 0.5) / (0.004 × 0.5 + 0.006 × 0.3 + 0.012 × 0.2) = 0.002 / 0.0062 ≈ 0.32.

More than two events

To find P(A1 ∩ A2 ∩ A3), we can apply the multiplication rule successively:
P(A1 ∩ A2 ∩ A3) = P(A3 | A1 ∩ A2) P(A2 | A1) P(A1).

Example: A box contains w white balls and r red balls. Draw 3 balls without replacement. What is the probability of getting the sequence white, red, white?
P(white, red, white) = w/(w + r) × r/(w + r − 1) × (w − 1)/(w + r − 2).

1.6 Probabilities from combinatorics: equally likely outcomes

Sometimes, all the outcomes in a discrete finite sample space are equally likely. This makes it easy to calculate probabilities. If Ω has N equally likely outcomes, then for any event A,
P(A) = (# outcomes in A) / (# outcomes in Ω) = |A| / N.

Example: For a 3-child family, possible outcomes from oldest to youngest are:
Ω = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}.
Let {p1, p2, ..., p8} be a probability distribution on Ω. If every baby is equally likely to be a boy or a girl, then all of the 8 outcomes in Ω are equally likely, so p1 = p2 = ... = p8 = 1/8. If an event A contains 4 of the 8 equally likely outcomes (for example, A = "the oldest child is a girl"), then A occurs with probability P(A) = 4/8 = 1/2.

Counting equally likely outcomes

The number of permutations, nPr, is the number of ways of selecting r objects from n distinct objects when different orderings constitute different choices:
nPr = n! / (n − r)!.
The number of combinations, nCr, is the number of ways of selecting r objects from n distinct objects when different orderings constitute the same choice:
nCr = n! / (r! (n − r)!).

Use the same rule on the numerator and the denominator. When P(A) = (# outcomes in A) / (# outcomes in Ω), we can often think about the problem either with different orderings constituting different choices, or with different orderings constituting the same choice. The critical thing is to use the same rule for both numerator and denominator.

Example:
(a) Tom has five elderly great-aunts who live together in a tiny bungalow. They insist on each receiving separate Christmas cards, and threaten to disinherit Tom if he sends two of them the same picture. Tom has Christmas cards with 12 different designs. In how many different ways can he select 5 different designs from the 12 designs available?
Order of cards is not important, so use combinations. The number of ways of selecting 5 distinct designs from 12 is 12C5 = 12! / (5! 7!) = 792.

(b) The next year, Tom buys a pack of 40 Christmas cards, featuring 10 different pictures with 4 cards of each picture. He selects 5 cards at random to send to his great-aunts. What is the probability that at least two of the great-aunts receive the same picture?
Let A = "at least two of the 5 cards show the same picture", so Ā = "all 5 cards show different pictures". Counting ordered selections, the number of ways in which all 5 cards show different pictures is 40 × 36 × 32 × 28 × 24, and the total number of outcomes is 40 × 39 × 38 × 37 × 36. (Note: order mattered in the numerator, so we need order to matter in the denominator too.) So
P(Ā) = (40 × 36 × 32 × 28 × 24) / (40 × 39 × 38 × 37 × 36) ≈ 0.392.
Thus P(A) = P(at least 2 cards are the same design) = 1 − P(Ā) ≈ 1 − 0.392 = 0.608.

1.7 Statistical Independence

Two events A and B are statistically independent if the occurrence of one does not affect the probability that the other occurs. This means that P(A | B) = P(A) and P(B | A) = P(B); equivalently,
P(A ∩ B) = P(A) P(B).
We use this last statement as our definition of statistical independence. For more than two events, we say: events A1, A2, ..., An are mutually independent if the probability of the intersection of every subset of the events equals the product of the corresponding individual probabilities, i.e. P(Ai1 ∩ ... ∩ Aik) = P(Ai1) × ... × P(Aik) for every choice of indices i1 < ... < ik.

Statistical independence for calculating the probability of an intersection

We usually have two choices.
1. IF A and B are statistically independent, then P(A ∩ B) = P(A) P(B).
2. If A and B are not known to be statistically independent, we usually have to use conditional probability and the multiplication rule: P(A ∩ B) = P(A | B) P(B). This still requires us to be able to calculate P(A | B).

Note: If events are physically independent, then they will also be statistically independent.

Pairwise independence does not imply mutual independence

Example: A jar contains 4 balls: one red, one white, one blue, and one that is red, white, and blue. Draw one ball at random. Let A = "the ball has red on it", B = "the ball has white on it", and C = "the ball has blue on it". Each colour appears on 2 of the 4 balls, so P(A) = P(B) = P(C) = 1/2. Any two colours appear together only on the multicoloured ball, so P(A ∩ B) = P(A ∩ C) = P(B ∩ C) = 1/4, and each of these equals the product of the corresponding pair of probabilities: A, B and C are pairwise independent. However, P(A ∩ B ∩ C) = 1/4, whereas P(A) P(B) P(C) = 1/8. So A, B and C are NOT mutually independent, despite being pairwise independent.
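To close, a Python sketch (again, not from the original notes; it assumes Python 3.8+ for math.perm) that reproduces the Christmas-card probability and checks the pairwise-but-not-mutual independence of the jar example by direct enumeration.

```python
from fractions import Fraction
from math import perm

# Christmas cards, part (b): 40 cards, 10 pictures, 4 of each; pick 5 at random.
# Ordered selections in both numerator and denominator (same rule for both).
p_all_different = Fraction(40 * 36 * 32 * 28 * 24, perm(40, 5))
print(round(float(p_all_different), 3))          # 0.392
print(round(1 - float(p_all_different), 3))      # 0.608

# Jar with 4 balls: red, white, blue, and one ball that is red, white and blue.
balls = [{"r"}, {"w"}, {"b"}, {"r", "w", "b"}]

def p_shows(colours):
    """P(the drawn ball shows every colour in `colours`)."""
    return Fraction(sum(1 for ball in balls if colours <= ball), len(balls))

pA, pB, pC = p_shows({"r"}), p_shows({"w"}), p_shows({"b"})   # each 1/2
print(p_shows({"r", "w"}) == pA * pB)            # True: pairwise independent
print(p_shows({"r", "b"}) == pA * pC)            # True
print(p_shows({"w", "b"}) == pB * pC)            # True
print(p_shows({"r", "w", "b"}) == pA * pB * pC)  # False: not mutually independent
```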