
Statistical Methods for the Social Sciences, Autumn 2012
Review Session 3: Probability. Exercises Ch.4. More on Stata
TA: Anastasia Aladysheva, [email protected]
Office hours: Mon 10:00-12:00, Rigot 10
October 10, 2012

Plan for the RS3
1. Some basic probability concepts: follow-up on Lecture 3;
2. Exercises from Chapter 4 of the textbook;
3. More on Stata: PS2

Basic Concepts of Probability Theory
Define:
- P: probability function
- Ω: sample space
- A: event, a subset A ⊆ Ω
A probability space is a pair (Ω, P) consisting of a set Ω and a function P which assigns to each subset A of Ω a real number P(A) in the interval [0, 1]. For an event A, the real number P(A) is called the probability of A.
Axiom:
- P(Ω) = 1
Example: "throw a die"
Consider the set Ω = {1, 2, 3, 4, 5, 6}. For each subset A of Ω, define
$P(A) = \frac{\text{number of elements in } A}{6}$.
Then the pair (Ω, P) is a probability space. One can view this probability space as a model for the situation "throw a die".

Enumerate the sample space: examples
Example 1: "flip a coin"
ANSWER: Ω = {heads, tails}
Example 2: "pick a ball"
An urn contains a red ball, a blue ball and a yellow ball. I pick one ball from the urn.
ANSWER: Ω = {blue, red, yellow}
Example 3: "pick a ball twice, without replacement"
ANSWER: Order matters! Ω = {(blue, red), (red, blue), (blue, yellow), (yellow, blue), (red, yellow), (yellow, red)}

More on Probability
- 0 ≤ P(A) ≤ 1
- P(A) = 0: the event will never occur;
- P(A) = 1: the event will definitely (100%) occur
- Full group of events (mutually exclusive and exhaustive): $P(A_1) + P(A_2) + \dots + P(A_n) = 1$
- Two opposite events (tails/heads of a coin): $P(A) + P(\bar{A}) = 1$

Calculate probability: simple examples
Example 1: The probability of rolling an even number on a fair die is:
$P(\text{rolling an even number}) = \frac{\text{number of even-numbered faces}}{\text{number of faces}} = \frac{3}{6} = \frac{1}{2}$
Example 2: Suppose there are 5 answer choices for a question in your exam. What is the probability that you choose the right answer by chance?
$P(\text{right answer}) = \frac{\text{number of right answers}}{\text{number of answers}} = \frac{1}{5}$
Example 3: What is the probability of picking a jack from a normal 52-card deck?
$P(\text{jack}) = \frac{\text{number of jacks}}{\text{number of cards}} = \frac{4}{52}$

More on Probability: when can we sum up and when can we multiply the probabilities?
1. We can sum up disjoint events: P(A + B) = P(A) + P(B).
Example: "There are 30 balls in the box: 10 red, 5 blue and 15 white. What's the probability of picking a coloured one?"
Answer: $P(\text{coloured ball}) = \frac{10}{30} + \frac{5}{30} = \frac{1}{2}$
This is an example of the probability of alternative occurrences/events. It happens when we want to know the probability that something, or another thing, happens.
2. Product of two events A · B: joint realization of two events.
Example 1: A: the cell phone has no defect; B: the cell phone is white; A · B: the cell phone has no defect and is white.
Example 2: "What is the probability of obtaining two sixes when rolling two fair dice?"
Answer: $P(\text{two sixes}) = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}$

Independent Events
Note: in Examples 1 and 2 above we considered independent events!
Two events A and B are called independent if
$P(A \cdot B) = P(A) \cdot P(B)$
Remember: two events are independent if the probability of one of them is not affected by knowing whether the other has occurred or not. P(A · B) is called the joint probability.
Example: throwing two dice (see Example 2): P(two sixes) = P(rolling 6) · P(rolling 6)

Another example on independent events
An urn contains 20 red balls and 10 blue balls. What is the probability that, when we pick two balls without replacement, we obtain exactly one red and one blue?
ANSWER: There are two ways of obtaining this outcome, R&B and B&R. The order is important, since we pick balls without replacement. These two ways of obtaining exactly one red and one blue ball are alternative (disjoint) events, so their probabilities add. Thus,
$P(\text{1 red and 1 blue}) = P(R\&B) + P(B\&R) = \frac{20}{30} \cdot \frac{10}{29} + \frac{10}{30} \cdot \frac{20}{29} = \frac{40}{87} \approx 0.46$
Here the events R&B and B&R are disjoint, which is why we can sum them. The two draws themselves, however, are not independent:
$P(R\&B) \neq P(R) \cdot P(B)$
$P(B\&R) \neq P(B) \cdot P(R)$
Why? Because once the first ball has been removed, the probabilities for the second draw change.

Conditional probability
Conditional probability is very important for cases in which events are not independent. In such cases, the probability of an event depends on the occurrence of some other event.
In general, for two events A and B, we write the probability of A given B as P(A | B). How do we compute it?
Bayes' formula
$P(A \mid B) = \frac{P(A \cdot B)}{P(B)}$, provided P(B) > 0.

Compare two examples
What is the probability of drawing two queens from a normal 52-card deck (a) with replacement? (b) without replacement?
ANSWER:
(a) With replacement: $P(\text{two queens}) = \frac{4}{52} \cdot \frac{4}{52}$
(b) Without replacement: $P(\text{two queens}) = P(\text{Q on 1st draw}) \cdot P(\text{Q on 2nd draw} \mid \text{Q on 1st draw}) = \frac{4}{52} \cdot \frac{3}{51}$

What is a Random variable?
Formal definition: Consider a probability space (Ω, P). A random variable is a map X from Ω into the set of real numbers R.
A random variable is characterised by a distribution function, expected value, variance and standard deviation.
A probability distribution tells you how likely each possible outcome is to be the value of the variable.

Example from Lecture 3 (slide 7)
Toss a fair coin twice. y is my random variable, defined as the number of heads that I get. There are 4 possible outcomes. Let's define the sample space:
Sample   (T, T)   (T, H)   (H, T)   (H, H)
y        0        1        1        2
Its probability distribution is defined as follows:
y        0        1        2
P(y)     1/4      1/2      1/4

Draw the distribution: discrete
y in "flipping a coin twice" is an example of a discrete random variable. We can draw its probability distribution in such a way:
[Figure: probability distribution of y]
This is an example of a binomial distribution.

Draw the distribution: continuous
What if y is a continuous random variable, i.e. can take on any real value? We can draw its probability distribution in such a way:
[Figure: standard normal curve with a coloured area under it]
This is an example of a standard normal distribution. The graph above represents the probability density function (pdf). The coloured area is the cumulative probability (cdf).
To get this area when we know that a random variable is standard normally distributed, we need to know the Z-score.

What is the difference between normal and standard normal distributions?
Recall that each probability distribution is characterized by its mean (µ) and standard deviation (σ). Beyond that, we know the law, or the pdf, of that distribution.
Normal distribution
For the normal we have the pdf:
$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(y-\mu)^2 / (2\sigma^2)}$
This function has exactly the bell shape and is symmetric, as we saw in the pictures in the lecture.
Standard Normal distribution
For the standard normal, plug µ = 0 and σ = 1 into the pdf above. The resulting function (I don't present it here) also has the bell shape, and is symmetric around 0 (see our previous picture).

Q. How can we calculate the probability (coloured area under the curve) of y if we do not have any software at hand?
1. Standardize y! How? Convert y to a Z-score using:
$Z = \frac{y - \mu}{\sigma}$
If y was normally distributed, Z is now standard normally distributed! (Proof: algebra, using the mathematical properties of expected value and variance.)
2. Use a table to look up the value of the probability!
3. Table A, p. 592 in the textbook, or use a handout table. Be careful: the handout gives the area to the left of the Z-score, while the textbook gives the area to the right! (For the same Z-score, the two values sum to 1.)

Why do we care about the normal distribution?
The Central Limit Theorem provides an answer!
Central Limit Theorem
For random sampling with a large sample size n, the sampling distribution of the sample mean $\bar{y}$ is approximately a normal distribution.
In other words, if we repeatedly select samples of size n from the population, and each time compute a particular statistic (in this case the sample mean $\bar{y}$), the values will vary from sample to sample around some mean. If we then draw the probability distribution of $\bar{y}$, the graph will resemble a normal distribution pdf.
Why is the result of the CLT useful? We can use the normal distribution to find probabilities about $\bar{y}$, other statistics, and point estimates.

Resources
- Textbook, Chapter 4
- Wikipedia
- Other references (more advanced): Hogg and Craig, Greene, Wooldridge appendix, Stock and Watson

Appendix: Bayes' formula (n events case) I
Formula of total probability
Theorem: Suppose event A can occur only if one of the events $B_1, B_2, \dots, B_n$, which constitute a full group, occurs. What's the probability that A occurs?
$P(A) = P(B_1) \cdot P_{B_1}(A) + P(B_2) \cdot P_{B_2}(A) + \dots + P(B_n) \cdot P_{B_n}(A)$
Here $P_{B_i}(A)$ denotes the probability of A given $B_i$.
(computation of probability by division into possible cases)
Example: In the French Open final, Federer plays the winner of the semifinal between Djokovic and Nadal. A bookmaker estimates that the probability of Djokovic winning the semifinal is 75%.
The probability that Federer can beat Djokovic is estimated to be 51%, whereas the probability that Federer can beat Nadal is estimated to be 80%. The bookmaker therefore computes the probability that Federer wins the French Open, using division into possible cases, as follows:
$P(\text{Federer wins the final}) = 0.75 \cdot 0.51 + 0.25 \cdot 0.8 = 0.5825$

Appendix: Bayes' formula (n events case) II
Derivation of Bayes' formula
We have:
$P(A) = P(B_1) \cdot P_{B_1}(A) + P(B_2) \cdot P_{B_2}(A) + \dots + P(B_n) \cdot P_{B_n}(A)$
Suppose that event A indeed occurred. How should we update our probabilities $P_A(B_1), P_A(B_2), \dots, P_A(B_n)$? Let's get $P_A(B_1)$. We know that the probability of the product of two events can be written in two ways:
$P(A \cdot B_1) = P(A) \cdot P_A(B_1) = P(B_1) \cdot P_{B_1}(A)$.
Therefore,
$P_A(B_1) = \frac{P(B_1) \cdot P_{B_1}(A)}{P(A)}$
or
$P_A(B_1) = \frac{P(B_1) \cdot P_{B_1}(A)}{P(B_1) \cdot P_{B_1}(A) + P(B_2) \cdot P_{B_2}(A) + \dots + P(B_n) \cdot P_{B_n}(A)}$

Appendix: Bayes' formula (n events case) III
Application of Bayes' formula
Example: Suppose that Bob can go to work by one of three modes of transportation: car, bus, or commuter train. Because of heavy traffic, if he decides to go by car, there is a 50% chance he will be late. If he goes by bus, which has special reserved lanes but is sometimes overcrowded, the probability of being late is only 20%. The commuter train is almost never late, with a probability of only 1%, but is more expensive than the bus.
Suppose that Bob is late one day, and his boss wishes to estimate the probability that he drove to work that day by car. Since he does not know which mode of transportation Bob usually uses, he gives a prior probability of 1/3 to each of the three possibilities. What is the boss's estimate of the probability that Bob drove to work?

Appendix: Bayes' formula (n events case) IV
Solution
$P(\text{car}) = P(\text{bus}) = P(\text{train}) = \frac{1}{3}$
$P(\text{late})_{\text{car}} = 0.5$, $P(\text{late})_{\text{bus}} = 0.2$, $P(\text{late})_{\text{train}} = 0.01$
where, for example, $P(\text{late})_{\text{car}}$ is the probability of being late given that Bob went by car.
We want to calculate $P(\text{car})_{\text{late}}$.
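By Bayes' formula, this is:
$P(\text{car})_{\text{late}} = \frac{P(\text{late})_{\text{car}} \cdot P(\text{car})}{P(\text{late})_{\text{car}} \cdot P(\text{car}) + P(\text{late})_{\text{bus}} \cdot P(\text{bus}) + P(\text{late})_{\text{train}} \cdot P(\text{train})} \approx 0.70$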
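
The two appendix computations (division into possible cases, then Bayes' formula) can be checked numerically. The following is only a minimal sketch in Python, which is not the software used in this course. The function names `total_probability` and `posterior` are illustrative choices, not taken from the slides. Exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """P(A) = sum over i of P(B_i) * P(A given B_i), for a full group B_1..B_n."""
    return sum(priors[b] * likelihoods[b] for b in priors)


def posterior(priors, likelihoods, hypothesis):
    """Bayes' formula: P(B_k given A) = P(B_k) * P(A given B_k) / P(A)."""
    p_a = total_probability(priors, likelihoods)
    return priors[hypothesis] * likelihoods[hypothesis] / p_a


# Bob's commute: prior of 1/3 for each mode, lateness probabilities from the example.
priors = {"car": Fraction(1, 3), "bus": Fraction(1, 3), "train": Fraction(1, 3)}
p_late = {"car": Fraction(1, 2), "bus": Fraction(1, 5), "train": Fraction(1, 100)}

p_car_given_late = posterior(priors, p_late, "car")
print(p_car_given_late, round(float(p_car_given_late), 2))  # 50/71 0.7

# Tennis example: division into possible cases (total probability only).
opponent = {"Djokovic": Fraction(3, 4), "Nadal": Fraction(1, 4)}
federer_beats = {"Djokovic": Fraction(51, 100), "Nadal": Fraction(4, 5)}

p_federer = total_probability(opponent, federer_beats)
print(float(p_federer))  # 0.5825
```

Because the three priors are equal, they cancel in the ratio, so the answer for Bob reduces to 0.5 / (0.5 + 0.2 + 0.01) = 0.5 / 0.71.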

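The drawing examples from earlier in the session (two queens from a 52-card deck; one red and one blue ball from an urn of 20 red and 10 blue) can be checked the same way. This is again a hedged sketch in Python with exact fractions, and the variable names are illustrative assumptions. It also shows why the product rule for independent events fails without replacement.

```python
from fractions import Fraction

# Two queens from a 52-card deck.
queens_with_replacement = Fraction(4, 52) * Fraction(4, 52)
# Without replacement: P(Q on 1st) * P(Q on 2nd given Q on 1st).
queens_without_replacement = Fraction(4, 52) * Fraction(3, 51)
print(queens_with_replacement, queens_without_replacement)  # 1/169 1/221

# Urn with 20 red and 10 blue balls, two draws without replacement.
p_red_then_blue = Fraction(20, 30) * Fraction(10, 29)
p_blue_then_red = Fraction(10, 30) * Fraction(20, 29)
# The two orders are disjoint events, so their probabilities add.
p_one_of_each = p_red_then_blue + p_blue_then_red
print(p_one_of_each)  # 40/87

# Unconditional probability that the second ball is blue (division into cases).
p_second_blue = (Fraction(20, 30) * Fraction(10, 29)
                 + Fraction(10, 30) * Fraction(9, 29))
print(p_second_blue)  # 1/3

# The draws are dependent: the joint probability is not the simple product.
print(p_red_then_blue == Fraction(20, 30) * p_second_blue)  # False
```

The last line compares 20/87 with 2/9: close, but not equal, which is the point of the "Why?" in the urn example.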