Statistical Methods for the Social Sciences, Autumn 2012
Review Session 3: Probability. Exercises Ch.4. More on Stata
TA: Anastasia Aladysheva
[email protected]
Office hours: Mon 10:00-12:00
Rigot 10
October 10, 2012
Plan for the RS3:
1. Some basic probability concepts: follow-up on Lecture 3;
2. Exercises Chapter 4 from the textbook;
3. More on Stata: PS2
Basic Concepts of Probability Theory
Define:
• P: probability function
• Ω: sample space
• A: event, a subset A ⊆ Ω
A probability space is a pair (Ω, P) consisting of a set Ω and a function P which
assigns to each subset A of Ω a real number P(A) in the interval [0, 1]. For an event
A, the real number P(A) is called the probability of A.
Axiom:
• P(Ω) = 1
Example: "throw a die"
Consider the set Ω = {1, 2, 3, 4, 5, 6}. For each subset A of Ω, define:
$P(A) = \frac{\text{number of elements in } A}{6}$
Then the pair (Ω, P) is a probability space. One can view
this probability space as a model for the situation "throw a die".
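A minimal sketch of the die example as code, assuming Python; the names `OMEGA` and `P` are chosen here for illustration and are not part of the slides. It checks the axiom P(Ω) = 1 on this particular probability space.

```python
from fractions import Fraction

OMEGA = frozenset({1, 2, 3, 4, 5, 6})  # sample space of one die throw

def P(A):
    """Probability of an event A, a subset of OMEGA: |A| / 6."""
    A = frozenset(A)
    if not A <= OMEGA:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(A), len(OMEGA))

print(P(OMEGA))        # 1, the axiom P(Ω) = 1
print(P({2, 4, 6}))    # 1/2
print(P(set()))        # 0
```

Exact fractions are used instead of floats so that the results match the hand calculation.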
Enumerate the sample space: examples
Example 1: "flip a coin"
ANSWER: Ω = {heads, tails}
Example 2: "pick a ball". An urn contains a red ball, a blue ball and a yellow ball. I
pick one ball from the urn.
ANSWER: Ω = {blue, red, yellow}
Example 3: "pick a ball twice, without replacement"
ANSWER: Order matters! Ω = {(blue, red), (red, blue), (blue, yellow), (yellow, blue),
(red, yellow), (yellow, red)}
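The enumeration in Examples 2 and 3 can also be done mechanically. The sketch below assumes Python and uses `itertools.permutations` to list the ordered pairs; the variable names are illustrative only.

```python
from itertools import permutations

# Urn from Example 2: one red, one blue and one yellow ball.
urn = ["blue", "red", "yellow"]

# Example 2: a single draw, so each ball is one outcome.
omega_single = set(urn)

# Example 3: two draws without replacement. Order matters, so the
# outcomes are ordered pairs of two different balls.
omega_twice = list(permutations(urn, 2))

print(sorted(omega_single))
for outcome in omega_twice:
    print(outcome)
print(len(omega_twice))  # 3 * 2 = 6 ordered pairs
```

With replacement, the same urn would give 3 · 3 = 9 ordered pairs, because a ball could be drawn twice.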
More on Probability
• 0 ≤ P(A) ≤ 1
• P(A) = 0: the event will never occur;
• P(A) = 1: the event will definitely (100%) occur
• Full group of events (mutually exclusive events A1, ..., An, exactly one of which must occur): P(A1) + P(A2) + ... + P(An) = 1
• Two opposite events A and Ā (e.g. tails/heads of a coin): P(A) + P(Ā) = 1
Calculate probability: simple examples
Example 1: The probability of rolling an even number on a fair die is:
$P(\text{rolling an even number}) = \frac{\text{number of even-numbered faces}}{\text{number of faces}} = \frac{3}{6} = \frac{1}{2}$
Example 2: Suppose there are 5 answer choices for a question in your exam. What is
the probability that you choose the right answer by chance?
$P(\text{right answer}) = \frac{\text{number of right answers}}{\text{number of answers}} = \frac{1}{5}$
Example 3: What is the probability of picking a jack in a normal 52-card deck?
$P(\text{jack}) = \frac{\text{number of jacks}}{\text{number of cards}} = \frac{4}{52} = \frac{1}{13}$
More on Probability: when can we sum up and when can we multiply the
probabilities?
1. We can sum up disjoint events: P(A + B) = P(A) + P(B), where A + B means "A or B occurs".
Example: "There are 30 balls in the box: 10 red, 5 blue and 15 white. What’s
the probability of picking a coloured (red or blue) one?"
Answer: $P(\text{coloured ball}) = \frac{10}{30} + \frac{5}{30} = \frac{1}{2}$
This is an example of the probability of alternative events: we want to know the
probability that one thing or another happens, and the two cannot happen at the
same time.
2. Product of two events A · B: joint realization of two events.
Example 1: A - the cell phone has no defect, B - the cell phone is white, A · B - the cell phone has no defect and is white.
Example 2: "What is the probability of obtaining two sixes when rolling two fair
dice?"
Answer: $P(\text{having two sixes}) = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}$
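Both rules can be checked by counting equally likely outcomes. The following is a sketch under the assumption that Python is available; the variable names are illustrative and not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Sum rule: the box from the example (10 red, 5 blue, 15 white).
box = ["red"] * 10 + ["blue"] * 5 + ["white"] * 15
p_red = Fraction(box.count("red"), len(box))
p_blue = Fraction(box.count("blue"), len(box))
p_coloured = p_red + p_blue  # "red" and "blue" are disjoint events

# Product rule: all 36 equally likely outcomes of two fair dice.
rolls = list(product(range(1, 7), repeat=2))
two_sixes = [r for r in rolls if r == (6, 6)]
p_two_sixes = Fraction(len(two_sixes), len(rolls))

print(p_coloured)    # 1/2
print(p_two_sixes)   # 1/36
print(p_two_sixes == Fraction(1, 6) * Fraction(1, 6))  # True
```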
Independent Events
Note: in the previous examples (Examples 1 and 2 under point 2) we considered independent events!
Two events A and B are called independent if P(A · B) = P(A) · P(B)
Remember
Two events are independent if the probability of one of them is not affected by
knowing whether the other has occurred or not.
P(A · B) is called the joint probability.
Example:
Throwing two dice (see Example 2 above): P(having two sixes) = P(rolling 6) · P(rolling 6)
Another example on independent events
An urn contains 20 red balls, and 10 blue balls. What is the probability that, when we
pick two balls without replacement, we obtain exactly one red and one blue?
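ANSWER:
There are two ways of obtaining this outcome, i.e. R&B (red first, then blue) and B&R
(blue first, then red). The order is important, since we pick balls without replacement.
These two ways of obtaining exactly one red and one blue ball are alternative (disjoint)
events, so their probabilities add up. Thus,
$P(\text{1 red and 1 blue}) = P(R\&B) + P(B\&R) = \frac{20}{30} \cdot \frac{10}{29} + \frac{10}{30} \cdot \frac{20}{29} = \frac{40}{87} \approx 0.46$
Here the events R&B and B&R are disjoint, which is why we can add them. However, the two draws are not independent:
$P(R\&B) \neq P(R) \cdot P(B)$
$P(B\&R) \neq P(B) \cdot P(R)$
Why?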
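One way to explore the "Why?" is to enumerate every ordered pair of draws. This is a sketch, assuming Python; the ball labelling and the helper `prob` are hypothetical and only serve the illustration.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labelling: balls 0..19 are red, balls 20..29 are blue.
urn = ["R"] * 20 + ["B"] * 10

# All ordered draws of two different balls (without replacement);
# each of the 30 * 29 ordered pairs is equally likely.
draws = list(permutations(range(len(urn)), 2))

def prob(event):
    """Probability of an event given as a predicate on (first, second) colours."""
    hits = sum(1 for i, j in draws if event(urn[i], urn[j]))
    return Fraction(hits, len(draws))

p_rb = prob(lambda first, second: (first, second) == ("R", "B"))
p_br = prob(lambda first, second: (first, second) == ("B", "R"))
p_one_each = p_rb + p_br  # disjoint events, so probabilities add

# Unconditional probabilities of a red first ball and a blue second ball.
p_red_first = prob(lambda first, second: first == "R")
p_blue_second = prob(lambda first, second: second == "B")

print(p_rb, p_br, p_one_each)               # 20/87 20/87 40/87
print(p_rb == p_red_first * p_blue_second)  # False: the draws are dependent
```

The product of the unconditional probabilities is 2/3 · 1/3 = 2/9, which differs from 20/87 because removing the first ball changes the composition of the urn for the second draw.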
Conditional probability
Conditional probability is very important for cases in which events are not independent.
In such cases, the probability of an event depends on the occurrence of some other
event. In general, for two events A and B, we say that the probability of A given B is:
P(A | B)
How to compute?
Bayes’ formula
$P(A \mid B) = \frac{P(A \cdot B)}{P(B)}$
Refer to this page
Compare two examples:
What is the probability of drawing two queens in a normal 52-card deck
(a) with replacement?
(b) without replacement?
ANSWER:
(a) With replacement:
$P(\text{two queens}) = \frac{4}{52} \cdot \frac{4}{52} = \frac{1}{169}$
(b) Without replacement:
$P(\text{two queens}) = P(\text{Q on 1st draw}) \cdot P(\text{Q on 2nd draw} \mid \text{Q on 1st draw}) = \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221}$
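The two cases differ only in the second factor. A short sketch, assuming Python with exact fractions; the constant and variable names are illustrative.

```python
from fractions import Fraction

QUEENS, CARDS = 4, 52

# (a) With replacement: the deck is restored, so the second draw
# has the same probability as the first (independent draws).
p_with = Fraction(QUEENS, CARDS) * Fraction(QUEENS, CARDS)

# (b) Without replacement: after one queen is drawn, 3 queens remain
# among 51 cards, which gives P(Q on 2nd draw | Q on 1st draw).
p_second_given_first = Fraction(QUEENS - 1, CARDS - 1)
p_without = Fraction(QUEENS, CARDS) * p_second_given_first

print(p_with)     # 1/169
print(p_without)  # 1/221
```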
What is a Random variable?
Formal definition: Consider a probability space (Ω, P). A random variable is a map X
from Ω into the set of real numbers R.
A random variable is characterised by a distribution function, expected value,
variance, standard deviation.
The probability distribution tells you how likely each possible outcome is to be the value
of the variable.
Example from Lecture 3 (slide 7)
Toss a fair coin twice
y is my random variable, defined as the number of heads that I get.
There are 4 possible outcomes. Let’s define the sample space:
Sample    y
(T, T)    0
(T, H)    1
(H, T)    1
(H, H)    2
Its probability distribution is defined as follows:
y       0     1     2
P(y)    1/4   1/2   1/4
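The table can be derived from the sample space by counting. The sketch below assumes Python; the function `y` mirrors the random variable on the slide, and the other names are illustrative. It also computes the expected value and variance mentioned earlier as characteristics of a random variable.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin: 4 equally likely outcomes.
omega = list(product("TH", repeat=2))

# The random variable y maps each outcome to its number of heads.
def y(outcome):
    return outcome.count("H")

# Probability distribution of y: count outcomes per value, divide by |Ω|.
counts = Counter(y(w) for w in omega)
distribution = {value: Fraction(n, len(omega)) for value, n in sorted(counts.items())}

# Expected value and variance follow from the distribution.
mean = sum(value * p for value, p in distribution.items())
variance = sum((value - mean) ** 2 * p for value, p in distribution.items())

for value, p in distribution.items():
    print(value, p)
print(mean, variance)  # 1 1/2
```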
Draw the distribution: discrete
y in "flipping a coin twice" is an example of a discrete random variable.
We can draw its probability distribution as a bar chart, with bars of height 1/4, 1/2 and 1/4 at y = 0, 1 and 2.
This is an example of a binomial distribution.
Draw the distribution: continuous
What if y is a continuous random variable, i.e. can take on any real value?
We can draw its probability distribution as a curve; the graph on this slide shows
a standard normal distribution.
The curve is the probability density function (pdf). The coloured area under the curve
is a cumulative probability, i.e. a value of the cumulative distribution function (cdf).
To find it for a random variable that is standard normally distributed, we need to know the Z-score.
What is the difference between normal and standard normal distributions?
Recall that each probability distribution is characterized by its mean (µ) and standard
deviation (σ). In addition, we know the law, or the pdf, of that distribution.
Normal distribution
For the normal distribution we have the pdf:
$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(y-\mu)^2 / (2\sigma^2)}$
This function has the bell shape and is symmetric around µ, like we saw in the pictures
in the lecture.
Standard Normal distribution
For the standard normal, plug µ = 0 and σ = 1 into the pdf above. The resulting
function (I don’t present it here) also has the bell shape, and is symmetric
around 0 (see our previous picture).
Q. How can we calculate the probability (coloured area under the curve) of
y if we do not have any software at hand?
1. Standardize y! How? Convert y to a Z-score using:
$Z = \frac{y - \mu}{\sigma}$
If y is normally distributed, Z is standard normally distributed!
(Proof: algebra, using the mathematical properties of expected value and
variance)
2. Use a table to look up the value of the probability!
3. Table A p. 592 in the textbook, or use a handout table.
Be careful: the handout gives the area to the left of
the Z-score, while the textbook gives the area to the right! (For the same Z-score,
the two values sum to 1.)
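When software is available, a table lookup can be cross-checked. The sketch below assumes Python and uses the error function `math.erf` from the standard library to get the standard normal cdf; the function names and the numbers (y = 130, µ = 100, σ = 15) are hypothetical and not taken from the textbook.

```python
import math

def z_score(y, mu, sigma):
    """Standardize y: Z = (y - mu) / sigma."""
    return (y - mu) / sigma

def area_left(z):
    """Standard normal cdf, i.e. the area to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_right(z):
    """Upper-tail area to the right of z."""
    return 1.0 - area_left(z)

# Hypothetical numbers: y = 130 from a normal distribution with
# mean 100 and standard deviation 15.
z = z_score(130, mu=100, sigma=15)
print(z)                         # 2.0
print(round(area_left(z), 4))    # 0.9772
print(round(area_right(z), 4))   # 0.0228
```

`area_left` corresponds to a table of left-hand areas and `area_right` to a table of right-hand areas, so the two printed values sum to 1.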
Why do we care about the normal distribution?
The Central Limit Theorem provides an answer!
Central Limit Theorem
For random sampling with a large sample size n, the sampling distribution of the
sample mean ȳ is approximately a normal distribution.
In other words,
If we repeatedly select samples of size n from the population, and each time compute a
particular statistic (in this case the sample mean ȳ), the statistic will vary from sample
to sample around some mean value. If we then draw the probability distribution of ȳ, the graph will
resemble the pdf of a normal distribution.
Why is the result of the CLT useful?
We can use the normal distribution to find probabilities about ȳ, other statistics, and
point estimates.
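The repeated-sampling idea can be simulated. This is only an illustrative sketch, assuming Python; the population (rolls of a fair die), the sample size n = 50 and the 5000 repetitions are arbitrary choices made here, not values from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_mean(n):
    """Mean of n rolls of a fair die (a clearly non-normal population)."""
    return statistics.fmean([random.randint(1, 6) for _ in range(n)])

n = 50           # sample size (assumed "large")
repeats = 5000   # number of repeated samples
means = [sample_mean(n) for _ in range(repeats)]

# Population mean is 3.5 and population sd is sqrt(35/12); the CLT
# suggests the sample means cluster around 3.5 with sd about sigma/sqrt(n).
sigma = (35 / 12) ** 0.5
centre = statistics.fmean(means)
spread = statistics.stdev(means)
print(round(centre, 2), round(spread, 3), round(sigma / n ** 0.5, 3))

# For a normal distribution about 95% of values lie within 1.96 sd.
inside = sum(abs(m - centre) <= 1.96 * spread for m in means) / repeats
print(round(inside, 3))
```

If the sample means are approximately normal, the last printed share should be close to 0.95, even though a single die roll is uniformly distributed.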
Resources
• Textbook, Chapter 4
• Wikipedia
• Other references (more advanced): Hogg and Craig, Greene, Wooldridge
appendix, Stock and Watson
Appendix: Bayes’ formula (n events case) I
Formula of total probability
Theorem
Suppose event A can occur only if one of the events B1, B2, ..., Bn, which constitute a full
group, occurs. What’s the probability that A occurs?
$P(A) = P(B_1) \cdot P_{B_1}(A) + P(B_2) \cdot P_{B_2}(A) + \dots + P(B_n) \cdot P_{B_n}(A)$
(computation of probability by division into possible cases), where $P_{B_i}(A)$ is the
conditional probability of A given $B_i$.
Example:
In the French Open final, Federer plays the winner of the semifinal between Djokovic and Nadal. A bookmaker
estimates that the probability of Djokovic winning the semifinal is 75%. The probability that Federer can beat Djokovic
is estimated to be 51%, whereas the probability that Federer can beat Nadal is estimated to be 80%. The
bookmaker therefore computes the probability that Federer wins the French Open, using division into possible
cases, as follows: $P(\text{Federer wins the final}) = 0.75 \cdot 0.51 + 0.25 \cdot 0.8 = 0.5825$
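The bookmaker's computation is a weighted sum over the possible cases. A minimal sketch, assuming Python; the function `total_probability` and the dictionary names are hypothetical helpers written for this illustration.

```python
def total_probability(priors, conditionals):
    """P(A) = sum over i of P(B_i) * P_{B_i}(A) for a full group B_1..B_n."""
    if abs(sum(priors.values()) - 1.0) > 1e-9:
        raise ValueError("the events B_i must form a full group (sum to 1)")
    return sum(priors[b] * conditionals[b] for b in priors)

# B_i: who wins the semifinal; A: Federer wins the final.
p_opponent = {"Djokovic": 0.75, "Nadal": 0.25}
p_federer_beats = {"Djokovic": 0.51, "Nadal": 0.80}

p_federer_wins = total_probability(p_opponent, p_federer_beats)
print(round(p_federer_wins, 4))  # 0.5825
```

The check on the priors reflects the requirement that the events B1, ..., Bn form a full group.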
Appendix: Bayes’ formula (n events case) II
Derivation of Bayes’ formula
We have: $P(A) = P(B_1) \cdot P_{B_1}(A) + P(B_2) \cdot P_{B_2}(A) + \dots + P(B_n) \cdot P_{B_n}(A)$
Suppose that event A indeed occurred. How should we update the probabilities of the
hypotheses B1, B2, ..., Bn, i.e. what are $P_A(B_1), P_A(B_2), \dots, P_A(B_n)$?
Let’s get $P_A(B_1)$. The probability of the product of two events can be written in two ways:
$P(A \cdot B_1) = P(A) \cdot P_A(B_1) = P(B_1) \cdot P_{B_1}(A)$
Therefore,
$P_A(B_1) = \frac{P(B_1) \cdot P_{B_1}(A)}{P(A)}$
or
$P_A(B_1) = \frac{P(B_1) \cdot P_{B_1}(A)}{P(B_1) \cdot P_{B_1}(A) + P(B_2) \cdot P_{B_2}(A) + \dots + P(B_n) \cdot P_{B_n}(A)}$
Appendix: Bayes’ formula (n events case) III
Application of Bayes’ formula
Example:
Suppose that Bob can decide to go to work by one of three modes of transportation,
car, bus, or commuter train. Because of high traffic, if he decides to go by car, there is
a 50% chance he will be late. If he goes by bus, which has special reserved lanes but is
sometimes overcrowded, the probability of being late is only 20%. The commuter
train is almost never late, with a probability of only 1%, but is more expensive than
the bus.
Suppose that Bob is late one day, and his boss wishes to estimate the probability that
he drove to work that day by car. Since he does not know which mode of
transportation Bob usually uses, he gives a prior probability of 1/3 to each of the three
possibilities. What is the boss’s estimate of the probability that Bob drove to work?
Appendix: Bayes’ formula (n events case) IV
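Solution
$P(\text{bus}) = P(\text{car}) = P(\text{train}) = \frac{1}{3}$
$P(\text{late})_{\text{car}} = 0.5$
$P(\text{late})_{\text{bus}} = 0.2$
$P(\text{late})_{\text{train}} = 0.01$
We want to calculate $P(\text{car})_{\text{late}}$. By Bayes’ formula, this is:
$P(\text{car})_{\text{late}} = \frac{P(\text{late})_{\text{car}} \cdot P(\text{car})}{P(\text{late})_{\text{car}} \cdot P(\text{car}) + P(\text{late})_{\text{bus}} \cdot P(\text{bus}) + P(\text{late})_{\text{train}} \cdot P(\text{train})} = \frac{0.5 \cdot \frac{1}{3}}{0.5 \cdot \frac{1}{3} + 0.2 \cdot \frac{1}{3} + 0.01 \cdot \frac{1}{3}} = \frac{0.5}{0.71} \approx 0.70$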
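The same calculation can be written as a small function that returns the updated probability of every hypothesis at once. This is a sketch, assuming Python; `posterior` and the dictionary names are hypothetical and not part of the slides.

```python
def posterior(priors, likelihoods):
    """Bayes' formula for a full group of hypotheses B_1..B_n."""
    # Denominator: total probability of the observed event.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

priors = {"car": 1 / 3, "bus": 1 / 3, "train": 1 / 3}
p_late_given = {"car": 0.5, "bus": 0.2, "train": 0.01}

result = posterior(priors, p_late_given)
for mode, p in result.items():
    print(mode, round(p, 4))
# car 0.7042
# bus 0.2817
# train 0.0141
```

Because the three priors are equal, they cancel, and the result for the car reduces to 0.5 / 0.71.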