Lecture 4
Applied Data Analysis
Spring 2017
Course information: TA office hours
Karen Albert
[email protected]
Thursdays, 4-5 PM (Hark 302)
Lecture outline
1. Why probability?
2. What is probability?
3. The axioms
4. Results derived from the axioms
Why probability?
• So you don’t play online poker and spend the rest of your
life in unremitting poverty.
• So the next time someone offers to sell you a lottery ticket,
you reply that lotteries are “a tax on the stupid.”
• More seriously, we need probability to make inferences.
Inference
That which is inferred, a conclusion drawn from data or
premisses. Also, an implication; the conclusion that one is
intended to draw.
“To draw inference has been said to be the great business of
life.”
from J. S. Mill, A System of Logic
Also said,
“Conservatives are not necessarily stupid, but most stupid
people are conservatives.”
Example
Suppose we want to know the percentage of Americans who
approve of this buffoon President. How could we go about
answering this question?
Method 1
We could ask all Americans.
Two problems:
1. Americans are germy.
2. This procedure is prohibitively expensive.
Method 2
We could take a small sample from that population, and then
make an inference from that sample to the wider population.
We need probability for both steps in method 2. We need it to
help collect a representative sample and in order to make
inferences to the wider population.
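As a rough illustration of Method 2, the sketch below draws a simple random sample from a made-up population and uses the sample share as an estimate of the population share. The population size, the 42% approval rate, the sample size, and the function name `estimate_approval` are all assumptions chosen for illustration, not figures from the lecture.

```python
import random

def estimate_approval(population, sample_size, rng):
    """Estimate the approval share from a simple random sample."""
    # rng.sample draws without replacement, so every person has an equal chance.
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

rng = random.Random(2017)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = approves, 0 = does not. The 42% rate is invented.
population = [1] * 42_000 + [0] * 58_000
true_share = sum(population) / len(population)

estimate = estimate_approval(population, 1_000, rng)
print(f"true share = {true_share:.3f}, estimate from 1,000 people = {estimate:.3f}")
```

Asking 1,000 people instead of 100,000 gives an answer that is close to, but not exactly, the true share. Probability is what lets us say how close.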
Let’s think about chance
What is the difference between the following two claims?
1. Karen has a 70% chance (let’s be optimistic) of landing a
tenure track job in political science.
2. The chance of the ball landing on red when the roulette
wheel is spun is 18/38.
The roulette wheel can be spun over and over again under
exactly the same conditions. The same is not true for getting a
job.
Two doctrines of chance
1. Subjective or personal probability
The probability of an uncertain event happening is the
“degree of belief” in the event held by the individual given
their experience and information.
2. Objective or frequentist probability
The chance of something gives the percentage of the time
it is expected to happen when the process is done over
and over again under the same conditions.
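The frequentist definition can be illustrated with a small simulation of the roulette claim from earlier. This is only a sketch: it assumes an American wheel with 38 equally likely pockets, 18 of them red, and the function name `red_frequency` is made up for the example.

```python
import random

def red_frequency(spins, rng):
    """Share of spins that land on red, out of `spins` simulated spins."""
    # Pockets 0..17 stand in for the 18 red pockets; the labelling is arbitrary.
    reds = sum(1 for _ in range(spins) if rng.randrange(38) < 18)
    return reds / spins

rng = random.Random(4)  # fixed seed so the run is repeatable
for spins in (100, 10_000, 100_000):
    print(f"{spins:>7} spins: share of red = {red_frequency(spins, rng):.4f}")
print(f"18/38 = {18 / 38:.4f}")
```

With more spins the observed share of red should settle near 18/38. No comparable repetition is available for the job-market claim, which is why it calls for the subjective reading.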
Random experiments
A random experiment is a chance mechanism that satisfies
the following conditions:
1. all possible outcomes are known a priori
2. in any particular trial, the outcome is not known a priori but
there exists a discernible regularity of occurrence
3. it can be repeated under identical conditions
Examples:
• Tossing a coin
• Randomly choosing voters from a population
The axiomatic foundations of probability
Andrey Kolmogorov (1903-1987)
The axioms of probability
Axiom 1 Pr(S) = 1, where S is the set of all possible outcomes (the sample space)
Axiom 2 Pr(A) ≥ 0, for any event A
Axiom 3 If events A and B are mutually exclusive,
Pr(A ∪ B) = Pr(A) + Pr(B)
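To make the axioms concrete, the sketch below checks them on one small example. The setup is an assumption for illustration: the three colours on an American roulette wheel (18 red, 18 black, 2 green pockets out of 38), with the names `WEIGHTS` and `pr` invented here. Because `pr` adds up outcome probabilities, this is a check on a single example, not a proof.

```python
from fractions import Fraction

# Assumed chance setup: colours on an American roulette wheel with 38 pockets
# (18 red, 18 black, 2 green), each pocket equally likely.
WEIGHTS = {
    "red": Fraction(18, 38),
    "black": Fraction(18, 38),
    "green": Fraction(2, 38),
}
S = frozenset(WEIGHTS)  # sample space: every possible outcome

def pr(event):
    """Probability of an event, i.e. a subset of the sample space S."""
    return sum((WEIGHTS[outcome] for outcome in event), Fraction(0))

red, black = frozenset({"red"}), frozenset({"black"})

axiom_1 = pr(S) == 1                                 # Pr(S) = 1
axiom_2 = all(pr({outcome}) >= 0 for outcome in S)   # Pr(A) >= 0
axiom_3 = red.isdisjoint(black) and pr(red | black) == pr(red) + pr(black)
print(axiom_1, axiom_2, axiom_3)
```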
Mutually exclusive
Two events are mutually exclusive if they have no outcomes in
common.
Mutually exclusive:
• heads or tails
• male or female
• drawing a king or an ace
Not mutually exclusive:
• drawing a king and a heart
• turning left and scratching your...head
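If events are represented as sets of outcomes, "no outcomes in common" becomes a disjointness test. The sketch below assumes a standard 52-card deck encoded as (rank, suit) pairs; the encoding and the helper name `mutually_exclusive` are choices made for this example.

```python
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(RANKS, SUITS))  # 52 (rank, suit) pairs

kings = {card for card in deck if card[0] == "K"}
aces = {card for card in deck if card[0] == "A"}
hearts = {card for card in deck if card[1] == "hearts"}

def mutually_exclusive(a, b):
    """Two events are mutually exclusive when they share no outcomes."""
    return a.isdisjoint(b)

print(mutually_exclusive(kings, aces))    # True: no card is both a king and an ace
print(mutually_exclusive(kings, hearts))  # False: the king of hearts is in both
print(kings & hearts)                     # the shared outcome
```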
Axiom 3 (a.k.a. the addition rule)
If the two events are mutually exclusive (disjoint means the
same thing), you can add their probabilities.
The probability of drawing a king is 1/13.
The probability of drawing an ace is 1/13.
The probability of drawing a king or an ace is 2/13.
\[
\Pr(K \cup A) = \Pr(K) + \Pr(A) = \frac{1}{13} + \frac{1}{13} = \frac{2}{13}
\]
Theorems of probability
We get the theorems of probability by mathematical deduction.
That is, the theorems are derived from the axioms using
deductive logical inference.
Theorem 1
Pr(Ā) = 1 − Pr(A)
Proof
Pr(S) = 1 (by axiom 1)
Pr(Ā ∪ A) = 1 (since Ā ∪ A = S)
Pr(Ā) + Pr(A) = 1 (by axiom 3, since Ā and A are mutually exclusive)
Pr(Ā) = 1 − Pr(A)
Theorem 1: example
What is the probability of not drawing a king?
\[
\Pr(\bar{K}) = 1 - \Pr(K) = 1 - \frac{1}{13} = \frac{12}{13}
\]
Theorem 2
Pr(∅) = 0
Proof
Let A = S (thus, Ā = ∅) and apply theorem 1: Pr(∅) = 1 − Pr(S) = 1 − 1 = 0.
Example
When I draw a card, I am certain to get some outcome, so the event "no outcome" has probability 0.
Theorem 3
Pr(A) ≤ 1, if A ⊂ S
Proof
Pr(S) = 1 (by axiom 1)
Pr(Ā ∪ A) = 1
Pr(Ā) + Pr(A) = 1 (by axiom 3)
Pr(A) = 1 − Pr(Ā) ≤ 1 (since Pr(Ā) ≥ 0 by axiom 2)
Example
The probability of a head must be 1 or less.
Theorem 4
If A ⊂ B, then Pr(A) ≤ Pr(B)
Proof
Since A ⊂ B, we can write B = A ∪ (Ā ∩ B).
A and (Ā ∩ B) are mutually exclusive.
Pr(B) = Pr(A) + Pr(Ā ∩ B) (by axiom 3)
Pr(B) ≥ Pr(A) (since Pr(Ā ∩ B) ≥ 0 by axiom 2)
Example
The probability of a red king (2/52) can be no greater than the probability
of a king (4/52).
Theorem 5
For any two events A and B (mutually exclusive or not),
Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
Proof
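Let C = A − (A ∩ B), the part of A that lies outside B.
Now B and C are mutually exclusive, and C ∪ B = A ∪ B.
C and (A ∩ B) are also mutually exclusive, and C ∪ (A ∩ B) = A, so
Pr(C) = Pr(A) − Pr(A ∩ B) (by axiom 3).
Pr(A ∪ B) = Pr(C ∪ B)
= Pr(C) + Pr(B) (by axiom 3)
= Pr(A) + Pr(B) − Pr(A ∩ B)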
Theorem 5: example
What is the probability of drawing a king or a heart?
\[
\Pr(K \cup H) = \Pr(K) + \Pr(H) - \Pr(K \cap H) = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52}
\]
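The king-or-heart calculation can be checked by counting cards directly. The sketch below assumes a standard 52-card deck with equally likely cards; the (rank, suit) encoding and the helper `pr` are choices made for this example. It also shows what goes wrong if the two probabilities are simply added.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(RANKS, SUITS))  # 52 equally likely cards (assumed)

def pr(event):
    """Probability of an event as (cards in the event) / (cards in the deck)."""
    return Fraction(len(event), len(deck))

kings = {card for card in deck if card[0] == "K"}
hearts = {card for card in deck if card[1] == "hearts"}

direct = pr(kings | hearts)                                # count the union itself
by_theorem = pr(kings) + pr(hearts) - pr(kings & hearts)   # theorem 5
naive = pr(kings) + pr(hearts)                             # counts the king of hearts twice

print(direct, by_theorem, naive)  # 4/13 4/13 17/52
```

`Fraction` reduces 16/52 to 4/13. The naive sum is too large by exactly Pr(K ∩ H) = 1/52.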
Theorem 6
For mutually exclusive events $A_1, \dots, A_n$,
\[
\Pr\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \Pr(A_i)
\]
Proof
Let $A_1, A_2, \dots$ be an infinite sequence of events in which
$A_1, \dots, A_n$ are the n given disjoint events and
$A_i = \emptyset$ for i > n.
Theorem 6: proof cont.
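\[
\begin{aligned}
\Pr\left(\bigcup_{i=1}^{n} A_i\right) &= \Pr\left(\bigcup_{i=1}^{\infty} A_i\right) \\
&= \sum_{i=1}^{\infty} \Pr(A_i) \quad \text{(by axiom 3)} \\
&= \sum_{i=1}^{n} \Pr(A_i) + \sum_{i=n+1}^{\infty} \Pr(A_i) \\
&= \sum_{i=1}^{n} \Pr(A_i) + 0 \\
&= \sum_{i=1}^{n} \Pr(A_i)
\end{aligned}
\]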
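The proof above applies axiom 3 to an infinite sequence of events, whereas axiom 3 as stated earlier in this lecture covers only two events. As a sketch of an alternative route that assumes nothing beyond the two-event form, the same result can be reached by induction on n. The symbol $B$ is introduced here for convenience and is not part of the original slides.

```latex
\textbf{Base case} ($n = 2$): $\Pr(A_1 \cup A_2) = \Pr(A_1) + \Pr(A_2)$ by axiom 3.

\textbf{Inductive step:} suppose the result holds for any $n - 1$ mutually
exclusive events, and let $B = \bigcup_{i=1}^{n-1} A_i$. Since
$A_n \cap A_i = \emptyset$ for every $i < n$, the events $B$ and $A_n$ are
mutually exclusive, so
\begin{align*}
\Pr\left(\bigcup_{i=1}^{n} A_i\right)
  &= \Pr(B \cup A_n) \\
  &= \Pr(B) + \Pr(A_n) && \text{(by axiom 3)} \\
  &= \sum_{i=1}^{n-1} \Pr(A_i) + \Pr(A_n) && \text{(by the inductive hypothesis)} \\
  &= \sum_{i=1}^{n} \Pr(A_i).
\end{align*}
```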
Theorem 6: example
What is the probability of a king or an ace or a jack?
\[
\Pr(K \cup A \cup J) = \Pr(K) + \Pr(A) + \Pr(J) = \frac{1}{13} + \frac{1}{13} + \frac{1}{13} = \frac{3}{13}
\]
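Theorem 6 only applies when the events are mutually exclusive, so a careful calculation checks that condition before adding. The sketch below is one way to do that for a standard 52-card deck with equally likely cards (an assumption); the function name `pr_disjoint_union` and the helper `by_rank` are invented for this example.

```python
from fractions import Fraction
from itertools import combinations, product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(RANKS, SUITS))  # 52 equally likely cards (assumed)

def pr(event):
    """Probability of an event as (cards in the event) / (cards in the deck)."""
    return Fraction(len(event), len(deck))

def pr_disjoint_union(events):
    """Add the probabilities of events, refusing if any two events overlap."""
    for a, b in combinations(events, 2):
        if not a.isdisjoint(b):
            raise ValueError("events are not mutually exclusive")
    return sum((pr(event) for event in events), Fraction(0))

def by_rank(rank):
    """All cards of the given rank."""
    return {card for card in deck if card[0] == rank}

kings, aces, jacks = by_rank("K"), by_rank("A"), by_rank("J")
total = pr_disjoint_union([kings, aces, jacks])
print(total, total == pr(kings | aces | jacks))  # 3/13 True
```

Passing overlapping events, such as kings and hearts, raises an error instead of silently double-counting; that case belongs to theorem 5.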
What did we learn?
• We learn probability to help us make inferences.
• In the frequentist view, probability is the percentage of the time
that something is expected to happen when the process is done over
and over again under the same conditions. In the subjective view, it
is an individual's degree of belief.
• There are three axioms of probability.
• From those axioms, we derived six theorems of probability.