Download Notes 3 - Wharton Statistics

Statistics 510: Notes 3
Reading: Sections 2.2-2.3, 2.7.
I. Sample Spaces and Events (Chapter 2.2)
Four key words for modeling uncertain phenomena:
experiment, sample outcome, sample space and event.
Experiment: Any procedure that (1) can be repeated,
theoretically, an infinite number of times; and (2) has a
well-defined set of possible outcomes.
Probability theory focuses on experiments whose outcome
is not predictable with certainty (random experiments).
Examples:
• Roll a pair of dice
• Measure a person’s blood pressure
• Observe the sex of a newborn child
• Observe the number of hurricanes in a year
Outcome: Possible outcome of an experiment.
Sample space: Set of all possible outcomes of an
experiment. Denoted by S.
Event: Any subset of the sample space. An event is said to
occur if the outcome of the experiment is one of the
members of the event.
Example 1: Experiment is the determination of the sex of a
newborn child. Then
S = {g, b}
where the outcome g means that the child is a girl and
b means the child is a boy.
Example 2: Consider the experiment of flipping a coin
three times. What is the sample space? Which sample
outcomes make up the event A: Majority of coins show
heads?
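The sample space in Example 2 is small enough to list by machine. The following is a minimal sketch, assuming Python and writing each outcome as a string such as "HTH"; the names sample_space and majority_heads are illustrative and not part of the notes.

```python
from itertools import product

# Sample space for three coin flips: all length-3 strings over {H, T}.
sample_space = {"".join(flips) for flips in product("HT", repeat=3)}

# Event A: the majority of the coins show heads (at least two of three).
majority_heads = {outcome for outcome in sample_space
                  if outcome.count("H") >= 2}

print(sorted(sample_space))
print(sorted(majority_heads))
```

Representing events as Python sets means the later set operations (intersection, union, complement) can be applied to them directly.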
Example 3: Consider the experiment of tossing a coin until
the first tail appears. What is the sample space of the
experiment?
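The sample space in Example 3 cannot be listed in full, since any number of heads may come before the first tail. A minimal sketch, assuming each outcome is written as a run of H's followed by a single T; the generator name first_tail_outcomes is hypothetical.

```python
from itertools import count, islice

def first_tail_outcomes():
    """Yield the outcomes T, HT, HHT, ... in order of increasing length."""
    for heads_before_tail in count(0):
        yield "H" * heads_before_tail + "T"

# Only a finite prefix of the (countably infinite) list can be printed.
first_five = list(islice(first_tail_outcomes(), 5))
print(first_five)
```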
Relations between events
Two types of combinations of events are useful. Let A and
B be any two events defined over the sample space S.
Intersection: The intersection of A and B, written $A \cap B$,
is the event whose outcomes belong to both A and B
(Note: Ross also denotes $A \cap B$ as $AB$).
Union: The union of A and B, written $A \cup B$, is the event
whose outcomes belong to either A or B or both.
Example 4: A single card is drawn from a deck of 52 cards.
Let A be the event that an ace is selected.
A = {ace of hearts, ace of diamonds, ace of clubs, ace of
spades}
Let B be the event “Heart is drawn.”
B = {2 of hearts, 3 of hearts, ..., ace of hearts}
What are $A \cap B$ and $A \cup B$?
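The question in Example 4 can be checked by building the deck explicitly. This is a sketch under the assumption that a card is modeled as a (rank, suit) pair; the variable names are illustrative only.

```python
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "jack", "queen", "king", "ace"]
suits = ["hearts", "diamonds", "clubs", "spades"]

# Sample space: all 52 (rank, suit) pairs.
deck = {(rank, suit) for rank in ranks for suit in suits}

A = {card for card in deck if card[0] == "ace"}      # an ace is selected
B = {card for card in deck if card[1] == "hearts"}   # a heart is drawn

A_and_B = A & B   # intersection: outcomes in both A and B
A_or_B = A | B    # union: outcomes in A or B or both

print(A_and_B)
print(len(A_or_B))
```

The intersection contains only the ace of hearts, and the union contains the 13 hearts together with the three other aces.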
We also define unions and intersections of more than two
events in a similar manner. If $E_1, E_2, \ldots, E_N$ are events, the
union of these events, denoted by $\bigcup_{n=1}^{N} E_n$, is defined to be
that event which consists of all outcomes that are in $E_n$ for
at least one n. Similarly, the intersection of the events
$E_1, E_2, \ldots, E_N$, denoted by $\bigcap_{n=1}^{N} E_n$, is defined to be the event
consisting of those outcomes that are in all of the events
$E_n$, $n = 1, \ldots, N$.
Mutually exclusive events: Two events A and B are said to
be mutually exclusive if they have no outcomes in common;
that is, $A \cap B = \emptyset$, where $\emptyset$ denotes the empty set.
Example 4 continued: Let C denote the event “Club is
drawn.”
Events B and C are mutually exclusive.
Complement: The complement of an event A, written $A^C$,
is the event consisting of all the outcomes in the sample
space S other than those contained in A.
Example 4 continued: $B^C$ = {2 of clubs, 3 of clubs, ..., ace of
clubs, 2 of spades, ..., ace of spades, 2 of diamonds, ..., ace
of diamonds}.
Manipulating events: The operations of forming unions,
intersections and complements of events obey certain rules
similar to the rules of algebra, e.g.,
Commutative law: $E \cup F = F \cup E$.
Associative law: $(E \cup F) \cup G = E \cup (F \cup G)$.
A graphical representation that is very useful for illustrating
logical relationships among events is the Venn diagram.
The sample space S is represented as consisting of all the
outcomes in a large rectangle, and the events E, F, G, ... are
represented as consisting of all the outcomes in given
circles within the rectangle. Events of interest can then be
indicated by shading appropriate regions of the diagram.
Example 5: For two events A and B , we will frequently
need to consider either
(a) the event that exactly one (of the two) occurs
(b) the event that at most one (of the two) occurs
Expressions for these events can be found easily from a
Venn diagram.
(a) $(A \cap B^C) \cup (B \cap A^C)$
(b) $(A \cap B)^C$
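The two expressions in Example 5 can be tried on a concrete case. The sketch below assumes an arbitrary sample space of the integers 1 to 10 and two arbitrary events; the helper complement is hypothetical and simply takes the set difference from S.

```python
S = set(range(1, 11))   # illustrative sample space
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(event, sample_space=S):
    """Outcomes of the sample space that are not in the event."""
    return sample_space - event

# (a) exactly one of A, B occurs
exactly_one = (A & complement(B)) | (B & complement(A))

# (b) at most one of A, B occurs (that is, not both)
at_most_one = complement(A & B)

print(sorted(exactly_one))
print(sorted(at_most_one))
```

In Python, expression (a) is also what the symmetric-difference operator A ^ B computes.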
De Morgan’s Laws:
(1) $(A \cup B)^C = A^C \cap B^C$
(2) $(A \cap B)^C = A^C \cup B^C$
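De Morgan’s laws can be verified exhaustively on a small sample space. This is only a finite check, not a proof; it is a sketch that assumes a four-element sample space and tests every pair of events. The function name all_events is hypothetical.

```python
from itertools import combinations

S = frozenset(range(4))   # illustrative four-outcome sample space

def all_events(sample_space):
    """Return every subset of the sample space as a frozenset."""
    items = sorted(sample_space)
    return [frozenset(subset)
            for size in range(len(items) + 1)
            for subset in combinations(items, size)]

events = all_events(S)

# Law (1): complement of a union equals intersection of complements.
law_1_holds = all(S - (A | B) == (S - A) & (S - B)
                  for A in events for B in events)

# Law (2): complement of an intersection equals union of complements.
law_2_holds = all(S - (A & B) == (S - A) | (S - B)
                  for A in events for B in events)

print(law_1_holds, law_2_holds)
```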
II. The Meaning of Probability and the Axioms of
Probability (Sections 2.3, 2.7)
A. The Frequency Interpretation of Probability
The relative frequency of an event is a proportion
measuring how often, or how frequently, the event occurs
in a sequence of experiments.
Example 1: Experiment: Toss a coin. Sample space is
S = {heads, tails}.
If the experiment is repeated many times, the relative
frequency of heads will usually be close to ½:
• The French naturalist Count Buffon (1707-1788)
tossed a coin 4040 times. Result: 2048 heads, or
relative frequency 2048/4040 = 0.5069 for heads.
• Around 1900, the English statistician Karl Pearson
heroically tossed a coin 24,000 times. Result: 12,012
heads, a relative frequency of 0.5005.
• While imprisoned by the Germans during World War
II, the South African mathematician John Kerrich tossed a
coin 10,000 times. Result: 5067 heads, a relative
frequency of 0.5067.
In the frequency interpretation of probability, the
probability of an event A is the expected relative frequency
of A in a large number of trials. In symbols, the proportion
of times A occurs in n trials, call it $P_n(A)$, is expected to
be roughly equal to the theoretical probability $P(A)$ if n is
large:
$P_n(A) \approx P(A)$ for large $n$.
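The statement that the relative frequency settles near P(A) for large n can be illustrated by simulation. The sketch below assumes a fair coin imitated by Python's pseudo-random number generator; the function name and the seed value are arbitrary choices, not part of the notes.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses fair coin tosses and return the share of heads."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

For small n the printed proportion can be far from 1/2; for large n it is typically close, which is the pattern seen in the Buffon, Pearson and Kerrich experiments.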
Example 2: Experiment: Observation of the sex of a child.
The sample space is S = {girl, boy}. The following table
shows the proportion of boys among live births to residents
of the U.S.A. over the past 20 years (Source: Information
Please Almanac).
Year    Number of births    Proportion of boys
1983    3,638,933           0.5126648
1984    3,669,141           0.5122425
1985    3,760,561           0.5126849
1986    3,756,547           0.5124035
1987    3,809,394           0.5121951
1988    3,909,510           0.5121931
1989    4,040,958           0.5121286
1990    4,158,212           0.5121179
1991    4,110,907           0.5112054
1992    4,065,014           0.5121992
1993    4,000,240           0.5121845
1994    3,952,767           0.5116894
1995    3,926,589           0.5084196
1996    3,891,494           0.5114951
1997    3,880,894           0.5116337
1998    3,941,553           0.5115255
1999    3,959,417           0.5119072
2000    4,058,814           0.5117182
2001    4,025,933           0.5111665
2002    4,021,726           0.5117154
The relative frequency of boys among newborn children in
the U.S.A. appears to be stable at around 0.512. This
suggests that a reasonable model for the outcome of a
single birth is P(boy) = 0.512 and P(girl) = 0.488.
This model for births is equivalent to the sex of a child
being determined by drawing at random with replacement
from a box of 1000 tickets, containing 512 tickets marked
boy and 488 tickets marked girl.
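The claim that the proportion of boys is stable at around 0.512 can be checked against the table of births. The following is a sketch that assumes the table values as transcribed in these notes and pools the years by weighting each proportion by its number of births; the variable names are illustrative.

```python
# (year, number of births, proportion of boys), copied from the table.
births_table = [
    (1983, 3_638_933, 0.5126648), (1984, 3_669_141, 0.5122425),
    (1985, 3_760_561, 0.5126849), (1986, 3_756_547, 0.5124035),
    (1987, 3_809_394, 0.5121951), (1988, 3_909_510, 0.5121931),
    (1989, 4_040_958, 0.5121286), (1990, 4_158_212, 0.5121179),
    (1991, 4_110_907, 0.5112054), (1992, 4_065_014, 0.5121992),
    (1993, 4_000_240, 0.5121845), (1994, 3_952_767, 0.5116894),
    (1995, 3_926_589, 0.5084196), (1996, 3_891_494, 0.5114951),
    (1997, 3_880_894, 0.5116337), (1998, 3_941_553, 0.5115255),
    (1999, 3_959_417, 0.5119072), (2000, 4_058_814, 0.5117182),
    (2001, 4_025_933, 0.5111665), (2002, 4_021_726, 0.5117154),
]

total_births = sum(births for _, births, _ in births_table)

# Pooled relative frequency: each year weighted by its number of births.
pooled_proportion = sum(births * share
                        for _, births, share in births_table) / total_births

lowest = min(share for _, _, share in births_table)
highest = max(share for _, _, share in births_table)

print(round(pooled_proportion, 4), lowest, highest)
```

The yearly proportions all lie within about half a percentage point of one another, which is what motivates using a single number for P(boy).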
B. The Axioms of Probability
The frequency interpretation of probability is the way that
many scientists think about what probability represents but
it is hard to make it into a rigorous mathematical definition
of probability.
Kolmogorov (1933) developed an axiomatic definition of
probability which he then showed can be interpreted, in a
certain sense, as the limit of the relative frequency in a
large number of experiments.
A probability function (measure) on the events in a sample
space is a function on the events, $P(E)$, that satisfies the
following three axioms:
Axiom 1: $0 \le P(E) \le 1$ for all events E.
Axiom 2: $P(S) = 1$, where S is the sample space.
Axiom 3: For any countable sequence of mutually
exclusive events $E_1, E_2, \ldots$ (that is, events for which
$E_i \cap E_j = \emptyset$ when $i \ne j$),
$P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)$.
We refer to $P(E)$ as the probability of an event E.
Using these axioms, we shall be able to prove that if an
experiment is repeated over and over again, then with
probability 1, the proportion of times that a specific event
E occurs converges to $P(E)$, which is essentially the
frequency interpretation of probability. This is called the
strong law of large numbers and we shall prove it in
Chapter 8.
Consequences of axioms:
1. $P(\emptyset) = 0$.
Proof: Consider the sequence of events $E_1, E_2, \ldots$, where
$E_1 = S$ and $E_i = \emptyset$ for $i > 1$. Then, as the events are
mutually exclusive and as $S = \bigcup_{i=1}^{\infty} E_i$, we have from Axiom
3 that
$P(S) = \sum_{i=1}^{\infty} P(E_i) = P(S) + \sum_{i=2}^{\infty} P(\emptyset)$,
implying that $P(\emptyset) = 0$.
2. For any finite sequence of mutually exclusive events
$E_1, \ldots, E_n$,
$P\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i)$.
Proof: Let $E_i = \emptyset$ for $i > n$. The result follows from
Axiom 3 combined with the fact established above that
$P(\emptyset) = 0$.
Examples of probability functions
Example 3: If a die is rolled and we suppose that all six
sides are equally likely to appear, then we would have
$P(\{1\}) = P(\{2\}) = P(\{3\}) = P(\{4\}) = P(\{5\}) = P(\{6\}) = \frac{1}{6}$.
The probability of rolling an even number would equal,
from Axiom 3,
$P(\{2, 4, 6\}) = P(\{2\}) + P(\{4\}) + P(\{6\}) = \frac{1}{2}$.
Example 4: A die is loaded in such a way that the
probability of any particular face’s showing is directly
proportional to the number on that face. What is the
probability that an even number appears?
To solve this requires that we make use of Axiom 2 that
P( S )  1 . The experiment – tossing a die – generates a
sample space containing six outcomes. But the six are not
equally likely: by assumption,
$P(\text{``$i$'' face appears}) = P(i) = ki, \quad i = 1, \ldots, 6$,
where k is a constant. From Axiom 2,
$\sum_{i=1}^{6} P(\text{``$i$'' face appears}) = \sum_{i=1}^{6} ki = \frac{6(6+1)}{2} k = 21k = 1$,
which implies that $k = 1/21$ and $P(\text{``$i$'' face appears}) = i/21$.
It follows then from Axiom 3 that the probability that an
even number appears is
$P(\text{even number}) = P(2) + P(4) + P(6) = \frac{2}{21} + \frac{4}{21} + \frac{6}{21} = \frac{12}{21}$.
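The fair die of Example 3 and the loaded die of Example 4 differ only in the weights given to the faces, so both can be handled by one small routine. This is a sketch using exact fractions; the function names die_probabilities and probability are hypothetical, introduced here for illustration.

```python
from fractions import Fraction

def die_probabilities(weights):
    """Scale nonnegative face weights so the probabilities sum to 1 (Axiom 2)."""
    total = sum(weights.values())
    return {face: Fraction(weight, total) for face, weight in weights.items()}

def probability(event, p):
    """Add the probabilities of the outcomes in the event (finite additivity)."""
    return sum((p[face] for face in event), Fraction(0))

fair = die_probabilities({face: 1 for face in range(1, 7)})       # Example 3
loaded = die_probabilities({face: face for face in range(1, 7)})  # Example 4

even = {2, 4, 6}
print(probability(even, fair))     # 1/2
print(probability(even, loaded))   # 4/7, which is 12/21 in lowest terms
```

For the loaded die the normalizing total is 1 + 2 + ... + 6 = 21, which matches the constant k = 1/21 found above.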
C. Probability as a Measure of Belief (Section 2.7)
Another interpretation of probability, besides the frequency
interpretation, is that probability measures an individual’s
belief in the statement that he or she is making. This is
called subjective or personal probability. Consider the
question,
“What is the probability that the Philadelphia Eagles will
win the Super Bowl this year?”
It is hard to interpret such a probability using the frequency
interpretation because the football season can only be
played once. The subjective interpretation of a statement
that the Eagles have a probability of 0.1 of winning the
Super Bowl is that:
• If the person making the statement were offered a
chance to play a game in which the person was
required to pay less than 10 cents to buy into the game
and would win $1 if the Eagles win the Super Bowl,
then the person would buy into the game.
• By contrast, if the person making the statement were
offered a chance to play a game in which the person
was required to pay more than 10 cents to buy into the
game and would win $1 if the Eagles win the Super
Bowl, then the person would not buy into the game.
More generally, if E is an event, a person’s subjective
probability P(E) has the following interpretation: For a
game in which the person will be paid $1 if E occurs,
P(E) is the largest amount of money (in dollars) the person
would be willing to pay to buy into the game. Thus, if the
person is willing to pay up to 50 cents to buy in, P(E) = 0.5.
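The buy-in rule can be written as a comparison between the price of the game and the person's own probability. The sketch below assumes the person judges the game by its expected gain under that probability; this assumption, and the function names, are illustrative and not part of the notes.

```python
def expected_gain(price, subjective_probability, payoff=1.0):
    """Expected gain (in dollars) from paying `price` for a game that
    pays `payoff` if the event occurs, under the person's own probability."""
    return subjective_probability * payoff - price

def would_buy_in(price, subjective_probability, payoff=1.0):
    """The person buys in only when the expected gain is positive."""
    return expected_gain(price, subjective_probability, payoff) > 0

# Eagles example: subjective probability 0.1 of winning the Super Bowl.
print(would_buy_in(0.05, 0.1))   # price below 10 cents
print(would_buy_in(0.15, 0.1))   # price above 10 cents
```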
Note that this concept of probability is personal: P(E) may
vary from person to person depending on their opinions.
A rational person has a “coherent” system of personal
probabilities: a system is said to be “incoherent” if there
exists some structure of bets such that the bettor will lose
no matter what happens. It can be shown that a coherent
system of personal probabilities requires that the personal
probabilities satisfy Axioms 1, 2 and 3 (for details on this,
see Hogg, McKean and Craig, Introduction to
Mathematical Statistics, Chapter 11.1).
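One simple illustration of incoherence, not taken from the notes, is a person whose prices for a $1 ticket on E and a $1 ticket on the complement of E add up to more than $1. The sketch assumes the person is willing to buy each ticket at exactly the stated price; the function name is hypothetical.

```python
from fractions import Fraction

def net_gain_from_buying_both(price_E, price_not_E, payoff=Fraction(1)):
    """Net gain in each outcome when the person buys one ticket that pays
    `payoff` if E occurs and one that pays `payoff` if E does not occur."""
    cost = price_E + price_not_E
    # Exactly one of the two tickets pays off, whatever happens.
    return {"E occurs": payoff - cost, "E does not occur": payoff - cost}

# Incoherent prices: 0.6 for E and 0.6 for not-E (sum 1.2).
incoherent = net_gain_from_buying_both(Fraction(6, 10), Fraction(6, 10))

# Prices consistent with the axioms: 0.6 for E and 0.4 for not-E (sum 1).
coherent = net_gain_from_buying_both(Fraction(6, 10), Fraction(4, 10))

print(incoherent)
print(coherent)
```

With the first pair of prices the person loses 20 cents whether or not E occurs, which is the kind of sure loss that coherence rules out.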
Thus, whether the probability function is interpreted as a
measure of belief or as a long-run relative frequency, its
mathematical properties (i.e., that it satisfies Axioms 1, 2
and 3 and their consequences) remain unchanged. All
results in this course are equally applicable to both the
frequency and subjective interpretations of probability.