Probability & Statistics
Module 1, Lecture 1
Michael Partensky
Topics of Discussion (1)
(1) The Laws of Chance… Are they possible?
Gambling – it’s serious; On the shoulders of Giants
(2) Some applications of probability theory
Nature, games, information, economy
(3) Are we wired to “do probability”?
Paradoxes, misconceptions, tricky questions
(4) Collecting the data
Random experiments, sample spaces, events, operations on sample sets
(5) The first definition of probability (P)
Axioms and properties of P, Frequency interpretation of P
(6) Random variables and distribution function
Discrete and continuous variables. Examples. Coins and spinners.
(7) First introduction to Mathematica
Topic 1: Are the laws of chance possible?
Scientific investigation of chance began with
the analysis of games of chance.
A French nobleman, the Chevalier de Méré, was a gambler who
asked his friend, the mathematician Pascal, to explain why some
gambling bets made money over time while others consistently
lost. Pascal wrote to his colleague Fermat, and in the
ensuing correspondence the foundation of probability was laid.
Probability initially wasn't considered a serious branch of
mathematics because of its involvement in gambling.
On the shoulders of Giants
Pierre de Fermat, French mathematician (1601-1665)
Blaise Pascal, French mathematician and philosopher (1623-1662)
Christiaan Huygens, Dutch astronomer, mathematician, physicist (1629-1695)
Jacob Bernoulli, Swiss mathematician (1654-1705)
Their analysis gave birth to the
calculus of probabilities (CP).
The central fact of CP: if a coin is tossed a large
number of times, the proportion of heads (or tails)
becomes close to 50%.
In other words, while the result of tossing one coin
is completely uncertain, a long series of tosses
produces an almost certain result.
The transition from uncertainty to near certainty is an
essential theme in the study of chance.
Not only from their achievements, but also from their errors …
D'Alembert argued that if two coins are tossed,
there are three possible cases, namely:
(1) both heads,
(2) both tails,
(3) a head and a tail.
So he concluded that the probability of "a head
and a tail" is 1/3.
Was he right?
If he had figured that this probability has
something to do with the experimental frequency
of the occurrence of the event, he might have
changed his mind after tossing two coins more than
a few times. Apparently, he never did so. Why? We
do not know.
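D'Alembert's question can be checked by simply listing the outcomes. The Python sketch below is only an illustration, under the assumptions that the coin is fair and that the two coins can be told apart (so HT and TH count as different outcomes); the variable names are ours.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes for two distinguishable fair coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# "A head and a tail" covers two of the four outcomes: HT and TH.
mixed = [o for o in outcomes if set(o) == {"H", "T"}]
p_mixed = Fraction(len(mixed), len(outcomes))
print(p_mixed)  # 1/2
```

D'Alembert's three cases are not equally likely: under these assumptions "a head and a tail" occurs twice as often as "both heads".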
Topic 2: Some applications of CP
Statistical physics including modeling of biological systems
Game theory
Decision making in business and economy
Information theory
Bioinformatics…
You name it!
Topic 3: Are we “wired to do probability”?
“Our brains are just not wired to do probability problems very well”
(Persi Diaconis, Harvard Univ.)
A famous statistician would never travel by airplane, because she
knew that the probability of there being a bomb on any given flight was
1 in a million, and she was not prepared to accept these odds.
One day a colleague met her at a conference far from home.
"How did you get here, by train?"
"No, I flew."
"What about the possibility of a bomb?"
"Well, I began thinking that if the odds of one bomb are 1 in a million,
then the odds of TWO bombs are (1/1,000,000) x (1/1,000,000) =
$10^{-12}$. This is a very, very small probability, which I can accept. So,
now I bring my own bomb along!"
Was she right?
Here is a cute brain-stretcher.
Adventures on the Moscow Subway*
Boris commutes to college by the Moscow subway which runs in a circle. The
school happens to be at the point on the circle that is exactly opposite to
where Boris boards the train. The trains run in both directions, and the
schedule is very regular.
The time intervals for trains in both directions are equal; for instance, if there
is an hour between the arrival of clockwise (cw) trains, there is also an hour
between the arrival of counterclockwise trains. Boris observed, however, that
he caught the cw trains much more often than the ccw trains,
despite the fact that his schedule was irregular and he arrived
at the station at random times.
Can you explain this?
[Figure: the circular subway line, with the station (T) and the School at opposite points]
*From “The Chicken from Minsk” by Y.B. Chernyak and R.M. Rose
Monty Hall problem
Paul Erdős, one of the best mathematicians of the 20th century,
allegedly never accepted the MH solution.
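When intuition fails, a simulation can help. The sketch below assumes the standard rules, which are not spelled out on the slide: three doors, one car, the host knows where the car is and always opens a goat door that the player did not pick. The function names and the number of trials are our own choices.

```python
import random

def monty_hall_trial(switch, rng):
    """Play one game; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of games won when always switching (or always staying)."""
    rng = random.Random(seed)
    wins = sum(monty_hall_trial(switch, rng) for _ in range(trials))
    return wins / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Under these assumptions the "stay" frequency should settle near 1/3 and the "switch" frequency near 2/3.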
Two envelopes problem
The setup: You are given two indistinguishable
envelopes, each containing a sum of money. One
envelope contains twice as much as the other. Let us say the
amounts are A and 2A. You pick one envelope at random
and open it. Then you're offered the possibility to take
the other envelope instead. At this point you do not know
if your envelope contains A or 2A.
Should you swap the envelopes?
Hint:
Assume that you can repeat this experiment many times, each time with a new set
of envelopes. What would be the average gain from switching?
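The hint can be tried out numerically. The following sketch is one possible reading of it: the smaller amount A is drawn afresh for every pair of envelopes from an assumed distribution (whole numbers from 1 to 100, our choice), and we compare the average payoff of always keeping with that of always switching.

```python
import random

def average_payoffs(trials=200_000, seed=1):
    """Return (average when keeping, average when switching)."""
    rng = random.Random(seed)
    keep_total = 0
    switch_total = 0
    for _ in range(trials):
        a = rng.randint(1, 100)      # assumed distribution of the smaller amount A
        envelopes = [a, 2 * a]
        rng.shuffle(envelopes)       # picking one envelope at random
        mine, other = envelopes
        keep_total += mine
        switch_total += other
    return keep_total / trials, switch_total / trials

keep_avg, switch_avg = average_payoffs()
print(keep_avg, switch_avg)
```

With these assumptions both averages should come out close to each other, so blind switching brings no average gain.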
The second daughter
Consider two problems
(a) You make a new friend and you ask if she has any
children. Yes, she says, two. Any girls? Yes, she
says. What is the probability that both are girls?
(b) You make a new friend and you ask if she has any children. Yes,
she says, two. Any girls? Yes. Next day you meet her with a
young girl. “Is this your daughter?” Yes, she says.
What’s the probability that both of her children are girls?
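Both versions can be settled by counting. The sketch below assumes that boys and girls are equally likely and, for the second version, that the child you happen to meet is either of the two children with equal chance. These assumptions are ours; change them and the answer may change.

```python
from fractions import Fraction
from itertools import product

families = list(product("GB", repeat=2))   # (older, younger), equally likely

# (a) We only learn that at least one child is a girl.
with_girl = [f for f in families if "G" in f]
p_a = Fraction(sum(f == ("G", "G") for f in with_girl), len(with_girl))

# (b) We meet one child (index 0 or 1, equally likely) and she is a girl.
sightings = [(f, i) for f in families for i in (0, 1)]
girl_seen = [(f, i) for f, i in sightings if f[i] == "G"]
p_b = Fraction(sum(f == ("G", "G") for f, _ in girl_seen), len(girl_seen))

print(p_a, p_b)  # 1/3 1/2
```

The two questions sound alike, but they condition on different events, which is why the counts differ.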
Topic 4: Collecting the data
Random experiments
The term "random experiment" is used to describe any action whose outcome
is not known in advance. Here are some examples of experiments dealing
with statistical data:
o Tossing a coin
o Counting how many times a certain word or combination of
words appears in the text of “King Lear” or in a text of Confucius
o Counting how many Humvees crossed the Washington Bridge
between 12:00 and 12:30 p.m.
o Counting occurrences of a certain combination of amino acids in a
protein database
o Pulling a card from the deck
Sample spaces, sample sets and events
The sample space of a random experiment is a set $\Omega$ that includes all possible
outcomes of the experiment. For example, if the experiment is to throw a die and
record the outcome, the sample space is
$\Omega = \{1, 2, 3, 4, 5, 6\}$,
the set of possible outcomes. $\Omega$ describes an event that always occurs.
Each outcome is represented by a sample point in the sample space.
There is more than one way to view an experiment, so an experiment may have more
than one associated sample space.
For example, suppose you draw one card from a deck. Here are some sample spaces.
Sample space 1 (the most common). The space consists of 52 outcomes, 1 for each card.
Sample space 2. The space consists of just two outcomes, black and red.
Sample space 3. This space consists of 13 outcomes, namely 2,3,4,… 10,J,Q,K,A.
Sample space 4. This space consists of two outcomes, picture and non-picture.
In tossing a die, one sample space is {1,2,3,4,5,6}, while two others are {odd, even}
and {less than 3.5, more than 3.5}.
The examples above describe discrete sample spaces.
Intuitively, you all understand what this means, although a
precise mathematical definition would be cumbersome.
We will discuss continuous sample spaces a little later.
Examples of those are given by various physical
variables, such as weight, height, concentration, etc.
Try it:
Race between seven horses, 1,2,3,4,5,6,7.
• Experiment 1: determining the winner.
• Experiment 2: determining the order of finish.
Describe the sample spaces.
Events
Certain subsets of the sample space of an experiment are referred to as events. An
event is a set of outcomes of the experiment. This includes the null (empty) set of
outcomes and the set of all outcomes. Each time the experiment is run, a given event
A either occurs, if the outcome of the experiment is an element of A, or does not
occur, if the outcome of the experiment is not an element of A.
What to consider an event is decided by the experimentalist!
In the sample space of 52 cards the following events can be considered,
among others: A = “drawing a king”, B = “drawing Q or A”, C = “drawing a non-picture”, D = “drawing a black”.
For example, event D consists of 26 points.
D is also an event in sample space 2 (where it consists of 1 point).
It is not an event in sample spaces 3 and 4 (the sample spaces are listed again below).
Similarly, A (“king”) is an event in sample spaces 1 (4 points) and 3 (1 point) but not in 2 and 4.
Sample space 1 The space consists of 52 outcomes, 1 for each card.
Sample space 2 The space consists of just two outcomes, black and red,
Sample space 3 This space consists of 13 outcomes, namely 2,3,4,… 10,J,Q,K,A.
Sample space 4 This space consists of two outcomes, picture and non-picture.
The examples of sample spaces and events
Example 1.1
Flip two coins. Try to figure out what the sample space for this experiment is. How
many simple events does it contain?
The answer can be described by the following table (rows: coin 1, columns: coin 2):
1\2   H    T
H     HH   HT
T     TH   TT
Example 1.2
Roll two dice. For convenience we assume that one is red and the other is green
(we often use this trick, which implies that we can tell which of the two objects
is which). The following table describes $\Omega$:
      1    2    3    4    5    6
1     11   21   31   41   51   61
2     12   22   32   42   52   62
3     13   23   33   43   53   63
4     14   24   34   44   54   64
5     15   25   35   45   55   65
6     16   26   36   46   56   66
Example 1.2 (continued). Any event corresponds to some collection of the
cells of this table. We show here four different events (colored blue, red, green
and gray). Try to describe them in plain English.
[Table: the 6 × 6 table of two-dice outcomes from Example 1.2, with four groups of cells colored blue, red, green and gray]
Example 1.2 (continued). And what about the following events?
Which of these events (red, orange, purple, etc.) occurs more often?
[Table: the 6 × 6 table of two-dice outcomes from Example 1.2, with groups of cells colored red, orange, purple, etc.]
Example 1.3 An experiment consists of drawing two balls from the box of
balls numbered from 1 to 5. Describe the sample space if
(a) The first ball is not replaced before the second is drawn.
(b) The first ball is replaced before the second is drawn.
Example 1.4 Flip three coins. Show the sample space. What is the total
number of all possible outcomes?
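One way to check your answers to Examples 1.3 and 1.4 is to let a computer list the outcomes. The sketch below assumes that the order of the two draws is recorded (the pair (1,2) differs from (2,1)); the variable names are ours.

```python
from itertools import permutations, product

balls = range(1, 6)

# Example 1.3 (a): without replacement, ordered pairs of two different balls.
without_replacement = list(permutations(balls, 2))
# Example 1.3 (b): with replacement, the same ball may be drawn twice.
with_replacement = list(product(balls, repeat=2))

# Example 1.4: three coin flips, written as strings such as "HTH".
three_coins = ["".join(t) for t in product("HT", repeat=3)]

print(len(without_replacement), len(with_replacement), len(three_coins))  # 20 25 8
```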
Composite events
Quite often we are dealing with composite events. Example: we
study a group of students, picking them at random and considering
the following events: A = “A student is female”, B = “A student is
male”, C = “A student has blue eyes”, D = “A student was born in
California”.
After this information was collected and the probabilities of A, B, C
and D were determined, we decided to find the probabilities of some
other events:
U = “Student is female with blue eyes”, V = “Student is male, or
has blue eyes and was not born in California”, etc.
The latter questions deal with events that are composed
from the “atomic” events A, B, C and D. To describe them, it is
convenient to use the language of set theory.
Warning: please, don’t be scared. We are not going to be too
theoretical. The new symbols will be introduced for our
convenience.
Operations on sample sets
Inclusion: $A \subset B$ (A is a subset of B)
“Occurrence of A implies the occurrence of B” or “A is a subset of B”
(1) Example: B = {2,3,5,6}, A = {2,6}
(2) “Boys” is a subset of “Males”
[Figure: Venn diagram of A inside B]
Intersection: $A \cap B$ is the set of outcomes that belong to both A and B *
Examples:
(1) B = {2,3,4,6}, A = {1,2,5,6}, $A \cap B$ = {2,6};
(2) A = “Female students”, B = “Students having blue eyes”,
$A \cap B$ = “Female students with blue eyes”.
[Figure: Venn diagram of overlapping A and B]
* Usually, instead of $A \cap B$, the multiplication sign is used: $A \cdot B$ or simply AB.
The empty set $\emptyset$ is the event with no
outcomes. The events are disjoint if
they do not have outcomes in common:
$A \cap B = \emptyset$
Example: A = {1,3,4}, B = {2,6}
The union of events A and B, $A \cup B$, is the
event that A or B or both occur. Example: A
= “Male student”, B = “Having blue eyes”,
$A \cup B$ = “Students who are either male or have
blue eyes (or both)”
Complementary (opposite) events
$A^c$ is the complement of A if $A \cup A^c = \Omega$
and $A \cap A^c = \emptyset$
[Figure: Venn diagrams of disjoint events A and B, of the union $A \cup B$, and of A with its complement $A^c$ inside $\Omega$]
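These operations map directly onto Python's built-in sets. The sketch below reuses the sets from the examples above; the sample space of one die is assumed only so that the complement has something to be taken in.

```python
omega = {1, 2, 3, 4, 5, 6}        # assumed sample space: one throw of a die
A = {1, 2, 5, 6}
B = {2, 3, 4, 6}

is_subset = {2, 6} <= B           # inclusion: {2, 6} is a subset of B
intersection = A & B              # outcomes in both A and B
union = A | B                     # outcomes in A or B or both
disjoint = {1, 3, 4}.isdisjoint({2, 6})   # no outcomes in common
A_c = omega - A                   # complement of A within omega

print(is_subset, intersection, union, disjoint, A_c)
# The two defining properties of the complement:
print(A | A_c == omega, A & A_c == set())
```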
Try it:
Let A be the event that the person is male, B that the
person is under 30, and C that the person speaks French.
Describe in symbols:
• A male over 30;
• Female who is under 30 and speaks French
• A male who either is under 30 or speaks French
Topic 5: Probability, its definition, axioms
and properties
The first definition of probability
We have introduced some important concepts:
• Experiments,
• Outcomes,
• Sample space.
These concepts will come to life after we introduce
another crucial concept: probability, which is a
way of measuring chance.
A probability is a way of assigning numbers to
events that satisfies the following conditions, or axioms.
5.1 Axioms and properties of probability
The Axioms of Probability
(1) For any event A,
$0 \le P(A) \le 1$    (1.1)
where 0 corresponds to an impossible event and 1 to a sure event.
(2) If $\Omega$ is the sample space, then
$P(\Omega) = 1$    (1.2)
(3) For a sequence of disjoint (incompatible) events $A_i$ (finite
or infinite),
$P(\bigcup_i A_i) = \sum_i P(A_i)$    (1.3)
In other words, the probability of a union of disjoint events equals the
sum of the individual probabilities. For 2 incompatible events (1.3) gives
$P(A \cup B) = P(A) + P(B)$    (1.3')
We will introduce here another important property, although it is not an
axiom and will be justified later in terms of the conditional probability.
(4) If A and B are independent, then
$P(A \cap B) = P(A) P(B)$    (1.4)
More generally, for any number N of independent events,
$P(\bigcap_j A_j) = \prod_j P(A_j)$, $j = 1, 2, \ldots, N$    (1.5)
Some other properties of probability. Deduction
from the axioms 1-3:
(a) Given that $A \cup A^c = \Omega$ and $A \cap A^c = \emptyset$, we find from (3):
$P(A) = 1 - P(A^c)$    (1.6)
(b) If $A \subset B$, then $P(A) \le P(B)$    (1.7)
(c) $P(A \cup B) = P(A) + P(B) - P(AB)$    (1.8)
Try to prove (a) and (b). Property (c) is harder to prove
formally, although intuitively it is quite clear. Summing up the
areas of A and B, one counts their intersection twice. The
probability is kind of proportional to the area. Therefore, one of
the occurrences of AB should be removed.
[Figure: Venn diagram of overlapping A and B with intersection AB]
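Properties (1.6)-(1.8) can be verified by counting on a small sample space. The sketch below uses two dice with 36 equally likely outcomes; the events A and B are made up for the illustration and are not part of the lecture.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))   # two dice, 36 equally likely outcomes

def P(event):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(len(event), len(omega))

A = {o for o in omega if o[0] == 6}           # hypothetical event: first die shows 6
B = {o for o in omega if sum(o) >= 10}        # hypothetical event: sum is at least 10
A_c = omega - A

assert P(A) == 1 - P(A_c)                     # (1.6)
assert P(A & B) <= P(B)                       # (1.7), since A∩B is a subset of B
assert P(A | B) == P(A) + P(B) - P(A & B)     # (1.8)
print(P(A), P(B), P(A & B), P(A | B))         # 1/6 1/6 1/12 1/4
```

A check on one example is of course not a proof, but it shows what each formula claims.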
We introduced two new concepts:
incompatible (disjoint) and independent events
• Two events are said to be incompatible or disjoint if they cannot occur
together.
• Two events are said to be independent if they have “nothing to do” with each
other.
Examples:
• (Disjoint) Jack can arrive in Boston either on Monday (“M”) or on Wednesday
(“W”). These events cannot occur together (he cannot arrive on Monday and on
Wednesday). Therefore the events are disjoint. If P(M)=0.72 and P(W)=0.2, then
P(M or W)=0.92 (there is still a chance that Jack won’t come to Boston at all).
• (Independent) A = “It will be sunny today”, B = “The Celtics will win the game
tomorrow”. What is the probability that both events will occur?
P(A and B) = P(A) P(B).
Please, offer some more examples of both kinds.
You can find a very useful and provocative discussion of these concepts and of
probability in general in the book “Chance and Chaos” by David Ruelle
(Princeton Sci. Lib.)
Comment
In many cases it is easier to find the probability of the complementary event $A^c$
than of A itself. In such cases, the rule (1.6) can be very helpful.
Example: Toss two dice. Find the probability that they show different
values, for example (4,6) and (2,3) but not (2,2).
You can count favorable outcomes directly or, better still, use (1.6):
P(non-matching dice) = 1 - P(matching dice) = 1 - 6/36 = 5/6.
We will be using this trick quite often.
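Both routes to the answer can be compared by enumeration. This is a small sketch assuming two fair, distinguishable dice; the variable names are ours.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))       # 36 equally likely outcomes
matching = [r for r in rolls if r[0] == r[1]]
different = [r for r in rolls if r[0] != r[1]]

p_direct = Fraction(len(different), len(rolls))            # count favorable outcomes
p_complement = 1 - Fraction(len(matching), len(rolls))     # complement rule
print(p_direct, p_complement)  # 5/6 5/6
```

Counting the 6 matching pairs is clearly less work than counting the 30 non-matching ones.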
Example: the classical “birthday problem”.
Find the probability that in a group of n people at least two will
have the same birthday.
You will solve a version of this problem at home.
Let’s discuss here a simpler problem.
N people arrive at a vacation place during the same week. Each one randomly
chooses his/her arrival day. What is the probability that at least two people
arrive on the same day?
Let’s discuss it together (working in groups).
What is the complementary event?
• Consider first N=2,
• N=8.
• Now let us check N=3.
• The general case
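After working through the cases by hand, the general case can be checked with a few lines of Python. The sketch below uses the complementary event "all arrival days are different" and assumes that every day is equally likely and that people choose independently; the function name `p_shared_day` and its `days` parameter are our own.

```python
from fractions import Fraction

def p_shared_day(n, days=7):
    """P(at least two of n people pick the same day), all days equally likely."""
    p_all_different = Fraction(1)
    for k in range(n):
        # Person number k+1 must avoid the k days already taken.
        p_all_different *= Fraction(days - k, days)
    return 1 - p_all_different

for n in (2, 3, 8):
    print(n, p_shared_day(n))
```

Calling the same function with `days=365` gives the classical birthday problem under the same assumptions.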
5.2 Two definitions of probability
Combinatorial definition
If a random experiment has N equally possible
outcomes, of which $N_A$ outcomes correspond to an event
A, then
$P(A) = \frac{N_A}{N}$
Frequency definition (empirical)
If we repeat an experiment a large number of times, then
the fraction of times the event A occurs will be close to P(A).
In other words, if N(A,n) is the number of times that event A occurs
in the first n trials, then
$P(A) = \lim_{n \to \infty} \frac{N(A, n)}{n}$    (1.9)
It's easy to prove that, defined this way, P(A)
satisfies conditions (1) and (2). Try it!
Hint: the property (3) follows from
$N(\bigcup_i A_i, n) = \sum_i N(A_i, n)$,
which holds for disjoint events $A_i$.
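The frequency definition is easy to watch in action. The sketch below simulates tosses of a fair coin with Python's pseudo-random generator (an assumption standing in for a real coin) and prints the fraction of heads N(A,n)/n for growing n.

```python
import random

def head_frequency(n, seed=0):
    """Fraction of heads in n simulated tosses of a fair coin."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

for n in (10, 1_000, 100_000):
    print(n, head_frequency(n))
```

For small n the fraction can be far from 1/2; for large n it should settle close to it.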
Random Variables and the Distribution Function.
The simplest experiments are flipping coins and throwing dice.
Before we can say anything about the probabilities of their various
outcomes (such as "Getting an even number" on the die, or "Getting
3 heads in 5 consecutive experiments with a coin") we need to
make a reasonable guess about the probabilities of the elementary
events (getting H or T for a coin, or one of six faces for a die). For
instance, we can assume (as we usually do) that a die is perfectly
balanced and all faces are equivalent. Then the probability of any
number equals 1/6.
As we will see, this assignment can be described in terms of a distribution
function (so called because it distributes probabilities between different outcomes).
The term “function”, however, implies some arguments (or
variables). It would be inconvenient to use a “function of faces”, or a
“function of Heads or Tails”. That’s why we will introduce a general
term for describing various random outcomes.
Random Variables
We now introduce a new term.
Instead of saying that the possible outcomes are 1,2,3,4,5
or 6, we say that the random variable X can take values
{1,2,3,4,5,6}.
A random variable is an expression whose value is
the outcome of a particular experiment.
Random variables can be either discrete or continuous.
It’s a convention to use upper case letters (X,Y) for the
names of random variables and lower case letters (x,y)
for their possible particular values.
Examples of random variables
Discrete: (you name it!)
Continuous:
For instance, weights or heights of people chosen
randomly, amount of water or electricity used during a day,
speed of cars passing an intersection.
Please, add a few more examples.
The Probability Function for discrete
random variables
We assigned a probability 1/6 to each face of the die. In the same
manner, we should assign a probability 1/2 to each side of a coin.
What we did could be described as distributing the values of probability
between different elementary events:
$P(X = x_k) = p(x_k)$, k = 1, 2, …    (1.10)
It is convenient to introduce the probability function p(x):
$P(X = x) = p(x)$    (1.11)
In other words, the probability of a random variable X taking a particular
value x is called the probability function.
For $x = x_k$ (1.11) reduces to (1.10), while for other values of x, p(x) = 0.
The probability function should satisfy the following conditions:
$p(x) \ge 0$    (1.12)
$\sum_i p(x_i) = 1$    (1.13)
$P(E) = \sum_{x_i \in E} p(x_i)$    (1.14)
Example:
Suppose that a coin is tossed twice, so that the sample space is
$\Omega$ = {HH, HT, TH, TT}.
Let X represent the number of heads that can come up. Find the
probability function p(x).
Assuming that the coin is fair, we have P(HH) = P(HT) = P(TH) = P(TT) = 1/4.
Then P(X=0) = P(TT) = 1/4; P(X=1) = P(HT $\cup$ TH) = 1/4 + 1/4 = 1/2;
P(X=2) = P(HH) = 1/4.
The probability function is thus given by the table:
x      0     1     2
p(x)   1/4   1/2   1/4
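The same table can be produced by listing the sample space and counting heads. This is a minimal sketch assuming a fair coin, so that all four outcomes are equally likely; the variable names are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

omega = ["".join(t) for t in product("HT", repeat=2)]    # HH, HT, TH, TT
counts = Counter(outcome.count("H") for outcome in omega)
p = {x: Fraction(c, len(omega)) for x, c in sorted(counts.items())}

print(p)
assert all(value >= 0 for value in p.values())   # condition (1.12)
assert sum(p.values()) == 1                      # condition (1.13)
```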
Uniform and non-uniform distributions.
If all the outcomes of an experiment are equally probable, the
corresponding probability function is called uniform. Otherwise,
the probability function is non-uniform.
Example: On the face of a die with 6 dots, 5 dots are filled in, so that only the
central one is left. What is the probability function for this case? The value
X=1 will occur on average 2 times more often than for the balanced die. As a
result, p(1)=1/3, p(2)=p(3)=p(4)=p(5)=1/6, and p(6)=0.
Working together
Try it: Suppose we pick a letter at random from the word TENNESSEE. What is
the sample space and what is the probability function for the outcomes?
Challenge: For the two-dice experiment, find the probability function for
X = “Sum of two throws”. Use the table of Example 1.2.
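Once you have tried both exercises by hand, you can compare your results with a short computation. The helper `probability_function` below is our own illustration; it assumes that every listed outcome (each letter position, each cell of the two-dice table) is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def probability_function(outcomes):
    """Map each value to its probability, assuming equally likely outcomes."""
    outcomes = list(outcomes)
    counts = Counter(outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# Try It: a letter picked at random from TENNESSEE.
p_letter = probability_function("TENNESSEE")
# Challenge: the sum of two dice, one entry per cell of the 6 x 6 table.
p_sum = probability_function(a + b for a, b in product(range(1, 7), repeat=2))

print(p_letter)
print(p_sum)
```

Both results are non-uniform, even though the underlying outcomes are equally likely.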
Continuous distribution (preliminary remarks)
We will repeat this discussion later, with many extra details.
$\sum_i f(x_i) = 1$    (1.5.1)
$P(E) = \sum_{x_i \in E} f(x_i)$    (1.5.2)
Suppose that the circle has a unit circumference (we simply use
units in which $2 \pi R = 1$).
Continuous distribution. Probability density function (PDF).
Suppose that every point on the circle is labeled by its distance x
from some reference point x=0. The experiment consists of
spinning the pointer and recording the label of the point at the tip
of the pointer. Let X be the corresponding random variable. The
sample space is the interval [0,1). Suppose that all values of X are
equally possible. We wish to describe this in terms of probability. If
X were a discrete variable (such as a die), we would simply
assign the same fixed value of probability to all
outcomes: $p(x_i)$ = const.
However, for a continuous variable we must assign to each
outcome a probability p(x) = 0. Otherwise, we would not be able
to fulfill the requirement (1.13).
Something is obviously wrong!
We will come back to continuous distributions in Lecture 2, after
refreshing some math.
Let us now open Mathematica (for the rest of the lecture we will
practice with M).
Self-Test
1. For what number of students n does the probability of at least two having the same
birthday reach 0.7?
2. A card is drawn at random from an ordinary deck of 52 playing cards. Describe
the sample space if consideration of suits (hearts, spades, etc.) (a) is not, (b) is,
taken into account.
3. Referring to the previous problem, let A = "a king is drawn" and B = "a club is
drawn". Describe the events (a) $A \cup B$, (b) AB, (c) $AB^c$, (d) $A^c \cup B^c$,
(e) $(AB) \cup (AB^c)$.
4. Describe in words the events specified by the following subsets of
$\Omega$ = {HHH,HHT,HTH,HTT,THH,THT,TTH,TTT}:
(a) E = {HHH,HHT,HTH,HTT}; (b) E = {HHH,TTT}; (c) E = {HHT,HTH,THH};
(d) E = {HHT,HTH,HTT,THH,THT,TTH,TTT}
5. A die is loaded in such a way that the probability of each face turning up is
proportional to the number of dots on that face (for instance, six is three times as
probable as two). What is the probability of getting an even number in one throw?
6. Let A and B be events such that
$P(A \cap B) = 1/4$, $P(A) = 1/3$, and $P(B) = 1/2$.
What is $P(A \cup B)$?
7. A student must choose one of the subjects: art, geology, or psychology as an
elective. She is equally likely to choose art or psychology, and twice as likely
to choose geology.
What are the respective probabilities for each of the three subjects?