MSc Regulation and Competition
Quantitative Techniques
QT week 2
In this section
1. random numbers
2. introduction to probability; different types of outcome
3. the uniform distribution (probability density function)
4. some basic theorems
Random variables and random numbers
Why are they relevant to data analysis?
What do we mean when we say something is random?
Answer: We cannot predict outcome exactly
Numbers in economics are, to an extent, generated in the same way as other
kinds of random numbers, such as the outcome of a gamble:
Roulette wheel
Coin toss
etc.
For this reason, statistics courses often start out talking about the
mathematics of games of chance. This is called probability theory.
Some elements of probability theory
Outcomes of games of chance fall within a set of possibilities, or event space
or domain e.g.
H or T
0-36
Any real number between zero and 1
The set of all positive real numbers
The first two are discrete, finite domains.
The last two are continuous and contain infinitely many possible outcomes.
For finite domains we can talk about the probability that any particular
observation will fall to a certain outcome. e.g. P(H) or P(T)
Sometimes written as Pr(H) etc.
For continuous, infinite sample spaces we need a different approach (see
below).
City University
What is probability?
Probability may be defined in two ways:
1.) By prior specification:
A “fair” coin will have P(H) = ½
2.) By empirical observation:
What proportion of throws is Heads?
The problem is that empirical observation would need an infinite number of
trials to be exactly accurate. The more trials, the more accurate the estimate.
But from the idea of an empirical measure comes the following definition:
The probability of an outcome A is the
limit,
as the number of trials goes to infinity,
of the proportion of trials with outcome A
The idea of the limit is that the proportion settles down and wanders around
less and less as the number of trials increases
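The settling-down behaviour can be seen in a short simulation. The following is a minimal sketch in Python rather than Excel; the function name proportion_of_heads and the fixed seed are assumptions made for illustration, not part of the course material.

```python
import random


def proportion_of_heads(n_trials, p_heads=0.5, seed=0):
    """Simulate n_trials coin tosses and return the proportion that are Heads.

    p_heads=0.5 is the "fair coin" prior specification; the seed is an
    illustrative assumption that makes the run repeatable.
    """
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_trials) if rng.random() < p_heads)
    return heads / n_trials


if __name__ == "__main__":
    # The proportion should wander less and less around 0.5 as n grows.
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, proportion_of_heads(n))
```

Changing the seed gives a different path, but the proportions for large numbers of trials should still lie close to ½.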
What about continuous sample spaces?
e.g. consider a uniform distribution between 0 and 1, as generated by the
function Rand() in Excel.
The probability of any particular value e.g. exactly 0.5 is..... zero.
But we can talk about the probability that it will fall
between 0 and 0.1..............................
between 0 and 0.4...............................
between 0.2 and 0.8.........................
(Students to answer)
Figure 1: Uniform distribution between 0 and 1 (vertical axis: probability density; horizontal axis: outcome x, from 0 to 1)
Figure 2: Probability as area under pdf (vertical axis: relative probability, i.e. probability density; horizontal axis: outcome x, with 0, 0.4 and 1 marked)
How high does the line have to be so that the shaded area represents
P(x<0.4)?
What is the total area under the probability density function [p.d.f.] between 0
and 1?
Suppose now we have a uniform distribution between 0 and 2: what is the
height of the p.d.f?
What is the average of x for the first uniform distribution? the second?
Can we find a region around the average where 90% of the observations are
likely to lie?
95%?
We shall see that these regions are important in statistical analysis.
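One way to check answers to the questions above is to treat probability as width times height under a flat p.d.f. The sketch below assumes a uniform distribution on [low, high]; the function names are hypothetical and chosen only for illustration.

```python
def uniform_pdf_height(low, high):
    """Height of the flat p.d.f.: the total area (high - low) * height must be 1."""
    return 1.0 / (high - low)


def interval_probability(a, b, low=0.0, high=1.0):
    """P(a <= x <= b) for a uniform distribution on [low, high]."""
    a, b = max(a, low), min(b, high)  # clip the interval to the domain
    return max(b - a, 0.0) * uniform_pdf_height(low, high)


def central_region(coverage, low=0.0, high=1.0):
    """Interval centred on the average that contains the given share of outcomes."""
    mean = (low + high) / 2
    half_width = coverage * (high - low) / 2
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    print(uniform_pdf_height(0, 2))
    print(interval_probability(0.0, 0.4))
    print(central_region(0.90), central_region(0.95))
```

The central region here is symmetric about the average; other regions with the same area also contain 90% of the outcomes, so this is one choice among many.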
Some basic theorems
A is a possible outcome with probability P(A)
Theorem 1. If à represents every possible outcome that is not A then P(Ã) =
1- P(A).
In other words P(A) +P(Ã) = 1
e.g. coin tossing P(H) + P(T) = 1
uniform distribution: A = {0 ≤ x < 0.4} then Ã = {0.4 ≤ x ≤ 1}
Comment: A and à are said to exhaust all the possibilities.
Theorem 2. If A and B are mutually exclusive
P(A or B) = P(A∪B) = P(A) + P(B).
e.g. if A = {0 ≤ x < 0.4} and B = {0.8 ≤ x < 1}
P(A) = 0.4, P(B) = 0.2, P(A or B) = 0.6
Die throwing: A = {6}, B = {1}, P(A) = 1/6, P(B) = 1/6, P(1 or 6) = 2/6
Note that the expression A or B is sometimes written as the union A∪B.
Likewise the event A and B is sometimes written as the intersection A∩B.
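Theorems 1 and 2 can be checked by counting outcomes for the die-throwing example. This is a minimal sketch assuming a fair six-sided die; the helper prob and the variable names are illustrative assumptions.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))  # faces of a fair six-sided die


def prob(event):
    """Probability of an event (a set of faces), each face having probability 1/6."""
    return Fraction(len(SAMPLE_SPACE & event), len(SAMPLE_SPACE))


A = {6}
B = {1}
not_A = SAMPLE_SPACE - A  # every possible outcome that is not A

# Theorem 1: P(not A) = 1 - P(A)
complement_holds = prob(not_A) == 1 - prob(A)

# Theorem 2: A and B share no outcomes, so P(A or B) = P(A) + P(B)
mutually_exclusive = len(A & B) == 0
addition_holds = prob(A | B) == prob(A) + prob(B)

if __name__ == "__main__":
    print(prob(A), prob(not_A), complement_holds)
    print(prob(A | B), mutually_exclusive, addition_holds)
```

In Python the set operators | and & play the roles of union and intersection.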
A Venn diagram illustrates:
Figure 3: Venn diagram (two overlapping regions A and B inside the event or sample space; the overlap is A∩B)
Draw a boundary round (A∪B)
Outcomes, or groups of outcomes can be represented by sets of points in the
sample space. If we go further, and give the whole sample space an area of
1, then areas of the outcomes can be taken to represent probabilities. (Not all
examples of Venn diagrams have this strict interpretation.)
The diagram is useful in understanding some of the other rules.
We can now see that Theorem 2 applies to the special case where A∩B is empty,
i.e. contains no outcomes in the event space, because A and B are mutually
exclusive, so P(A∩B) = 0.
Where they are not mutually exclusive we obtain
Theorem or rule 3. The rule of addition for non-mutually exclusive events
P(A or B) = P(A∪B) = P(A) + P(B) – P(A∩B).
In terms of the intuitive Venn diagram if we do not subtract P(A∩B) we shall
be double counting this area.
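The double counting can be made concrete with two overlapping events on a single die throw. The events chosen here (A = an even number, B = a number greater than 3) are assumptions for illustration and do not come from the notes.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))  # faces of a fair six-sided die


def prob(event):
    """Probability of an event (a set of faces), each face having probability 1/6."""
    return Fraction(len(SAMPLE_SPACE & event), len(SAMPLE_SPACE))


even = {2, 4, 6}        # illustrative event A
above_three = {4, 5, 6}  # illustrative event B; overlaps A in {4, 6}

p_union = prob(even | above_three)
naive_sum = prob(even) + prob(above_three)               # counts {4, 6} twice
corrected = naive_sum - prob(even & above_three)          # rule 3

if __name__ == "__main__":
    print(p_union, naive_sum, corrected)
```

The naive sum exceeds the true probability of the union by exactly P(A∩B), which is the area counted twice in the Venn diagram.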
There are two more useful rules:
Rule 4. Rule of multiplication for independent events
Suppose the trial consists of throwing a die twice
Let A = {6 on first throw} P(A)= 1/6
B = {6 on second throw} P(B) =1/6
Theorem 4: If the events A and B are statistically independent then
P(A and B) = P(A) times P(B)
In the above example P(6 on both throws) = (1/6) x (1/6) =1/36
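The two-throw example can be verified by listing all 36 equally likely pairs. This is a sketch assuming a fair die and independent throws; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All (first throw, second throw) pairs, each assumed equally likely.
OUTCOMES = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability that a (first, second) pair satisfies the predicate `event`."""
    favourable = sum(1 for pair in OUTCOMES if event(pair))
    return Fraction(favourable, len(OUTCOMES))


def six_first(pair):
    return pair[0] == 6


def six_second(pair):
    return pair[1] == 6


def six_on_both(pair):
    return six_first(pair) and six_second(pair)


p_a = prob(six_first)
p_b = prob(six_second)
p_both = prob(six_on_both)

if __name__ == "__main__":
    print(p_a, p_b, p_both, p_both == p_a * p_b)
```

Counting pairs directly and multiplying the two separate probabilities give the same answer, which is what statistical independence means here.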
Rule 5. Rule of multiplication for dependent events
P(A and B) = P(A given B) x P(B) or, using more formal notation:
P(A∩B) = P(A|B) x P(B)
Switching A and B in Rule 5 gives us
P(B∩A) = P(B|A) x P(A)
Conditional probability
Expressions like P(A| B) are called conditional probabilities.
Using the Venn diagram in Figure 3 we can see what P(A|B) ought to be:
P(A|B) = P(A∩B)/P(B)
In fact, a little simple algebra on this definition gives us the multiplication
rule above: multiplying both sides by P(B) gives P(A∩B) = P(A|B) x P(B).
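A small numerical check of the definition and of Rule 5, again by counting outcomes on one fair die. The events used (A = a six, B = an even number) are assumptions chosen for illustration.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))  # faces of a fair six-sided die


def prob(event):
    """Probability of an event (a set of faces), each face having probability 1/6."""
    return Fraction(len(SAMPLE_SPACE & event), len(SAMPLE_SPACE))


def conditional(a, b):
    """P(A|B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if prob(b) == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return prob(a & b) / prob(b)


six = {6}         # illustrative event A
even = {2, 4, 6}  # illustrative event B

p_six_given_even = conditional(six, even)
p_even_given_six = conditional(even, six)

# Rule 5 in both orderings gives the same joint probability.
joint_via_b = p_six_given_even * prob(even)
joint_via_a = p_even_given_six * prob(six)

if __name__ == "__main__":
    print(p_six_given_even, p_even_given_six, joint_via_b, joint_via_a)
```

Knowing that the throw is even changes the probability of a six, so these two events are dependent and Rule 4 would not apply to them.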
Exercise
Do problems 2, 3, 9 and 10 out of Chapter 2 of Ashenfelter, Levine and
Zimmerman (page 24).
Then attempt the following. This is a simplified version of a problem for a
University entrance test from the 1960s.
Two chimpanzees, Mitzi and Maurice, are playing Tic Tac Toe or noughts and
crosses. The winner is the first to get three 0s or Xs in a row.
Maurice goes first. What is the probability that he will win with the following
outcome after just three moves? (Mitzi’s 0s are not shown but could be
anywhere except where the Xs are)
[Grid: Maurice's three Xs in a line]
A more challenging problem. How many ways can Maurice win in three
moves? Are they mutually exclusive? Independent?
Can you use the above theorems to calculate the probability that Maurice will
win in three moves?
What is the probability that Mitzi will win in three, despite moving second?
References
Ashenfelter, Levine and Zimmerman, Statistics and Econometrics, Chapter 2.
See also e.g. Salvatore and Reagle, Statistics and Econometrics (Schaum's outline), pp. 37-38.