12. Examples: On the night of March 1, 1986, in Lorain, Ohio, John Doe was struck by
a speeding taxi as he crossed the street. The taxi was driving the wrong way down
a one-way street and did not stop. An eyewitness thought that the taxi was blue.
Lorain has only two taxi companies, Blue Cab and Green Cab. 85% of the taxis are
Green Cab. Tests have shown that the eyewitness is perfect in distinguishing cabs
from private automobiles, but under conditions approximating those of the night of
the accident he was able to identify the correct color of a taxi 80% of the time. Suppose
you are the judge, and suppose that “preponderance of the evidence” is interpreted to
mean a probability of more than 50 percent. Do you think it is more likely than not that
the taxi was blue rather than green? [answer:
P(B | eyewitness says it is blue)
= P(eyewitness says it is blue | B) P(B) / [P(eyewitness says it is blue | B) P(B) + P(eyewitness says it is blue | G) P(G)]
= (0.8 × 0.15) / (0.8 × 0.15 + 0.2 × 0.85) = 41.38%.
Since 41.38% < 50%, it is not more likely than not that the taxi was blue.]
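This posterior can also be checked with a few lines of code; the sketch below is a minimal Python illustration of the Bayes' rule computation above (variable names are chosen here for readability and are not from the original notes).

```python
# Bayes' rule for the taxi example: P(Blue | eyewitness says blue).
p_blue, p_green = 0.15, 0.85            # prior shares of Blue and Green cabs
p_says_blue_given_blue = 0.80           # eyewitness identifies the color correctly 80% of the time
p_says_blue_given_green = 0.20          # eyewitness misidentifies a Green cab as blue 20% of the time

numerator = p_says_blue_given_blue * p_blue
denominator = numerator + p_says_blue_given_green * p_green
posterior = numerator / denominator
print(f"P(Blue | says blue) = {posterior:.4f}")   # 0.4138, i.e. 41.38%
```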
2.2 Random Variable
1. A random variable (r.v.) X is a function from a sample space S into the real numbers.
The sample space of the r.v., X, is the range of the r.v.
• Why do we want to introduce random variables? The reason is that in many
experiments it is easier to deal with a summary variable than with the original
probability structure. For example, in an opinion poll, we may ask 50 people
whether they favor Gore or Bush. If we record a ‘1’ for Gore and a ‘0’ for Bush,
the sample space for this experiment will have 2^50 elements (too big!!). It may
be that the only quantity of interest is the number of people who favor Gore;
then we can define a random variable X = the number of 1s recorded out of 50.
The sample space for X is the set of integers {0, 1, ..., 50}.
• Example: “toss two coins”. Recall the original sample space {HH, HT, TH, TT}.
We can define a r.v. as the number of heads in the two tosses, which will correspond to the following map:
X(HH) = 2
X(HT) = X(TH) = 1
X(TT) = 0.
The number of heads in the two tosses is random (or stochastic) because it
depends on the outcome of the random experiment.
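To make the idea that a r.v. is a function on the sample space concrete, here is a minimal Python sketch of the two-coin example (purely illustrative):

```python
# A random variable is a function from the sample space into the reals.
sample_space = ["HH", "HT", "TH", "TT"]

def X(outcome):
    """Number of heads in the two tosses."""
    return outcome.count("H")

for s in sample_space:
    print(s, "->", X(s))    # HH -> 2, HT -> 1, TH -> 1, TT -> 0
```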
2. Notational convention: r.v.s will always be denoted by uppercase letters, and the
realized values of the r.v. will be denoted by the corresponding lowercase letter, e.g.,
the r.v. X can take the value x.
3. Suppose that we have a random experiment with sample space S = {s1, ..., sn}
and probability measure P. Now suppose that we define a r.v. X with range X =
{x1, ..., xm}. How can we define a probability function PX on X? We will observe X = xi
iff the outcome of the experiment is an sj such that X(sj) = xi. Hence
PX(X = xi) = P(sj ∈ S : X(sj) = xi).
• Example: Consider an experiment of tossing a fair coin three times. Define the
r.v. X to be the number of heads obtained in the three tosses. What is S? What
is X and what is its range X? What is PX(1)? [see the sketch below]
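One way to answer the last question is to enumerate the eight equally likely outcomes; the sketch below (Python, illustrative only) computes PX(1) by counting the outcomes s with X(s) = 1.

```python
from itertools import product

# Sample space of three fair coin tosses: 8 equally likely outcomes.
S = ["".join(t) for t in product("HT", repeat=3)]

def X(s):
    return s.count("H")     # number of heads in the three tosses

# P_X(X = 1) = P({s in S : X(s) = 1})
p_one = sum(1 for s in S if X(s) == 1) / len(S)
print(p_one)                # 0.375 = 3/8
```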
4. If the range of X is finite or countably infinite, we call the r.v. a discrete r.v.; a
continuous r.v. is a r.v. that can take on any value in some interval; a r.v. is called
mixed if, restricted to some range, it is discrete and, restricted to another range, it is
continuous.
2.3 Cumulative Distribution Function (cdf)
1. The c.d.f. of a random variable X, denoted by FX(x), is defined by
FX(x) = PX(X ≤ x) for all x.
• Example: Consider the experiment of tossing three fair coins, and let X =
number of heads observed. The c.d.f. of X is

FX(x) =
  0    if x ∈ (−∞, 0)
  1/8  if x ∈ [0, 1)
  1/2  if x ∈ [1, 2)
  7/8  if x ∈ [2, 3)
  1    if x ∈ [3, ∞).
[step function graph representation] Note that FX satisfies the following properties [these are general
properties of c.d.f.s: any function FX that satisfies the following three properties
is the c.d.f. of some random variable]:
— lim_{x→−∞} FX(x) = 0 and lim_{x→∞} FX(x) = 1;
— FX (x) is nondecreasing in x;
— FX (x) is right continuous.
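To see the step-function shape and the right continuity in this example, one can evaluate FX near its jump points; below is a small Python sketch (illustrative only) of the piecewise c.d.f. given above.

```python
def F(x):
    """c.d.f. of the number of heads in three fair coin tosses."""
    if x < 0:
        return 0.0
    elif x < 1:
        return 1 / 8
    elif x < 2:
        return 1 / 2
    elif x < 3:
        return 7 / 8
    else:
        return 1.0

# At each jump point the function takes the upper value (right continuity).
for x in [-0.5, -1e-9, 0.0, 0.5, 1.0, 2.0, 2.999, 3.0, 10.0]:
    print(f"F({x}) = {F(x)}")
```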
• Example: Suppose we do an experiment that consists of tossing a coin until a
head appears. Let p = the probability of a head on any given toss. Define a r.v.
X = number of tosses required to get a head. Then for any x = 1, 2, ...,
P(X = x) = (1 − p)^(x−1) p
since we must get x − 1 tails followed by a head for the event to occur and all
trials are independent. Hence for any positive integer x,
P(X ≤ x) = Σ_{i=1}^{x} P(X = i) = Σ_{i=1}^{x} (1 − p)^(i−1) p
         = [1 − (1 − p)^x] / [1 − (1 − p)] · p = 1 − (1 − p)^x,   x = 1, 2, ...
Hence FX(x) = 1 − (1 − p)^x for x = 1, 2, ... [write out the complete FX; see below] This is
called the geometric distribution.
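As the bracketed note asks, one way to write out the complete c.d.f. (filling in the noninteger values of x, where FX is flat) is

FX(x) = 0 for x < 1, and FX(x) = 1 − (1 − p)^⌊x⌋ for x ≥ 1,

where ⌊x⌋ denotes the greatest integer less than or equal to x.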
• Example: consider a continuous c.d.f. [logistic distribution]
FX(x) = 1 / (1 + e^(−x)).
Why is this a c.d.f.? [verify the three defining properties of a c.d.f.; see below]
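One way to verify the three properties for this FX: (i) as x → −∞, e^(−x) → ∞, so FX(x) → 0, and as x → ∞, e^(−x) → 0, so FX(x) → 1; (ii) dFX(x)/dx = e^(−x)/(1 + e^(−x))^2 > 0 for all x, so FX is strictly increasing and hence nondecreasing; (iii) FX is continuous everywhere (a ratio of continuous functions with nonvanishing denominator), so in particular it is right continuous.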
2.4 Probability Density Function [pdf] and Probability Mass Function [pmf]
1. Associated with a r.v. X and its cdf FX is another function, called the probability mass
function (pmf) for a discrete r.v. and the probability density function (pdf) for a continuous
r.v. Both the pdf and the pmf are concerned with “point probabilities” of r.v.s. Notational
convention: we use an uppercase letter for the cdf and the corresponding lowercase letter
for the pmf or pdf.
2. The pmf, fX(x), of a discrete r.v. X is given by
fX (x) = P (X = x) for all x.
• The pmf for the geometric distribution is
fX(x) = P(X = x) =
  (1 − p)^(x−1) p  for x = 1, 2, ...
  0                otherwise.
• For a discrete r.v., to get the cdf from the pmf, we do the following:
FX(x) = P(X ≤ x) = Σ_{z=−∞}^{x} fX(z).
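As a concrete check of this formula, the sketch below (Python; p = 0.3 is an arbitrary choice for illustration) sums the geometric pmf and compares the result with the closed form 1 − (1 − p)^x obtained earlier.

```python
p = 0.3   # arbitrary success probability, for illustration only

def geometric_pmf(z, p):
    return (1 - p) ** (z - 1) * p if z >= 1 else 0.0

for x in [1, 2, 5, 10]:
    cdf_by_sum = sum(geometric_pmf(z, p) for z in range(1, x + 1))
    cdf_closed_form = 1 - (1 - p) ** x
    print(x, cdf_by_sum, cdf_closed_form)   # the two values agree
```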
3. The pdf, fX(x), of a continuous r.v. X is the function that satisfies
FX(x) = ∫_{−∞}^{x} fX(t) dt   for all x.
When fX is continuous, then by the Fundamental Theorem of Calculus, we have
fX(x) = dFX(x)/dx.
• Example: For the logistic distribution, we have its cdf
FX(x) = 1 / (1 + e^(−x)),
hence its pdf is
fX(x) = dFX(x)/dx = e^(−x) / (1 + e^(−x))^2.
The area under the curve fX(x) gives us the interval probabilities [graph representation]:
P(a ≤ X ≤ b) = FX(b) − FX(a) = ∫_{a}^{b} fX(x) dx.
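Both sides of this identity can be evaluated numerically for the logistic example; the sketch below (Python, with an arbitrarily chosen interval [a, b], and assuming scipy is available) compares FX(b) − FX(a) with a numerical integral of fX.

```python
import math
from scipy.integrate import quad   # numerical integration

F = lambda x: 1 / (1 + math.exp(-x))                      # logistic cdf
f = lambda x: math.exp(-x) / (1 + math.exp(-x)) ** 2      # logistic pdf

a, b = -1.0, 2.0                       # arbitrary interval, for illustration
prob_from_cdf = F(b) - F(a)
prob_from_pdf, _ = quad(f, a, b)       # area under the pdf over [a, b]
print(prob_from_cdf, prob_from_pdf)    # the two values agree up to numerical error
```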
4. The support of a r.v. X is defined as:
Supp[X] = {x : fX(x) > 0}.
That is, the support of a r.v. is the set of values that can arise with positive density.
5. How to check whether a function fX is a pdf (or pmf) of some r.v.? It is a pdf (or
pmf) if and only if
• fX(x) ≥ 0 for all x;
• Σ_x fX(x) = 1 (for a pmf) or ∫_{−∞}^{∞} fX(x) dx = 1 (for a pdf).
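As an illustration of this check, the sketch below verifies numerically that the logistic pdf from the previous example is nonnegative and integrates to (approximately) 1; it again assumes scipy is available.

```python
import math
from scipy.integrate import quad

f = lambda x: math.exp(-x) / (1 + math.exp(-x)) ** 2   # logistic pdf

# (i) nonnegativity, checked on a grid of points from -20 to 20
assert all(f(x / 10) >= 0 for x in range(-200, 201))

# (ii) total probability: the tails beyond +/-50 are negligible here
total, _ = quad(f, -50, 50)
print(total)   # approximately 1.0
```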
6. Notation: The expression “X has a distribution given by FX(x)” is abbreviated
symbolically by “X ∼ FX(x)”, where we read the symbol “∼” as “is distributed as”.
We can similarly write X ∼ fX(x).