Random Variables
density and distribution function
Part 1 Discrete case
MSIS 385
Fall 1999
Farid Alizadeh
Faculty of Management
Rutgers University
September 14, 1999
In this note I will review the concepts of random variable, density function, and distribution function. This is a minimal set of concepts that you should be familiar with in order to grasp the other statistical concepts that you will encounter in the course.
1 Sample space and events
(Please review chapter 6 of your text.) In the situations that arise in statistical analysis we are generally dealing with some sort of experiment and expect a variety of outcomes. The term sample space refers to the collection of all possible outcomes that could arise from an experiment. For example, if we are tossing a coin then the possible set of outcomes is heads or tails, thus the sample space is {head, tail}. If you are throwing a regular (that is, cubic shaped) die with each face bearing one, two, ..., or six dots, then the sample space consists of 6 possible faces. When a polling company announces the results of a, say, nationwide poll on presidential candidates, it generally implies that its poll takers have chosen a random sample of people among citizens of the United States; thus here the sample space is all the citizens of the US. Finally, when we talk about the temperature in New Brunswick at noon on October 1st, 1999, the sample space is the set of all possible numbers that can turn up, in this case an infinite set of real numbers anywhere between, say, 0°F and 120°F. Thus sample spaces may be finite or infinite.
An event is a portion of the sample space that may be of interest to us. When we toss a coin and we want to know if the outcome is heads or not, the event under consideration is the singleton set {head}, a subset of the sample space. When we throw a die and would like to know if the outcome is a face with three or fewer dots, then the event of interest consists of those faces with one, two, or three dots. Finally, if we are interested in knowing whether the temperature in New Brunswick on October 1st, 1999 will be between 60 and 80 degrees, then the infinite set of real numbers between 60 and 80 makes up our event.
2 Random variables
Sample spaces and events do not necessarily have numerical content. A sample space may be made up of people, cities, playing cards and so on. In many cases, however, statistics deals with numerical content. The concept of random variable was invented to give numerical content to sample spaces and events. A random variable is a function that associates to each element of the sample space a real number. At this point we impose no restriction on the numbers themselves; however, particular situations may pose additional restrictions, such as requiring the number to be nonnegative or integral. For example, consider the set of all Rutgers full time students in the current semester. We may be interested in their age. Consider the experiment of choosing a student at random and inquiring about his or her age. In this case age is a random variable on the sample space of Rutgers students: it associates to each student their age, a number.
In another experiment we may toss a coin three times and be interested in the number of times a head turns up. Here the sample space consists of 8 elements:
S = {hhh, hht, hth, htt, thh, tht, tth, ttt}
The random variable “number of heads that turn up” takes one of the values 0, 1, 2 or 3. If we denote this random variable by the symbol H, then H(hhh) = 3, H(hht) = H(hth) = H(thh) = 2, H(htt) = H(tht) = H(tth) = 1 and H(ttt) = 0. (Remember a random variable is a function defined on the sample space, thus we are justified in writing H(hht) = 2.) Whenever a sample space has a finite number of elements in it, every random variable defined on it is a discrete random variable. It is possible, however, that the sample space, and even the set of possible values attained by the random variable, is infinite. In this case not every arbitrary function is acceptable as a random variable. We restrict ourselves to continuous random variables in the infinite case; we will define this term shortly.
3 Density and distribution functions for discrete random variables
Two fundamental functions are definable on most random variables: the probability density function (pdf for short) and the cumulative distribution function (cdf, or simply the distribution function, for short). We will first define these concepts for discrete random variables.
Figure 1: A random variable indicating the number of heads in a sequence of three coin tosses
Again consider the experiment of tossing a coin three times and the random variable which indicates the number of heads that turn up. Assume that the coin is fair, and thus the probability of a head turning up at each toss is equal to the probability of a tail turning up. This implies that in tossing our coin three times, all 8 possible outcomes in the sample space S described above are equally likely. (Note that 8 = 2 × 2 × 2 = 2^3.) Now consider the following questions and their answers:
1. What is the probability that no heads will turn up? In other words, we need to calculate Pr[H = 0]. Since the event that there are no heads is E0 = {ttt}, which has only 1 element, the answer is

Pr[H = 0] = (number of elements in E0) / (number of elements in S) = 1/8
2. What is the probability that only one head turns up? In other words, we need to find Pr[H = 1]. The event of interest here consists of all outcomes with only one head, that is E1 = {htt, tht, tth}. Thus,

Pr[H = 1] = (number of elements in E1) / (number of elements in S) = 3/8
3. What is the probability that two heads turn up? In other words, what is the value of Pr[H = 2]? The event of interest is all outcomes with two heads, that is E2 = {hht, hth, thh}. Thus,

Pr[H = 2] = (number of elements in E2) / (number of elements in S) = 3/8
4. What is the probability that three heads turn up? In other words, we want Pr[H = 3]. The event of interest again consists of only one element: E3 = {hhh}. Thus

Pr[H = 3] = (number of elements in E3) / (number of elements in S) = 1/8
What we have calculated above is the probability density function of the random variable H, the number of heads in three tosses of a fair coin.
Definition 3.1 The pdf is a function that associates to each possible value of a discrete random variable a number equal to the probability of that value being attained by the random variable, assuming all elements of the sample space are equally likely. Thus if f is the probability density function of a random variable X, then

f(x) = Pr[X = x]

For instance, in the above example, if the probability density function of the random variable H is f, then f(0) = 1/8, f(1) = 3/8, f(2) = 3/8 and f(3) = 1/8.
• The pdf of a random variable is a function that takes a numerical value as input and returns a real number between 0 and 1 inclusive; this number is a probability.
• If we add up f(x) over all possible values x of the random variable, the sum is one:

Σ_x f(x) = 1

In the example above f(0) + f(1) + f(2) + f(3) = 1/8 + 3/8 + 3/8 + 1/8 = 1.
• Depending on the situation, the function f may be given in a variety of formats. Sometimes we simply tabulate the values: for each possible value of x we provide f(x). This is OK for very small sample spaces, but becomes very cumbersome or practically impossible for larger sample spaces. Another alternative is to provide a formula, or even a procedure, for calculating f(x) given x. Sometimes the formula or procedure is called the probability law of the random variable. For instance, you may hear “the random variable X follows the binomial law.”
Let us now ask a set of new questions regarding our experiment of tossing the fair coin three times:
1. What is the probability that two or fewer heads turn up? In other words, what is the value of Pr[H ≤ 2]? Since the set of outcomes that have two or fewer heads in them is E = {ttt, tth, tht, htt, hht, hth, thh}, the answer is 7/8. A better way is to observe that the event H ≤ 2 is the same as H = 0 or H = 1 or H = 2; since we have already calculated the probabilities of these three events, we can simply add them to get our answer:

Pr[H ≤ 2] = Pr[H = 0] + Pr[H = 1] + Pr[H = 2] = 1/8 + 3/8 + 3/8 = 7/8.
2. What is the probability that the number of heads is smaller than 1.3? This is a legitimate question even though the number of heads is always an integer, and never equal to 1.3. In this case Pr[H ≤ 1.3] = Pr[H ≤ 1] = 1/8 + 3/8 = 1/2.
What we have done above is to calculate the cumulative distribution function of H. The cumulative distribution function (or simply the distribution function, or cdf) of a random variable X is a function that, just like the pdf, takes a real number from the set of all possible values of the random variable and returns another real number between zero and one. If the pdf is f(x), then the cdf is F(x) and is defined as:

F(x) = Pr[X ≤ x]
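For a discrete random variable the cdf is obtained by summing the pdf over all values not exceeding x. A sketch for the coin example (the names `pdf` and `F` are mine):

```python
from fractions import Fraction

# pdf of H, the number of heads in three tosses of a fair coin
pdf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# cdf: F(x) = Pr[H <= x], defined for ANY real x, not just attained values
def F(x):
    return sum(p for value, p in pdf.items() if value <= x)

print(F(2))    # 7/8
print(F(1.3))  # same as F(1): 1/2
```

Note that F(1.3) is perfectly well defined even though H never equals 1.3, matching question 2 above.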
The concepts of probability density function and distribution function are entirely determined by the values of the random variable and the formulas or probability laws that describe the functions F(x) and f(x). As such they do not directly depend on the sample space itself. Mathematicians and statisticians spend time studying and analyzing the properties of various distributions. In practical and real world situations, our job is to identify an appropriate sample space and events, define a suitable random variable, and then identify which one of the known and well-studied distribution laws applies to this particular experiment.
3.1 Binomial distribution
To illustrate this point, let us now study one of the most important discrete distribution laws that we will encounter repeatedly in the future, namely the binomial distribution law. This is a generalization of our coin tossing experiment. Suppose that an experiment has two outcomes, Success (S) and Failure (F). Furthermore, assume that in our experiment Success or Failure occur randomly with Pr[Success] = p and Pr[Failure] = q = 1 − p. Suppose that the experiment is conducted n times and we are interested in the number of successes. Here is the setup in a nutshell:
• Parameters: The numbers p and n are parameters; they could be different in each experiment, and any formulas we derive may depend on them.
• Sample space: There are 2^n possible outcomes that can occur; these outcomes may be represented by a sequence of S and F letters. For instance, if n = 2, in two experiments we may have the 2^2 = 4 outcomes {SS, SF, FS, FF}. These 2^n outcomes constitute our sample space.
• Random variable: The random variable is the number of occurrences of Success in a sequence of n experiments. Let us call it K. Thus, for example, if n = 4, the value of the random variable at the sequence SFFS equals 2, since for this particular outcome there are two successes. Put another way, the random variable K counts the number of occurrences of S in a sequence of n letters made up of S and F symbols.
• The pdf of the random variable K is called the binomial density function with parameters n and p, usually denoted by b(k; n, p); it is the probability of exactly k successes in a sequence of n trials. Notice that here the variable of the function is k, while n and p are given, fixed parameters. We saw in the coin tossing experiment how to calculate this function for n = 3 and p = 1/2 (except that there, Success was called head and Failure was called tail). But you should appreciate that for even moderately large n the tabulating approach is not going to be practical. Fortunately there is a simple formula for the binomial density function (and chapter 7 of your text elaborates on it somewhat more):

Pr[K = k | n, p] = b(k; n, p) = (n choose k) p^k q^(n−k), where (n choose k) = n! / (k!(n − k)!)

and n! = 1 × 2 × · · · × n is the factorial function.
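The formula translates directly into code; Python's standard library provides the binomial coefficient (n choose k) as `math.comb`. A sketch, checked against the coin example where n = 3 and p = 1/2 (the function name `b` mirrors the notation above):

```python
from math import comb, factorial

# b(k; n, p) = (n choose k) * p^k * q^(n-k), with q = 1 - p
def b(k, n, p):
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

# comb(n, k) is exactly n! / (k! (n-k)!); spot-check against the factorial form
assert comb(5, 2) == factorial(5) // (factorial(2) * factorial(3))

# The coin-tossing pdf is the special case n = 3, p = 1/2
print([b(k, 3, 0.5) for k in range(4)])  # [0.125, 0.375, 0.375, 0.125]
```

These values are exactly the 1/8, 3/8, 3/8, 1/8 computed earlier by enumeration.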
• For the cumulative distribution function, or cdf, of this random variable there is no simple formula; rather, we can only express it in the form of a summation:

Pr[K ≤ k | n, p] = B(k; n, p) = b(0; n, p) + b(1; n, p) + · · · + b(k − 1; n, p) + b(k; n, p)

It is quite tedious to compute the cdf of the binomial distribution as given above, especially if n is large, on the order of thousands or more. In a few lectures we will see that there are quite efficient ways of approximating the binomial distribution by easier-to-compute functions.
• Now, what are the mean and variance of a binomial distribution? It turns out that

µK = np and σK² = npq
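These closed forms can be checked numerically against the definitions µ = Σ k·b(k) and σ² = Σ (k − µ)²·b(k). A quick sketch, using n = 10 and p = 0.3 as an arbitrary example:

```python
from math import comb, isclose

def b(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
q = 1 - p

# Mean and variance computed directly from the pdf
mu = sum(k * b(k, n, p) for k in range(n + 1))
var = sum((k - mu)**2 * b(k, n, p) for k in range(n + 1))

# They agree with the closed forms np and npq
assert isclose(mu, n * p)        # np = 3.0
assert isclose(var, n * p * q)   # npq = 2.1
```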