STA111 - Lecture 4
Random Variables, Bernoulli, Binomial, Hypergeometric
1 Introduction to Random Variables
Random variables are functions that map elements in the sample space to numbers (technically, random
variables have to satisfy some mathy conditions, but we won’t worry about that here; if you’re really interested, you can take STA711 in a few years). Today we will work with random variables that can take on
values on countable subsets of R. Later in the course (probably next time) we will work with “continuous”
random variables that take on values on (dense) subsets of R. As always, we’ll try to digest the new concept
with some examples.
Examples:
• Suppose we’re flipping a coin twice. An example of a random variable would be “number of tails”,
which we will denote X1. This is a table summarizing the values that X1 takes on:

outcome        X1
heads, heads   0
heads, tails   1
tails, heads   1
tails, tails   2
For example, we can compute probabilities of the type P (X1 = 1) = P ({heads, tails}∪{tails, heads}) =
1/2, or P (X1 > 0) = 1 − P (X1 = 0) = 3/4. As you can see, working with random variables is intuitive
and natural, and it’s pretty convenient notation.
• Suppose we’re rolling a die twice. An example of a random variable here would be adding up the
outcomes. If we call it X2, a table with some values is:

outcome   X2
1,1       2
1,2       3
2,1       3
2,2       4
...       ...
And we can find probabilities like P (X2 = 3) = 2/36 or P (2 < X2 ≤ 4) = P (X2 = 3) + P (X2 = 4) =
2/36 + 3/36 = 5/36.
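Probabilities like these can be checked by brute-force enumeration of the sample space. Here is a minimal Python sketch for the two-die case (the helper name `prob` is our own, not part of the lecture):

```python
import itertools
from fractions import Fraction

# All 36 equally likely outcomes of rolling a die twice
outcomes = list(itertools.product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event: the fraction of outcomes satisfying it."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

print(prob(lambda o: sum(o) == 3))      # P(X2 = 3) = 2/36 = 1/18
print(prob(lambda o: 2 < sum(o) <= 4))  # P(2 < X2 <= 4) = 5/36
```

Using `Fraction` keeps the answers exact instead of introducing floating-point rounding.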
Exercise 1. Come up with 3 examples of random variables and give an example of a probability using random
variable notation for each of them (i.e. of the type P(X = k), P(X ≤ k), P(X ≠ k), etc.).
Now we’re going to spend some time introducing different “types” of random variables that can be used for
modeling random phenomena.
1.1 Bernoulli
Suppose X is a random variable that can only take on the values 1 or 0 with probabilities P (X = 1) = p and
P (X = 0) = 1 − p. Then X is said to have a Bernoulli distribution with probability of “success” p, denoted
X ∼ Bernoulli(p).
Examples:
• We’re flipping a coin once and our random variable is X1 = 1 if the outcome is heads and X1 = 0 if
the outcome is tails. Then, X1 ∼ Bernoulli(1/2).
• We’re rolling a die and our random variable takes on the value X2 = 1 if the outcome is strictly greater
than 4 and X2 = 0 otherwise. Then, X2 ∼ Bernoulli(1/3).
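A quick simulation can sanity-check a Bernoulli probability like the die example above. A sketch (the seed and the number of trials are arbitrary choices of ours):

```python
import random

random.seed(0)
n_trials = 100_000
# X2 = 1 when a fair die comes up strictly greater than 4 (a 5 or a 6)
successes = sum(1 for _ in range(n_trials) if random.randint(1, 6) > 4)
print(successes / n_trials)  # should be close to 1/3
```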
1.2 Binomial
Suppose we repeat a “Bernoulli experiment” n times independently and we add up the outcomes. That is,
suppose that our random variable is Y = X1 + X2 + · · · + Xn , where Xi ∼ Bernoulli(p) and the Xi are
independent. Then Y is said to have a Binomial distribution with sample size n and probability of success
p, denoted Y ∼ Binomial(n, p).
Examples:
• We’re flipping a fair coin 4 times and we want to count the total number of tails. The coin flips
(X1 , X2 , X3 , and X4 ) are Bernoulli(1/2) random variables and they are independent by assumption, so
the total number of tails is Y = X1 + X2 + X3 + X4 ∼ Binomial(4, 1/2).
• You’re taking a multiple choice test with 10 questions and 3 answers per question. For each question,
there’s only one correct answer. You haven’t studied for the test and you decide to choose the answers
“at random”, so you have a 1/3 chance of getting each question right. Let Xi = 1 if your answer
to i -th question is right, so Xi ∼ Bernoulli(1/3). The total number of right answers in your test is
Y = X1 + X2 + · · · + X10 ∼ Binomial(10, 1/3).
• Suppose that we flip a fair coin. The random variable X1 equals 1 if it comes up heads and X1 = 0 if
it comes up tails (so X1 ∼ Bernoulli(1/2)). If X1 = 1, we will use a loaded coin with a probability of
coming up heads equal to 2/3 for our next flip (X2 ). If X1 = 0, we will use a fair coin for X2 . The
random variable X2 is also Bernoulli, since it can only take on the values 0 or 1. The probability of
success is
P(X2 = 1) = P(X1 = 0)P(X2 = 1 | X1 = 0) + P(X1 = 1)P(X2 = 1 | X1 = 1)
          = (1/2)(1/2) + (1/2)(2/3) = 7/12 ≈ 0.583.
So X1 ∼ Bernoulli(1/2) and X2 ∼ Bernoulli(7/12). Is Y = X1 + X2 a Binomial? The answer is no
because (1) the probabilities of success for X1 and X2 are different, and (2) X1 and X2 are not independent!
The coin we flip in X2 depends on the outcome of X1 , so X1 and X2 are clearly dependent.
Let Y ∼ Binomial(n, p). What is P(Y = k) for k ∈ {0, 1, 2, ..., n}? Let’s start with a simple example
where n = 4. Note that we can identify the outcomes of the underlying Bernoulli experiments X1 , X2 , X3 , X4
with the strings 0000, 0010, 0100, 1110, etc. (for example, 0010 means that all the Xi except X3 are zero).
Then,
• P(Y = 0) = P(0000) = (1 − p)^4.
• P(Y = 1) = P(1000) + P(0100) + P(0010) + P(0001) = 4p(1 − p)^3.
• P(Y = 2) = P(0011) + P(1100) + P(1010) + P(0101) + P(1001) + P(0110) = 6p^2(1 − p)^2.
• P(Y = 3) = P(0111) + P(1011) + P(1101) + P(1110) = 4p^3(1 − p).
• P(Y = 4) = P(1111) = p^4.
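The coefficients 1, 4, 6, 4, 1 can be verified by listing every string, e.g. with this short Python check (ours, not part of the lecture):

```python
from itertools import product

# For each k, count the binary strings of length 4 with exactly k ones;
# these counts are the constants multiplying p^k (1 - p)^(4 - k) above.
counts = {k: 0 for k in range(5)}
for bits in product([0, 1], repeat=4):
    counts[sum(bits)] += 1
print(counts)  # {0: 1, 1: 4, 2: 6, 3: 4, 4: 1}
```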
We can definitely see a pattern here. In general, if Y ∼ Binomial(n, p),
P(Y = k) = constant × p^k (1 − p)^(n−k).
Again, we can identify the event Y = k with strings of k ones and n − k zeros. They’re all mutually exclusive
(disjoint) events so the probability that Y = k happens is the sum of the probabilities that each of the
favorable strings happens. Given our independence assumption, all the favorable strings are equally likely
with probability p^k (1 − p)^(n−k). The constant that we need is the number of favorable strings, which is the
number of strings that contain k ones and n − k zeros. If we pick the positions of the ones, we’re done (the
rest must be zero). For example, in the case where n = 4 and k = 2 (see above), the favorable strings can
be identified with the sets {3, 4}, {1, 2}, {1, 3}, {2, 4}, {1, 4}, {2, 3}. Therefore, the constant is the number
of subsets of k elements out of a set with n elements in total, and order doesn’t matter because we’re just
picking where the ones are.
Thus, we have that if Y ∼ Binomial(n, p),
P(Y = k) = (n choose k) p^k (1 − p)^(n−k).
From now on you can just use this formula. I don’t expect you to rederive it, but I would like you to understand where the expression comes from.
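The formula is a one-liner in Python with `math.comb` (the function name `binomial_pmf` is our own):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Sanity check: the probabilities over k = 0, ..., n must sum to 1
print(sum(binomial_pmf(k, 4, 0.5) for k in range(5)))  # 1.0
```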
Examples:
• Suppose we’re flipping a fair coin 4 times and we want to count the total number of tails, which we
denote Y . What is the probability that we get 2 tails? We have that Y ∼ Binomial(4, 1/2), so
P(Y = 2) = (4 choose 2) (1/2)^2 (1/2)^2 = 0.375.
• You’re taking a multiple choice test with 10 questions and 3 different answers per question. For each
question, there’s only one correct answer. You haven’t studied for the test and you decide to choose
the answers “at random”, so you have a 1/3 chance of getting each question right. What is the
probability that you get at least half of them right? Let Y be the total number of right answers in
your test. Then, Y ∼ Binomial(10, 1/3). We’re interested in finding P (Y ≥ 5), which is equal to
P(Y ≥ 5) = P(Y = 5) + P(Y = 6) + · · · + P(Y = 10)
         = (10 choose 5)(1/3)^5 (2/3)^5 + (10 choose 6)(1/3)^6 (2/3)^4 + · · · + (1/3)^10 ≈ 0.213
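The tail probability in the test example can be checked numerically (again, `binomial_pmf` is just our helper name):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# P(Y >= 5) for Y ~ Binomial(10, 1/3): sum the PMF over k = 5, ..., 10
p_at_least_5 = sum(binomial_pmf(k, 10, 1/3) for k in range(5, 11))
print(round(p_at_least_5, 3))  # 0.213
```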
Exercise 2. All students enrolled in STA111 (16 students) have to take a medical test which has probability
0.1 of giving a false positive. Suppose that you’re all healthy. What is the probability that there is at least
one false positive?
Exercise 3. Suppose you roll a fair die 6 times. What is the probability that you get a number strictly
greater than 4 at least twice? Our best friend Bobby is willing to bet $10 that it won’t happen. Would you
bet against him?
Exercise 4. Give 3 examples of Binomial random variables, and compute one probability for each of them.
1.3 Hypergeometric
Suppose we have a population of N elements where M elements have a certain characteristic and N − M
don’t. Suppose that we select n elements of the population without replacement. If X is the number of
elements in the sample that have the characteristic, then:

P(X = k) = (M choose k)(N − M choose n − k) / (N choose n),
and X is said to have a Hypergeometric distribution, denoted X ∼ Hypergeometric(N, M, n). We’ve seen
this type of random variable before in examples and homeworks!
Example:
• Remember Exercise 2 in HW2?
A bag has 3 green jelly beans and 7 red jelly beans. If you extract 2 jelly beans, what is the
probability that the 2 of them are red? Now suppose that you draw 5 jelly beans out of the
bag. What is the probability that 3 are red and 2 are green?
This is an example of a Hypergeometric random variable. The characteristic is “being red”. The population is the jelly beans in the bag, so N = 10. There are 7 red jelly beans, so M = 7 and N −M = 3. For
part 1 we sample 2 elements of the population, so n = 2 and X1 ∼ Hypergeometric(10, 7, 2) and we
want to compute P (X1 = 2). For part 2 of the question we have n = 5, X2 ∼ Hypergeometric(10, 7, 5)
and we’re interested in P(X2 = 3).
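The jelly bean probabilities can be computed directly from the formula. A sketch (`hypergeom_pmf` is our own helper; `Fraction` keeps the answers exact):

```python
from math import comb
from fractions import Fraction

def hypergeom_pmf(k, N, M, n):
    """P(X = k) for X ~ Hypergeometric(N, M, n)."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))

# N = 10 jelly beans in the bag, M = 7 red, drawn without replacement
print(hypergeom_pmf(2, 10, 7, 2))  # P(X1 = 2) = 7/15
print(hypergeom_pmf(3, 10, 7, 5))  # P(X2 = 3) = 5/12
```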
Exercise 5. Suppose that you have 20 really good friends, 10 of which like Broccoli. You want to host a
dinner party, but your apartment is too small and can only fit 5 friends. You’re a nice person, so you decide
that the right thing to do is selecting 5 of them at random. What is the probability that all of your randomly
selected guests like Broccoli? What is the probability that at least one of them doesn’t?
Now you might be a little bit confused... What is the difference between Binomial and Hypergeometric? The
key is “without replacement”. If you’re in a Hypergeometric scenario, your draws are not independent. In
the jelly beans example, the probability that the second jelly bean is red depends on the outcome of the first
draw, so the draws are not independent (and recall that independence is an assumption of the Binomial!).
Exercise 6. Give two examples of random variables with a Hypergeometric distribution.
If the size of the population N is big relative to the sample size n, the Binomial and the Hypergeometric
give similar answers. Can you see why?
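One way to see it numerically: when N is large, removing a handful of sampled elements barely changes the proportion of successes left in the population, so sampling without replacement behaves almost like sampling with replacement. A sketch comparing the two PMFs (the particular values N = 10,000, M = 7,000, n = 5 are our choice):

```python
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, N, M, n):
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

# Large population: N = 10,000 with M = 7,000 "successes", sample n = 5.
# The two columns of probabilities agree to about three decimal places.
for k in range(6):
    print(k, round(hypergeom_pmf(k, 10_000, 7_000, 5), 4),
          round(binomial_pmf(k, 5, 0.7), 4))
```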