STATISTICS SPECIAL DISCRETE DISTRIBUTIONS UNIT PLAN
In the last unit we looked at the general form of a discrete probability distribution. What it is, how to
find its measures, etc. In this section we will look at four special discrete distributions that show up in a
number of applications. We are already quite familiar with one of these, the “C over C” or
“Hypergeometric” distribution. The other three will be new to us.
The “good news” about these distributions is that they all have special formulas for their mean and
variance. That is, we will no longer need the x ∙ P(x) and d² ∙ P(x) columns! This will save us a good
deal of work.
The main challenge in this unit will be to look at a situation and determine which of the four special
distributions applies to it.
The four distributions we will be studying are:
1.) Binomial
2.) Hypergeometric
3.) Geometric
4.) Poisson
The first two are finite, and the last two are infinite. Infinite distributions sound scary, but we can tame
them with the complement rule.
We are also going to replace our old notation 𝑃(𝐸) for the probability of an event and 𝑃(𝐸 ′ ) for the
probability of its complement. Instead we will just use 𝑝 and 𝑞 respectively for these purposes. The
formulas will get quite clunky otherwise.
I. Binomial Distributions
A distribution is called binomial if and only if:
1.) Given a sample space Ω and a random variable 𝑥 constructed over that space, an outcome is in
instance 𝑖 if and only if it is an outcome where a result specified in the definition of 𝑥 occurred
exactly 𝑖 times.
2.) The sample space in question was constructed by drawing with replacement from a single selection set (and thus all single-draw events in the sample space are independent of each other).
The probabilities associated with such a random variable have the following generating function:
P(x) = p^x ∙ q^(r−x) ∙ rCx

Where p is the probability of any given draw resulting in an outcome in the Event in question, and q is the complement of p (q = 1 − p).
This generating function is called the Binomial Theorem.
The special formulas for mean and variance are: μx = rp and σ² = rpq
[Recall to get standard deviation you must take the square root of variance.]
Warning: A lot of textbooks use n where I am using r, but I want to reserve n for the number of items in
the selection set in this context.
Where does the “C” come from? Something that confuses a lot of people about the Binomial Theorem
is that, even though it is constructed on a sample space with replacement, it somehow involves
combinations, which have to do with drawing without replacement. We will examine this confusing fact
in the context of our first example.
Example 1: Let’s take the vacation in Paris example from the previous section. I said since we didn’t
know the Binomial Theorem, we were limited to analyzing a two-day vacation. But now, let’s re-write
the problem so our vacation lasts five days!
40% of all days in Paris have some sunshine. Each day’s weather is independent of the weather on any
other day. Suppose you are now taking a five-day vacation to Paris. Let x = number of days during your
vacation in which there is some sunshine. Do the following
a.) Specify the probabilities of all instances of x.
b.) Find the expected number of days with sunshine during your vacation, and the associated standard
deviation.
First of all, is x binomial? Remember, it has to meet our two criteria.
Does it meet criterion #1? Yes, because a result is specified (a day with sunshine), and in the sample space of possible vacations, a vacation is in instance x only if it had sunshine on exactly that number of days.
Does it meet criterion #2? Yes, the weather on any given day is independent of the weather on any
other given day. This is equivalent to constructing a sample space with replacement.
Now we can answer part a.) of the example by making a table using the generating function:

x      P(x) = 0.4^x ∙ 0.6^(5−x) ∙ 5Cx
0      P(0) = 0.4^0 ∙ 0.6^5 ∙ 5C0 = 0.0778
1      P(1) = 0.4^1 ∙ 0.6^4 ∙ 5C1 = 0.2592
2      P(2) = 0.4^2 ∙ 0.6^3 ∙ 5C2 = 0.3456
3      P(3) = 0.4^3 ∙ 0.6^2 ∙ 5C3 = 0.2304
4      P(4) = 0.4^4 ∙ 0.6^1 ∙ 5C4 = 0.0768
5      P(5) = 0.4^5 ∙ 0.6^0 ∙ 5C5 = 0.0102
∑      1

To answer part b.), we're used to extending this table, adding two more columns, one to find mean (expected value), and one to find variance. However, since x is binomial, there are known formulas which do the work for us! No need for more columns!!

μx = rp = 5 ∙ 0.4 = 2
σ² = rpq = 5 ∙ 0.4 ∙ 0.6 = 1.2

And standard deviation then is √1.2 = 1.0954.

So now that we're done with that, on to the question everyone's asking: "Where does that C come from??"
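The table and the shortcut formulas are easy to check with a short script. This is just an illustrative sketch (Python, and the names `binom_pmf`, `mean_cols`, etc. are my own, not from the text): it rebuilds the table from the generating function, then confirms that the rp and rpq shortcuts agree with the old "extra columns" method.

```python
import math

def binom_pmf(x, r, p):
    # P(x) = p^x * q^(r-x) * rCx
    q = 1 - p
    return p**x * q**(r - x) * math.comb(r, x)

r, p = 5, 0.4
table = {x: binom_pmf(x, r, p) for x in range(r + 1)}
total = sum(table.values())            # the probabilities sum to 1

# Shortcut formulas
mean_formula = r * p                   # mu = rp
var_formula = r * p * (1 - p)          # sigma^2 = rpq

# The "long way": the x*P(x) and d^2*P(x) columns from the last unit
mean_cols = sum(x * P for x, P in table.items())
var_cols = sum((x - mean_cols)**2 * P for x, P in table.items())
```

Both routes give mean 2 and variance 1.2, which is exactly why the extra columns are no longer needed.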
Let’s consider instance 𝑥 = 1 : What outcomes are in this instance? All of the outcomes where one day
has sunshine and the rest do not. Let’s call these results 𝑠 and 𝑛. How many outcomes are in this
instance? Let’s list them:
{s1, n2, n3, n4, n5}
{n1, s2, n3, n4, n5}
{n1, n2, s3, n4, n5}
{n1, n2, n3, s4, n5}
{n1, n2, n3, n4, s5}
As you can see, there are five outcomes in instance x = 1. The probability of each one occurring is (0.4)¹(0.6)⁴. And, not coincidentally, 5C1 = 5. Therefore, the total probability of instance x = 1 is 0.4^1 ∙ 0.6^4 ∙ 5C1 = 0.2592.
You can repeat this exercise for the remaining instances if you want. Each time, the number of outcomes
in each instance 𝑥 will equal 5𝐶𝑥 .
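If you'd rather not repeat the listing by hand, a brute-force count makes the same point. A quick sketch (Python; the variable names are my own):

```python
from itertools import product
from math import comb

# All 2^5 = 32 possible five-day vacations: 's' = sunshine, 'n' = none
outcomes = list(product("sn", repeat=5))

# Count how many outcomes have exactly x sunny days; each count equals 5Cx
counts = {x: sum(1 for o in outcomes if o.count("s") == x) for x in range(6)}
```

The counts come out 1, 5, 10, 10, 5, 1 for x = 0 through 5, matching 5Cx each time.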
Why does this happen? Because the way x was defined, I just wanted x days with sunshine. I didn’t
specify that I wanted the first x days to have sunshine and the remainder not to have sunshine. So, in
other words, even though the sample space was constructed with replacement, in my definition of x, I
don’t care about order. (That is, the instances themselves were constructed without replacement – if I
want exactly two days with sunshine, once those days have been given to specific draws, the remaining
three draws must be days without sunshine.)
Another point of confusion here is that since nPr ≥ nCr, students think that if I specified that order does matter in my definition of x, then the probabilities of each instance would increase. This isn't true – rather, if I specified that order did matter, there would no longer be just 6 instances of x, but rather 32 [see appendix 1 for the list]. To see this, realize that each of the five outcomes that were in instance x = 1 above would be in separate instances.
You also might see the logical contradiction in such a statement – you can’t increase the probability of
all instances without the sum exceeding 1, which isn’t allowed.
Appendix 1
The 32 instances of x if order did matter, listed here by the number of days with sunshine:

{n1, n2, n3, n4, n5}

{s1, n2, n3, n4, n5}
{n1, s2, n3, n4, n5}
{n1, n2, s3, n4, n5}
{n1, n2, n3, s4, n5}
{n1, n2, n3, n4, s5}

{s1, s2, n3, n4, n5}
{s1, n2, s3, n4, n5}
{s1, n2, n3, s4, n5}
{s1, n2, n3, n4, s5}
{n1, s2, s3, n4, n5}
{n1, s2, n3, s4, n5}
{n1, s2, n3, n4, s5}
{n1, n2, s3, s4, n5}
{n1, n2, s3, n4, s5}
{n1, n2, n3, s4, s5}

{s1, s2, s3, n4, n5}
{s1, s2, n3, s4, n5}
{s1, s2, n3, n4, s5}
{s1, n2, s3, s4, n5}
{s1, n2, s3, n4, s5}
{s1, n2, n3, s4, s5}
{n1, s2, s3, s4, n5}
{n1, s2, s3, n4, s5}
{n1, s2, n3, s4, s5}
{n1, n2, s3, s4, s5}

{s1, s2, s3, s4, n5}
{s1, s2, s3, n4, s5}
{s1, s2, n3, s4, s5}
{s1, n2, s3, s4, s5}
{n1, s2, s3, s4, s5}

{s1, s2, s3, s4, s5}
A further question arises – are the probabilities of these 32 instances found by a generating function
that includes multiplying by a Permutation factor? The answer is no. The reason is, each of these 32
instances represents a particular Permutation, so there’s no need to multiply by the number of
Permutations in each instance, since each instance contains only one Permutation.
(You might also be able to see that there's no sensible value to use for the second input in the Permutation expression, because x is no longer a random variable as such, but rather a random set of variables!)
Since one of the requirements for the Binomial Theorem is that x be a random variable, we can now
conclude confidently that any formulation of x where order does matter disqualifies the distribution
from being Binomial!
By the way, the generating function for such a distribution would be P({x1, x2, x3, x4, x5}) = p^|s| ∙ q^|n|, where |s| and |n| are the number of times s and n occur in the input set, respectively.
Obviously we are now very far afield from the Binomial Theorem, so hopefully you see why the Theorem
has the form that it does and not one of the other forms that some students think that it should have!
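You can verify numerically that these 32 ordered "instances" still account for all the probability, with no C or P factor anywhere. A sketch (Python; the names are my own):

```python
from itertools import product

p, q = 0.4, 0.6
# Each ordered outcome is its own instance, with probability p^|s| * q^|n|
probs = {o: p**o.count("s") * q**o.count("n") for o in product("sn", repeat=5)}
total = sum(probs.values())  # the 32 probabilities still sum to 1
```

This is just the expansion of (p + q)^5 = 1^5, term by term, which is the algebraic reason nothing needs to be multiplied in.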
HW: p. 194-195 #9 – 17; #19 – 22 part a’s only
II. Hypergeometric Distributions – We are already familiar with these; they are the “C over C” problems we have been working with for some time.
A distribution is considered Hypergeometric if and only if:
1.) Given a sample space Ω and a random variable 𝑥 constructed over that space, an outcome is in
instance 𝑖 if and only if it is an outcome where a result specified in the definition of 𝑥 occurred
exactly 𝑖 times.
2.) The sample space in question was constructed by drawing without replacement from a single
selection set.
The probabilities associated with such a random variable have the following generating function:
P(x) = [nkCx ∙ (n−nk)C(r−x)] / nCr

Where nk is the number of items in a given category in the selection set.

And the formulas for mean and variance are:

μ = r ∙ nk / n    and    σ² = r ∙ nk ∙ (n−nk)(n−r) / (n³ − n²)
Ex: Let’s revisit the “Lobster and Crabs” problem from the last section. In that
problem, we had 13 crustaceans in a trap. 8 were lobsters and the rest (5) were
giant crabs. We defined a random variable 𝑥 to be the number of lobsters
selected when three crustaceans were pulled from this trap at random without
replacement.
That is, 𝑛 = 13 𝑛𝑘 = 8 and 𝑟 = 3 . Let’s reconstruct the distribution using
the formal generating function and see if we don’t get the same result:
x      P(x) = 8Cx ∙ 5C(3−x) / 13C3
0      0.0350
1      0.2797
2      0.4895
3      0.1958
∑      1 (check)
Now before, when we did this problem, we laboriously constructed mean and variance using separate columns. Now let's see if the special formulas don't serve us better:

μ = 3 ∙ 8 / 13 = 1.8462

and σ² = 3 ∙ 8 ∙ (13 − 8)(13 − 3) / (13³ − 13²) = 0.5917

and then to get standard deviation we take the square root of that: √0.5917 = 0.7692.

Check back to Worksheet #4 from the previous unit. The numbers match!
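The hypergeometric formulas can be checked the same way as the binomial ones. A sketch (Python; `hyper_pmf` and the variable names are mine, not from the text), using the Lobsters-and-Crabs numbers n = 13, nk = 8, r = 3:

```python
from math import comb, sqrt

def hyper_pmf(x, n, nk, r):
    # P(x) = nkCx * (n-nk)C(r-x) / nCr
    return comb(nk, x) * comb(n - nk, r - x) / comb(n, r)

n, nk, r = 13, 8, 3          # 13 crustaceans, 8 lobsters, 3 drawn without replacement
dist = {x: hyper_pmf(x, n, nk, r) for x in range(r + 1)}

mu = r * nk / n                                    # mean
var = r * nk * (n - nk) * (n - r) / (n**3 - n**2)  # variance
sd = sqrt(var)                                     # standard deviation
```

The table values (0.0350, 0.2797, 0.4895, 0.1958), the mean 1.8462, and the standard deviation 0.7692 all come back out.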
HW: p. 204 #24 – plus find the mean and standard deviation of 𝑥 , where 𝑥 is the number of
defective microchips.
Interlude: Finite Sums of Infinite Sequences
So far we have been dealing with finite sample spaces – they may have been very large, in the millions
or even billions of outcomes, but they were finite. In this section we will examine non-finite sample
spaces and random variables constructed on them.
Even though these spaces are non-finite, their total Probability still = 1. How is this possible? Consider
the following sequence.
S = 1/2 + 1/4 + 1/8 + 1/16 + 1/32 + ⋯

S is a sum with an unbounded number of terms, with the nth term given by 1/2^n. We could therefore write

S = ∑ (from n = 1 to ∞) 1/2^n

I will now claim something quite surprising: S = 1. That's right: the sum of this unbounded sequence of terms is a finite number. Indeed, it is 1, so we can consider each n to be the name of an outcome in a sample space, with the corresponding term in S being its probability!
How do I prove such a surprising claim? It is actually quite simple. Consider (1/2)S. By the good old Distributive Rule from Algebra,

(1/2)S = 1/4 + 1/8 + 1/16 + 1/32 + 1/64 + ⋯

Now let's consider what happens when we subtract (1/2)S from S…

     S    =  1/2 + 1/4 + 1/8 + 1/16 + 1/32 + 1/64 + ⋯
  − (1/2)S =      − 1/4 − 1/8 − 1/16 − 1/32 − 1/64 − ⋯

As you can see, all but the first term cancel. Since we also know that S − (1/2)S = (1/2)S, we can conclude from this that

(1/2)S = 1/2

And therefore:

S = 1
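You can also watch this convergence happen numerically. A small sketch (Python):

```python
# Partial sums of 1/2 + 1/4 + 1/8 + ... climb toward 1 but never pass it
partial = 0.0
for n in range(1, 51):
    partial += 1 / 2**n   # add the nth term, 1/2^n

# After 50 terms the remaining gap to 1 is exactly 1/2^50 -- vanishingly small
```

Each new term closes exactly half of the remaining gap to 1, which is another way of seeing why the total can never exceed 1.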
So now that you believe me that such sample spaces can exist, let’s look at two of the more common
distributions that can be constructed over such sets: The Geometric and Poisson distributions.
III. Geometric Distributions
A random variable x is said to be Geometric if and only if:
1.) It is constructed over an unbounded sample space (this means an unbounded number of draws!),
where all single-draw events are independent of each other.
2.) Each instance of x represents the set of outcomes in which the result specified in x occurred for the
first time on the xth draw.
An important implication of 2.) is that the Geometric Distribution has no zero row. This is because a
result can’t occur for the first time on the “0th” trial. That makes no sense.
The generating function for the Geometric is:
P(x) = p ∙ q^(x−1)
Where p and q are defined as they were for the binomial distribution.
Also like the binomial, Geometric distributions have known formulae for Expected Value and Variance:
E(x) = 1/p    and    σx² = q/p²
So there’s no need for the “big tables” that started Chapter 4.
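Here is a sketch of how the geometric pieces fit together, including the complement-rule trick for taming the infinite tail (Python; p = 0.4 and the names are my own illustrative choices, not from the text):

```python
def geom_pmf(x, p):
    # P(x) = p * q^(x-1): the result occurs for the first time on draw x (x = 1, 2, 3, ...)
    return p * (1 - p)**(x - 1)

p = 0.4
mean = 1 / p              # E(x) = 1/p
var = (1 - p) / p**2      # sigma^2 = q/p^2

# Complement rule: P(x > k) = q^k, so P(x <= k) = 1 - q^k -- no infinite sum needed
k = 3
head = sum(geom_pmf(x, p) for x in range(1, k + 1))   # P(x <= 3), term by term
tail = (1 - p)**k                                     # P(x > 3), via the complement rule
```

The term-by-term head plus the complement-rule tail account for all the probability, which is exactly how the complement rule lets us avoid summing infinitely many rows.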
Examples: p. 198 Example #1. p. 199 Try it Yourself #1
HW: p. 202 #1-4; p. 203 #15-18; p. 204 #25,26
IV. Poisson Distributions
A random variable x is said to be Poisson if and only if:
1.) Each instance of x represents the number of occurrences that take place within a given unit of
measure. (Usually a unit of time and/or space), and
2.) These occurrences appear randomly, but with a known mean number of occurrences 𝜇 for the given
unit.
Obviously Poisson Distributions are very different from the kinds we have studied before. Also, unlike
the Binomial and Geometric Distributions, whose generating functions were pretty obvious once you
saw them derived, the generating function for the Poisson is impossible to derive without knowledge of
some advanced Calculus. Therefore, this next formula you will have to take “on faith” – or study Calc for a few years, whichever you prefer. ☺
The generating function is:
P(x) = μ^x ∙ e^(−μ) / x!
Where “e” is the mathematical constant. You can find a button for “e” on most calculators. On the
standard blue calculators it is accessed by pressing [2nd] and then the [ln] on the upper left-hand side.
Notice that this button automatically opens a power for you, which is nice.
Like Binomial and Geometric, Poisson Distributions also have known formulas for Expected Value and
Variance:
E(x) = μ    and    σ² = μ
This odd result comes from the strange properties of “e,” which could fill a whole lesson in and of
themselves!
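Even without the Calculus derivation, you can check the Poisson formulas numerically. A sketch (Python; μ = 2 is just an illustrative value of my choosing):

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    # P(x) = mu^x * e^(-mu) / x!
    return mu**x * exp(-mu) / factorial(x)

mu = 2.0
probs = [poisson_pmf(x, mu) for x in range(60)]          # x = 0, 1, 2, ...
total = sum(probs)                                       # tends to 1
mean = sum(x * P for x, P in enumerate(probs))           # tends to mu, as claimed
var = sum((x - mu)**2 * P for x, P in enumerate(probs))  # also tends to mu
```

Summing the first 60 terms is enough here because the remaining tail is astronomically small, and the computed mean and variance both land on μ.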
Examples: p. 199-200 Example #2, p. 200 Try It Yourself #2, 3
HW: p.202 #5-8, p. 203 #19-22, p. 204 #27-28
Distinguishing Between Distributions
If you ever hope to use Statistics in a career, you have to learn how to distinguish between different
kinds of distributions. In the following questions, try to determine if the distribution in question is
Binomial, Geometric or Poisson.
Exercises: p. 202 #9-14
Follow with SPD Worksheets, SPD Test